Netdev List


* [PATCH 01/16] netfilter: xt_CT: fix refcnt leak on error path
From: Pablo Neira Ayuso @ 2017-05-03  9:31 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493803931-2837-1-git-send-email-pablo@netfilter.org>

From: Gao Feng <fgao@ikuai8.com>

There are two cases which cause a refcnt leak.

1. When nf_ct_timeout_ext_add fails in xt_ct_set_timeout, the timeout
refcnt should be released.
Now go to the err_put_timeout error handler instead of carrying on.

2. When the timeout policy is not found, we should call module_put.
Otherwise, the related cthelper module cannot be removed anymore.
It is easy to reproduce by typing the following command:
  # iptables -t raw -A OUTPUT -p tcp -j CT --helper ftp --timeout xxx
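
The first case can be illustrated outside the kernel with a minimal sketch
of the goto-based unwind pattern. All names here (demo_ref, demo_attach,
the ext_add_fails flag) are hypothetical stand-ins, not the xt_CT code:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a refcounted object such as a timeout policy. */
struct demo_ref {
	int refcnt;
};

static void demo_get(struct demo_ref *r)
{
	r->refcnt++;
}

static void demo_put(struct demo_ref *r)
{
	assert(r->refcnt > 0);
	r->refcnt--;
}

/* Takes a reference on @policy and "attaches" it. On failure the
 * reference is dropped before returning, so the caller sees the same
 * refcnt it started with.
 */
static int demo_attach(struct demo_ref *policy, int ext_add_fails)
{
	int ret = 0;

	demo_get(policy);

	if (ext_add_fails) {
		ret = -ENOMEM;
		goto err_put;	/* without this jump the reference leaks */
	}

	return ret;		/* success: the reference stays held */

err_put:
	demo_put(policy);
	return ret;
}
```

Calling demo_attach() with ext_add_fails set leaves refcnt unchanged;
dropping the goto would leave it one too high, which is the shape of the
leak described above.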

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/xt_CT.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/xt_CT.c b/net/netfilter/xt_CT.c
index b008db0184b8..81fdcdca7457 100644
--- a/net/netfilter/xt_CT.c
+++ b/net/netfilter/xt_CT.c
@@ -167,8 +167,10 @@ xt_ct_set_timeout(struct nf_conn *ct, const struct xt_tgchk_param *par,
 		goto err_put_timeout;
 	}
 	timeout_ext = nf_ct_timeout_ext_add(ct, timeout, GFP_ATOMIC);
-	if (timeout_ext == NULL)
+	if (!timeout_ext) {
 		ret = -ENOMEM;
+		goto err_put_timeout;
+	}
 
 	rcu_read_unlock();
 	return ret;
@@ -200,6 +202,7 @@ static int xt_ct_tg_check(const struct xt_tgchk_param *par,
 			  struct xt_ct_target_info_v1 *info)
 {
 	struct nf_conntrack_zone zone;
+	struct nf_conn_help *help;
 	struct nf_conn *ct;
 	int ret = -EOPNOTSUPP;
 
@@ -248,7 +251,7 @@ static int xt_ct_tg_check(const struct xt_tgchk_param *par,
 	if (info->timeout[0]) {
 		ret = xt_ct_set_timeout(ct, par, info->timeout);
 		if (ret < 0)
-			goto err3;
+			goto err4;
 	}
 	__set_bit(IPS_CONFIRMED_BIT, &ct->status);
 	nf_conntrack_get(&ct->ct_general);
@@ -256,6 +259,10 @@ static int xt_ct_tg_check(const struct xt_tgchk_param *par,
 	info->ct = ct;
 	return 0;
 
+err4:
+	help = nfct_help(ct);
+	if (help)
+		module_put(help->helper->me);
 err3:
 	nf_ct_tmpl_free(ct);
 err2:
-- 
2.1.4


* [PATCH 16/16] netfilter: nf_tables: check if same extensions are set when adding elements
From: Pablo Neira Ayuso @ 2017-05-03  9:32 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493803931-2837-1-git-send-email-pablo@netfilter.org>

If no NLM_F_EXCL is set and the element already exists in the set, make
sure that both elements have the same extensions.
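
As a rough sketch of the check, assuming a simplified element that only
records which extensions it carries (demo_ext and demo_check_same_ext are
hypothetical names, not nf_tables API), the XOR flags a mismatch whenever
exactly one of the two elements has a given extension:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of which extensions an element carries. */
struct demo_ext {
	bool has_data;
	bool has_objref;
};

/* Returns -EBUSY if the new and the existing element disagree on the
 * presence of either extension, 0 otherwise.
 */
static int demo_check_same_ext(const struct demo_ext *new_elem,
			       const struct demo_ext *old_elem)
{
	assert(new_elem && old_elem);

	/* a ^ b is true only when exactly one side has the extension */
	if ((new_elem->has_data ^ old_elem->has_data) ||
	    (new_elem->has_objref ^ old_elem->has_objref))
		return -EBUSY;

	return 0;
}
```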

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_tables_api.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 434c739dfeca..11a96e8dd3cd 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3749,6 +3749,11 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 	err = set->ops->insert(ctx->net, set, &elem, &ext2);
 	if (err) {
 		if (err == -EEXIST) {
+			if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA) ^
+			    nft_set_ext_exists(ext2, NFT_SET_EXT_DATA) ||
+			    nft_set_ext_exists(ext, NFT_SET_EXT_OBJREF) ^
+			    nft_set_ext_exists(ext2, NFT_SET_EXT_OBJREF))
+				return -EBUSY;
 			if ((nft_set_ext_exists(ext, NFT_SET_EXT_DATA) &&
 			     nft_set_ext_exists(ext2, NFT_SET_EXT_DATA) &&
 			     memcmp(nft_set_ext_data(ext),
-- 
2.1.4



* [PATCH 15/16] netfilter: update MAINTAINERS file
From: Pablo Neira Ayuso @ 2017-05-03  9:32 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493803931-2837-1-git-send-email-pablo@netfilter.org>

Several updates on the MAINTAINERS section for Netfilter:

1) Add Florian Westphal, he's been part of the coreteam since October 2012.
   He's been dedicating tireless efforts to improve the Netfilter codebase,
   fix bugs and push ongoing new developments ever since.

2) Add http://www.nftables.org/ URL, currently pointing to
   http://www.netfilter.org.

3) Update project status from Supported to Maintained.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Florian Westphal <fw@strlen.de>
---
 MAINTAINERS | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 38d3e4ed7208..fc95cb06fb29 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8701,14 +8701,16 @@ F:	drivers/net/ethernet/neterion/
 NETFILTER
 M:	Pablo Neira Ayuso <pablo@netfilter.org>
 M:	Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
+M:	Florian Westphal <fw@strlen.de>
 L:	netfilter-devel@vger.kernel.org
 L:	coreteam@netfilter.org
 W:	http://www.netfilter.org/
 W:	http://www.iptables.org/
+W:	http://www.nftables.org/
 Q:	http://patchwork.ozlabs.org/project/netfilter-devel/list/
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git
-S:	Supported
+S:	Maintained
 F:	include/linux/netfilter*
 F:	include/linux/netfilter/
 F:	include/net/netfilter/
-- 
2.1.4



* [PATCH 14/16] netfilter: x_tables: unlock on error in xt_find_table_lock()
From: Pablo Neira Ayuso @ 2017-05-03  9:32 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493803931-2837-1-git-send-email-pablo@netfilter.org>

From: Dan Carpenter <dan.carpenter@oracle.com>

According to my static checker we should unlock here before the return.
That seems reasonable to me as well.
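
The pattern can be sketched in a simplified form, assuming a toy lock
that only records whether it is held (demo_lock, demo_table and
demo_find_table are hypothetical; unlike the real function, this sketch
always returns with the lock released):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy lock that only tracks whether it is currently held. */
struct demo_lock {
	bool held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Hypothetical table entry; module_alive mimics try_module_get() succeeding. */
struct demo_table {
	const char *name;
	bool module_alive;
};

static struct demo_table *demo_find_table(struct demo_lock *lock,
					  struct demo_table *tables,
					  size_t n, const char *name)
{
	size_t i;

	demo_lock_acquire(lock);
	for (i = 0; i < n; i++) {
		if (strcmp(tables[i].name, name))
			continue;
		if (!tables[i].module_alive) {
			/* the early return must drop the lock as well */
			demo_lock_release(lock);
			return NULL;
		}
		demo_lock_release(lock);
		return &tables[i];
	}
	demo_lock_release(lock);
	return NULL;
}
```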

Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/x_tables.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index 14857afc9937..f134d384852f 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -1051,8 +1051,10 @@ struct xt_table *xt_find_table_lock(struct net *net, u_int8_t af,
 	list_for_each_entry(t, &init_net.xt.tables[af], list) {
 		if (strcmp(t->name, name))
 			continue;
-		if (!try_module_get(t->me))
+		if (!try_module_get(t->me)) {
+			mutex_unlock(&xt[af].mutex);
 			return NULL;
+		}
 
 		mutex_unlock(&xt[af].mutex);
 		if (t->table_init(net) != 0) {
-- 
2.1.4



* [PATCH 13/16] ipvs: explicitly forbid ipv6 service/dest creation if ipv6 mod is disabled
From: Pablo Neira Ayuso @ 2017-05-03  9:32 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493803931-2837-1-git-send-email-pablo@netfilter.org>

From: Paolo Abeni <pabeni@redhat.com>

When creating a new ipvs service, ipv6 addresses are always accepted
if CONFIG_IP_VS_IPV6 is enabled. On dest creation the address family
is not explicitly checked.

This allows user space to configure ipvs services even if the
system is booted with ipv6.disable=1. On specific configurations, ipvs
can try to call ipv6 routing code at setup time, causing the kernel to
oops due to fib6_rules_ops being NULL.

This change addresses the issue by adding a check for the ipv6
module being enabled while validating ipv6 service operations and
adding the same validation for dest operations.

According to git history, this issue is apparently present since
the introduction of ipv6 support, and the oops can be triggered
since commit 09571c7ae30865ad ("IPVS: Add function to determine
if IPv6 address is local")

Fixes: 09571c7ae30865ad ("IPVS: Add function to determine if IPv6 address is local")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
 net/netfilter/ipvs/ip_vs_ctl.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 5aeb0dde6ccc..4d753beaac32 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -3078,6 +3078,17 @@ static int ip_vs_genl_dump_services(struct sk_buff *skb,
 	return skb->len;
 }
 
+static bool ip_vs_is_af_valid(int af)
+{
+	if (af == AF_INET)
+		return true;
+#ifdef CONFIG_IP_VS_IPV6
+	if (af == AF_INET6 && ipv6_mod_enabled())
+		return true;
+#endif
+	return false;
+}
+
 static int ip_vs_genl_parse_service(struct netns_ipvs *ipvs,
 				    struct ip_vs_service_user_kern *usvc,
 				    struct nlattr *nla, int full_entry,
@@ -3104,11 +3115,7 @@ static int ip_vs_genl_parse_service(struct netns_ipvs *ipvs,
 	memset(usvc, 0, sizeof(*usvc));
 
 	usvc->af = nla_get_u16(nla_af);
-#ifdef CONFIG_IP_VS_IPV6
-	if (usvc->af != AF_INET && usvc->af != AF_INET6)
-#else
-	if (usvc->af != AF_INET)
-#endif
+	if (!ip_vs_is_af_valid(usvc->af))
 		return -EAFNOSUPPORT;
 
 	if (nla_fwmark) {
@@ -3610,6 +3617,11 @@ static int ip_vs_genl_set_cmd(struct sk_buff *skb, struct genl_info *info)
 		if (udest.af == 0)
 			udest.af = svc->af;
 
+		if (!ip_vs_is_af_valid(udest.af)) {
+			ret = -EAFNOSUPPORT;
+			goto out;
+		}
+
 		if (udest.af != svc->af && cmd != IPVS_CMD_DEL_DEST) {
 			/* The synchronization protocol is incompatible
 			 * with mixed family services
-- 
2.1.4



* [PATCH 12/16] netfilter: Wrong icmp6 checksum for ICMPV6_TIME_EXCEED in reverse SNATv6 path
From: Pablo Neira Ayuso @ 2017-05-03  9:32 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493803931-2837-1-git-send-email-pablo@netfilter.org>

From: Dave Johnson <dave-kernel@centerclick.org>

When recalculating the outer ICMPv6 checksum for a reverse path NATv6
packet such as ICMPV6_TIME_EXCEED, nf_nat_icmpv6_reply_translation() was
accessing data beyond the headlen of the skb for non-linear skbs.  This
resulted in an incorrect ICMPv6 checksum as garbage data was used.

This patch replaces csum_partial() with skb_checksum(), which supports
non-linear skbs, similar to nf_nat_icmp_reply_translation() from ipv4.
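
To illustrate why a checksum over a fragmented buffer has to walk every
fragment, here is a minimal userspace sketch. It is not the kernel's
skb_checksum(); demo_frag and demo_csum_frags are hypothetical, and the
sum is a plain 16-bit one's-complement sum without the final inversion:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fragment: one contiguous piece of a logical packet. */
struct demo_frag {
	const uint8_t *data;
	size_t len;
};

/* 16-bit one's-complement sum over a byte stream that is spread across
 * several fragments. Byte parity is tracked across fragment boundaries,
 * so odd-sized fragments are handled.
 */
static uint16_t demo_csum_frags(const struct demo_frag *frags, size_t nfrags)
{
	uint64_t sum = 0;
	size_t pos = 0;	/* byte offset within the logical packet */
	size_t i, j;

	assert(frags || nfrags == 0);

	for (i = 0; i < nfrags; i++) {
		for (j = 0; j < frags[i].len; j++, pos++) {
			/* even offsets hold the high byte of a 16-bit word */
			if (pos & 1)
				sum += frags[i].data[j];
			else
				sum += (uint64_t)frags[i].data[j] << 8;
		}
	}

	/* fold the carries back into the low 16 bits */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}
```

Summing only the first fragment, which is roughly what reading just the
linear head amounts to, gives a different value than summing all
fragments.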

Signed-off-by: Dave Johnson <dave-kernel@centerclick.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
index e0be97e636a4..69937b637ee5 100644
--- a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
@@ -235,7 +235,7 @@ int nf_nat_icmpv6_reply_translation(struct sk_buff *skb,
 		inside->icmp6.icmp6_cksum =
 			csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr,
 					skb->len - hdrlen, IPPROTO_ICMPV6,
-					csum_partial(&inside->icmp6,
+					skb_checksum(skb, hdrlen,
 						     skb->len - hdrlen, 0));
 	}
 
-- 
2.1.4



* [PATCH 11/16] netfilter: nft_dynset: continue to next expr if _OP_ADD succeeded
From: Pablo Neira Ayuso @ 2017-05-03  9:32 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493803931-2837-1-git-send-email-pablo@netfilter.org>

From: Liping Zhang <zlpnobody@gmail.com>

Currently, after adding the following nft rules:
  # nft add set x target1 { type ipv4_addr \; flags timeout \;}
  # nft add rule x y set add ip daddr timeout 1d @target1 counter

the counters will always be zero regardless of whether the elements are
added to the dynamic set "target1" or not, as we break the nft expr
traversal unconditionally:
  # nft list ruleset
  ...
  set target1 {
      ...
      elements = { 8.8.8.8 expires 23h59m53s}
  }
  chain output {
      ...
      set add ip daddr timeout 1d @target1 counter packets 0 bytes 0
                                                           ^       ^
      ...
  }

Since we add the elements to the set successfully, we should continue
to the next expression.

Additionally, if elements are added to "flow table" successfully, we
will _always_ continue to the next expr, even if the operation is
_OP_ADD. So it's better to keep them consistent.

Fixes: 22fe54d5fefc ("netfilter: nf_tables: add support for dynamic set updates")
Reported-by: Robert White <rwhite@pobox.com>
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_dynset.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nft_dynset.c b/net/netfilter/nft_dynset.c
index 049ad2d9ee66..fafbeea3ed04 100644
--- a/net/netfilter/nft_dynset.c
+++ b/net/netfilter/nft_dynset.c
@@ -82,8 +82,7 @@ static void nft_dynset_eval(const struct nft_expr *expr,
 		    nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION)) {
 			timeout = priv->timeout ? : set->timeout;
 			*nft_set_ext_expiration(ext) = jiffies + timeout;
-		} else if (sexpr == NULL)
-			goto out;
+		}
 
 		if (sexpr != NULL)
 			sexpr->ops->eval(sexpr, regs, pkt);
@@ -92,7 +91,7 @@ static void nft_dynset_eval(const struct nft_expr *expr,
 			regs->verdict.code = NFT_BREAK;
 		return;
 	}
-out:
+
 	if (!priv->invert)
 		regs->verdict.code = NFT_BREAK;
 }
-- 
2.1.4



* [PATCH 10/16] bridge: ebtables: fix reception of frames DNAT-ed to bridge device/port
From: Pablo Neira Ayuso @ 2017-05-03  9:32 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493803931-2837-1-git-send-email-pablo@netfilter.org>

From: Linus Lüssing <linus.luessing@c0d3.blue>

When trying to redirect bridged frames to the bridge device itself or
a bridge port (brouting) via the dnat target, this currently fails:

The ethernet destination of the frame is dnat'ed to the MAC address of
the bridge device or port just fine. However, the IP code drops it in
the beginning of ip_input.c/ip_rcv() as the dnat target left
the skb->pkt_type as PACKET_OTHERHOST.

Fix this by resetting skb->pkt_type to an appropriate type after
dnat'ing.
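
The classification can be sketched in userspace as follows. The enum and
demo_classify() are hypothetical stand-ins for the PACKET_* constants and
the kernel's ether address helpers:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for PACKET_HOST and friends. */
enum demo_pkt_type {
	DEMO_HOST,
	DEMO_BROADCAST,
	DEMO_MULTICAST,
	DEMO_OTHERHOST,
};

/* Picks a packet type from the rewritten destination MAC and the MAC of
 * the device that would receive the frame.
 */
static enum demo_pkt_type demo_classify(const uint8_t dst[6],
					const uint8_t dev_addr[6])
{
	static const uint8_t bcast[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

	assert(dst && dev_addr);

	/* the lowest bit of the first octet marks a group address */
	if (dst[0] & 0x01) {
		if (!memcmp(dst, bcast, 6))
			return DEMO_BROADCAST;
		return DEMO_MULTICAST;
	}

	if (!memcmp(dst, dev_addr, 6))
		return DEMO_HOST;

	return DEMO_OTHERHOST;
}
```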

Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/bridge/netfilter/ebt_dnat.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/net/bridge/netfilter/ebt_dnat.c b/net/bridge/netfilter/ebt_dnat.c
index 4e0b0c359325..e0bb624c3845 100644
--- a/net/bridge/netfilter/ebt_dnat.c
+++ b/net/bridge/netfilter/ebt_dnat.c
@@ -9,6 +9,7 @@
  */
 #include <linux/module.h>
 #include <net/sock.h>
+#include "../br_private.h"
 #include <linux/netfilter.h>
 #include <linux/netfilter/x_tables.h>
 #include <linux/netfilter_bridge/ebtables.h>
@@ -18,11 +19,30 @@ static unsigned int
 ebt_dnat_tg(struct sk_buff *skb, const struct xt_action_param *par)
 {
 	const struct ebt_nat_info *info = par->targinfo;
+	struct net_device *dev;
 
 	if (!skb_make_writable(skb, 0))
 		return EBT_DROP;
 
 	ether_addr_copy(eth_hdr(skb)->h_dest, info->mac);
+
+	if (is_multicast_ether_addr(info->mac)) {
+		if (is_broadcast_ether_addr(info->mac))
+			skb->pkt_type = PACKET_BROADCAST;
+		else
+			skb->pkt_type = PACKET_MULTICAST;
+	} else {
+		if (xt_hooknum(par) != NF_BR_BROUTING)
+			dev = br_port_get_rcu(xt_in(par))->br->dev;
+		else
+			dev = xt_in(par);
+
+		if (ether_addr_equal(info->mac, dev->dev_addr))
+			skb->pkt_type = PACKET_HOST;
+		else
+			skb->pkt_type = PACKET_OTHERHOST;
+	}
+
 	return info->target;
 }
 
-- 
2.1.4



* [PATCH 07/16] netfilter: ctnetlink: make it safer when updating ct->status
From: Pablo Neira Ayuso @ 2017-05-03  9:32 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493803931-2837-1-git-send-email-pablo@netfilter.org>

From: Liping Zhang <zlpnobody@gmail.com>

After converting to use rcu for conntrack hash, one CPU may update
the ct->status via ctnetlink, while another CPU may process the
packets and update the ct->status.

So the non-atomic operation "ct->status |= status;" via ctnetlink
becomes unsafe, and this may clear the IPS_DYING_BIT bit set by
another CPU unexpectedly. For example:
         CPU0                            CPU1
  ctnetlink_change_status        __nf_conntrack_find_get
      old = ct->status              nf_ct_gc_expired
          -                         nf_ct_kill
          -                      test_and_set_bit(IPS_DYING_BIT
      new = old | status;                 -
  ct->status = new; <-- oops, _DYING_ is cleared!

Now use a series of atomic bit operations to solve the above issue.

Also note, users shouldn't set IPS_TEMPLATE or IPS_SEQ_ADJUST directly,
so make these two bits unchangeable too.

If we set the IPS_TEMPLATE_BIT, ct will be freed by nf_ct_tmpl_free,
but actually it is allocated by nf_conntrack_alloc.
If we set the IPS_SEQ_ADJUST_BIT, this may cause a NULL pointer
dereference, as nfct_seqadj(ct) may be NULL.
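
As a userspace illustration of the idea, the sketch below uses C11
atomics instead of the kernel's set_bit()/clear_bit(), and two atomic
read-modify-write operations instead of a per-bit loop. The names and
the DEMO_ASSURED bit position are assumptions made for the example; only
bit 9 for DYING is taken from the header shown in this patch:

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_ASSURED		(1UL << 2)	/* illustrative position */
#define DEMO_DYING		(1UL << 9)
#define DEMO_UNCHANGEABLE	DEMO_DYING	/* bits callers may not touch */

/* Turns the @on bits on and the @off bits off without ever writing back
 * a stale copy of the whole word, so a bit set concurrently by another
 * thread (e.g. DYING) cannot be lost.
 */
static void demo_change_status(atomic_ulong *status, unsigned long on,
			       unsigned long off)
{
	assert(status);

	/* ignore requests that touch protected bits */
	on &= ~DEMO_UNCHANGEABLE;
	/* a bit requested in both masks stays on, as in the patch */
	off &= ~DEMO_UNCHANGEABLE & ~on;

	atomic_fetch_or(status, on);
	atomic_fetch_and(status, ~off);
}
```

Each call modifies only the requested bits atomically, whereas
"new = old | status; word = new;" rewrites every bit from a snapshot.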

Last, add some comments to describe the logic change due to
commit a963d710f367 ("netfilter: ctnetlink: Fix regression in CTA_STATUS
processing"), which I found a little confusing.

Fixes: 76507f69c44e ("[NETFILTER]: nf_conntrack: use RCU for conntrack hash")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/uapi/linux/netfilter/nf_conntrack_common.h | 13 ++++++---
 net/netfilter/nf_conntrack_netlink.c               | 33 ++++++++++++++++------
 2 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
index 6a8e33dd4ecb..38fc383139f0 100644
--- a/include/uapi/linux/netfilter/nf_conntrack_common.h
+++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
@@ -82,10 +82,6 @@ enum ip_conntrack_status {
 	IPS_DYING_BIT = 9,
 	IPS_DYING = (1 << IPS_DYING_BIT),
 
-	/* Bits that cannot be altered from userland. */
-	IPS_UNCHANGEABLE_MASK = (IPS_NAT_DONE_MASK | IPS_NAT_MASK |
-				 IPS_EXPECTED | IPS_CONFIRMED | IPS_DYING),
-
 	/* Connection has fixed timeout. */
 	IPS_FIXED_TIMEOUT_BIT = 10,
 	IPS_FIXED_TIMEOUT = (1 << IPS_FIXED_TIMEOUT_BIT),
@@ -101,6 +97,15 @@ enum ip_conntrack_status {
 	/* Conntrack got a helper explicitly attached via CT target. */
 	IPS_HELPER_BIT = 13,
 	IPS_HELPER = (1 << IPS_HELPER_BIT),
+
+	/* Be careful here, modifying these bits can make things messy,
+	 * so don't let users modify them directly.
+	 */
+	IPS_UNCHANGEABLE_MASK = (IPS_NAT_DONE_MASK | IPS_NAT_MASK |
+				 IPS_EXPECTED | IPS_CONFIRMED | IPS_DYING |
+				 IPS_SEQ_ADJUST | IPS_TEMPLATE),
+
+	__IPS_MAX_BIT = 14,
 };
 
 /* Connection tracking event types */
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index e5f97777b1f4..86deed6a8db4 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1419,6 +1419,24 @@ ctnetlink_parse_nat_setup(struct nf_conn *ct,
 }
 #endif
 
+static void
+__ctnetlink_change_status(struct nf_conn *ct, unsigned long on,
+			  unsigned long off)
+{
+	unsigned int bit;
+
+	/* Ignore these unchangable bits */
+	on &= ~IPS_UNCHANGEABLE_MASK;
+	off &= ~IPS_UNCHANGEABLE_MASK;
+
+	for (bit = 0; bit < __IPS_MAX_BIT; bit++) {
+		if (on & (1 << bit))
+			set_bit(bit, &ct->status);
+		else if (off & (1 << bit))
+			clear_bit(bit, &ct->status);
+	}
+}
+
 static int
 ctnetlink_change_status(struct nf_conn *ct, const struct nlattr * const cda[])
 {
@@ -1438,10 +1456,7 @@ ctnetlink_change_status(struct nf_conn *ct, const struct nlattr * const cda[])
 		/* ASSURED bit can only be set */
 		return -EBUSY;
 
-	/* Be careful here, modifying NAT bits can screw up things,
-	 * so don't let users modify them directly if they don't pass
-	 * nf_nat_range. */
-	ct->status |= status & ~(IPS_NAT_DONE_MASK | IPS_NAT_MASK);
+	__ctnetlink_change_status(ct, status, 0);
 	return 0;
 }
 
@@ -1628,7 +1643,7 @@ ctnetlink_change_seq_adj(struct nf_conn *ct,
 		if (ret < 0)
 			return ret;
 
-		ct->status |= IPS_SEQ_ADJUST;
+		set_bit(IPS_SEQ_ADJUST_BIT, &ct->status);
 	}
 
 	if (cda[CTA_SEQ_ADJ_REPLY]) {
@@ -1637,7 +1652,7 @@ ctnetlink_change_seq_adj(struct nf_conn *ct,
 		if (ret < 0)
 			return ret;
 
-		ct->status |= IPS_SEQ_ADJUST;
+		set_bit(IPS_SEQ_ADJUST_BIT, &ct->status);
 	}
 
 	return 0;
@@ -2289,10 +2304,10 @@ ctnetlink_update_status(struct nf_conn *ct, const struct nlattr * const cda[])
 	/* This check is less strict than ctnetlink_change_status()
 	 * because callers often flip IPS_EXPECTED bits when sending
 	 * an NFQA_CT attribute to the kernel.  So ignore the
-	 * unchangeable bits but do not error out.
+	 * unchangeable bits but do not error out. Also user programs
+	 * are allowed to clear the bits that they are allowed to change.
 	 */
-	ct->status = (status & ~IPS_UNCHANGEABLE_MASK) |
-		     (ct->status & IPS_UNCHANGEABLE_MASK);
+	__ctnetlink_change_status(ct, status, ~status);
 	return 0;
 }
 
-- 
2.1.4



* [PATCH 06/16] netfilter: ctnetlink: fix deadlock due to acquire _expect_lock twice
From: Pablo Neira Ayuso @ 2017-05-03  9:32 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493803931-2837-1-git-send-email-pablo@netfilter.org>

From: Liping Zhang <zlpnobody@gmail.com>

Currently, ctnetlink_change_conntrack is always protected by _expect_lock,
but this will cause a deadlock when deleting the helper from a conntrack,
as the _expect_lock will be acquired again by nf_ct_remove_expectations:

         CPU0
        ----
  lock(nf_conntrack_expect_lock);
  lock(nf_conntrack_expect_lock);

  *** DEADLOCK ***
  May be due to missing lock nesting notation

  2 locks held by lt-conntrack_gr/12853:
  #0:  (&table[i].mutex){+.+.+.}, at: [<ffffffffa05e2009>]
       nfnetlink_rcv_msg+0x399/0x6a9 [nfnetlink]
  #1:  (nf_conntrack_expect_lock){+.....}, at: [<ffffffffa05f2c1f>]
       ctnetlink_new_conntrack+0x17f/0x408 [nf_conntrack_netlink]

  Call Trace:
   dump_stack+0x85/0xc2
   __lock_acquire+0x1608/0x1680
   ? ctnetlink_parse_tuple_proto+0x10f/0x1c0 [nf_conntrack_netlink]
   lock_acquire+0x100/0x1f0
   ? nf_ct_remove_expectations+0x32/0x90 [nf_conntrack]
   _raw_spin_lock_bh+0x3f/0x50
   ? nf_ct_remove_expectations+0x32/0x90 [nf_conntrack]
   nf_ct_remove_expectations+0x32/0x90 [nf_conntrack]
   ctnetlink_change_helper+0xc6/0x190 [nf_conntrack_netlink]
   ctnetlink_new_conntrack+0x1b2/0x408 [nf_conntrack_netlink]
   nfnetlink_rcv_msg+0x60a/0x6a9 [nfnetlink]
   ? nfnetlink_rcv_msg+0x1b9/0x6a9 [nfnetlink]
   ? nfnetlink_bind+0x1a0/0x1a0 [nfnetlink]
   netlink_rcv_skb+0xa4/0xc0
   nfnetlink_rcv+0x87/0x770 [nfnetlink]

Since the operations are unrelated to nf_ct_expect, we can drop the
_expect_lock. Also note, after removing the _expect_lock protection,
another CPU may invoke nf_conntrack_helper_unregister, so we should
use rcu_read_lock to protect __nf_conntrack_helper_find invoked by
ctnetlink_change_helper.

Fixes: ca7433df3a67 ("netfilter: conntrack: seperate expect locking from nf_conntrack_lock")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conntrack_netlink.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 48c184552de0..e5f97777b1f4 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1510,23 +1510,29 @@ static int ctnetlink_change_helper(struct nf_conn *ct,
 		return 0;
 	}
 
+	rcu_read_lock();
 	helper = __nf_conntrack_helper_find(helpname, nf_ct_l3num(ct),
 					    nf_ct_protonum(ct));
-	if (helper == NULL)
+	if (helper == NULL) {
+		rcu_read_unlock();
 		return -EOPNOTSUPP;
+	}
 
 	if (help) {
 		if (help->helper == helper) {
 			/* update private helper data if allowed. */
 			if (helper->from_nlattr)
 				helper->from_nlattr(helpinfo, ct);
-			return 0;
+			err = 0;
 		} else
-			return -EBUSY;
+			err = -EBUSY;
+	} else {
+		/* we cannot set a helper for an existing conntrack */
+		err = -EOPNOTSUPP;
 	}
 
-	/* we cannot set a helper for an existing conntrack */
-	return -EOPNOTSUPP;
+	rcu_read_unlock();
+	return err;
 }
 
 static int ctnetlink_change_timeout(struct nf_conn *ct,
@@ -1945,9 +1951,7 @@ static int ctnetlink_new_conntrack(struct net *net, struct sock *ctnl,
 	err = -EEXIST;
 	ct = nf_ct_tuplehash_to_ctrack(h);
 	if (!(nlh->nlmsg_flags & NLM_F_EXCL)) {
-		spin_lock_bh(&nf_conntrack_expect_lock);
 		err = ctnetlink_change_conntrack(ct, cda);
-		spin_unlock_bh(&nf_conntrack_expect_lock);
 		if (err == 0) {
 			nf_conntrack_eventmask_report((1 << IPCT_REPLY) |
 						      (1 << IPCT_ASSURED) |
@@ -2342,11 +2346,7 @@ ctnetlink_glue_parse(const struct nlattr *attr, struct nf_conn *ct)
 	if (ret < 0)
 		return ret;
 
-	spin_lock_bh(&nf_conntrack_expect_lock);
-	ret = ctnetlink_glue_parse_ct((const struct nlattr **)cda, ct);
-	spin_unlock_bh(&nf_conntrack_expect_lock);
-
-	return ret;
+	return ctnetlink_glue_parse_ct((const struct nlattr **)cda, ct);
 }
 
 static int ctnetlink_glue_exp_parse(const struct nlattr * const *cda,
-- 
2.1.4



* [PATCH 05/16] netfilter: ctnetlink: drop the incorrect cthelper module request
From: Pablo Neira Ayuso @ 2017-05-03  9:32 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493803931-2837-1-git-send-email-pablo@netfilter.org>

From: Liping Zhang <zlpnobody@gmail.com>

First, when creating a new ct, we invoke request_module to try to
load the related in-kernel cthelper. So there's no need to call
request_module again when updating the ct helpinfo.

Second, ctnetlink_change_helper may be called with rcu_read_lock held,
i.e. rcu_read_lock -> nfqnl_recv_verdict -> nfqnl_ct_parse ->
ctnetlink_glue_parse -> ctnetlink_glue_parse_ct ->
ctnetlink_change_helper. But the request_module invocation may sleep,
so we can't call it with the rcu_read_lock held.

Remove it now.

Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conntrack_netlink.c | 17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index dc7dfd68fafe..48c184552de0 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1512,23 +1512,8 @@ static int ctnetlink_change_helper(struct nf_conn *ct,
 
 	helper = __nf_conntrack_helper_find(helpname, nf_ct_l3num(ct),
 					    nf_ct_protonum(ct));
-	if (helper == NULL) {
-#ifdef CONFIG_MODULES
-		spin_unlock_bh(&nf_conntrack_expect_lock);
-
-		if (request_module("nfct-helper-%s", helpname) < 0) {
-			spin_lock_bh(&nf_conntrack_expect_lock);
-			return -EOPNOTSUPP;
-		}
-
-		spin_lock_bh(&nf_conntrack_expect_lock);
-		helper = __nf_conntrack_helper_find(helpname, nf_ct_l3num(ct),
-						    nf_ct_protonum(ct));
-		if (helper)
-			return -EAGAIN;
-#endif
+	if (helper == NULL)
 		return -EOPNOTSUPP;
-	}
 
 	if (help) {
 		if (help->helper == helper) {
-- 
2.1.4



* [PATCH 03/16] netfilter: nf_ct_helper: permit cthelpers with different names via nfnetlink
From: Pablo Neira Ayuso @ 2017-05-03  9:31 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493803931-2837-1-git-send-email-pablo@netfilter.org>

From: Liping Zhang <zlpnobody@gmail.com>

cthelpers added via nfnetlink may have the same tuple, i.e. except for
the l3proto and l4proto, all other fields are zero. So even with
different names, we fail to add them:
  # nfct helper add ssdp inet udp
  # nfct helper add tftp inet udp
  nfct v1.4.3: netlink error: File exists

So in order to avoid unpredictable behaviour, we should:
1. Report an error if a duplicated { name, l3proto, l4proto } tuple
exists, since cthelpers can be selected by nft ct helper obj or xt_CT
target.
2. Also report an error if a duplicated { l3proto, l4proto, src-port }
tuple exists, since cthelpers can be selected by nf_ct_tuple_src_mask_cmp
when nf_ct_auto_assign_helper is enabled.

Also note, if the cthelper is added from userspace, then the src-port will
always be zero, which is invalid for nf_ct_auto_assign_helper, so there's
no need to check the second point listed above.
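
The name-based duplicate check (point 1) can be sketched with a flat
array instead of the helper hash table. demo_helper, demo_register and
DEMO_L3_UNSPEC are hypothetical names chosen for this example:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_L3_UNSPEC 0	/* stand-in for NFPROTO_UNSPEC */

/* Hypothetical, reduced helper description. */
struct demo_helper {
	const char *name;
	int l3num;
	int protonum;
};

/* Appends @me to @table unless an entry with the same name, a matching
 * (or unspecified) l3 protocol and the same l4 protocol already exists.
 */
static int demo_register(struct demo_helper *table, size_t *count,
			 size_t cap, const struct demo_helper *me)
{
	size_t i;

	assert(table && count && me && me->name);

	for (i = 0; i < *count; i++) {
		const struct demo_helper *cur = &table[i];

		if (!strcmp(cur->name, me->name) &&
		    (cur->l3num == DEMO_L3_UNSPEC ||
		     cur->l3num == me->l3num) &&
		    cur->protonum == me->protonum)
			return -EEXIST;
	}

	if (*count == cap)
		return -ENOSPC;

	table[(*count)++] = *me;
	return 0;
}
```

With this check, two helpers that share l3proto and l4proto but differ
in name can both be registered, while a second helper with the same name
is rejected.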

Fixes: 893e093c786c ("netfilter: nf_ct_helper: bail out on duplicated helpers")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conntrack_helper.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
index 4eeb3418366a..99bcd44aac70 100644
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -386,17 +386,33 @@ int nf_conntrack_helper_register(struct nf_conntrack_helper *me)
 	struct nf_conntrack_tuple_mask mask = { .src.u.all = htons(0xFFFF) };
 	unsigned int h = helper_hash(&me->tuple);
 	struct nf_conntrack_helper *cur;
-	int ret = 0;
+	int ret = 0, i;
 
 	BUG_ON(me->expect_policy == NULL);
 	BUG_ON(me->expect_class_max >= NF_CT_MAX_EXPECT_CLASSES);
 	BUG_ON(strlen(me->name) > NF_CT_HELPER_NAME_LEN - 1);
 
 	mutex_lock(&nf_ct_helper_mutex);
-	hlist_for_each_entry(cur, &nf_ct_helper_hash[h], hnode) {
-		if (nf_ct_tuple_src_mask_cmp(&cur->tuple, &me->tuple, &mask)) {
-			ret = -EEXIST;
-			goto out;
+	for (i = 0; i < nf_ct_helper_hsize; i++) {
+		hlist_for_each_entry(cur, &nf_ct_helper_hash[i], hnode) {
+			if (!strcmp(cur->name, me->name) &&
+			    (cur->tuple.src.l3num == NFPROTO_UNSPEC ||
+			     cur->tuple.src.l3num == me->tuple.src.l3num) &&
+			    cur->tuple.dst.protonum == me->tuple.dst.protonum) {
+				ret = -EEXIST;
+				goto out;
+			}
+		}
+	}
+
+	/* avoid unpredictable behaviour for auto_assign_helper */
+	if (!(me->flags & NF_CT_HELPER_F_USERSPACE)) {
+		hlist_for_each_entry(cur, &nf_ct_helper_hash[h], hnode) {
+			if (nf_ct_tuple_src_mask_cmp(&cur->tuple, &me->tuple,
+						     &mask)) {
+				ret = -EEXIST;
+				goto out;
+			}
 		}
 	}
 	hlist_add_head_rcu(&me->hnode, &nf_ct_helper_hash[h]);
-- 
2.1.4



* [PATCH 02/16] openvswitch: Delete conntrack entry clashing with an expectation.
From: Pablo Neira Ayuso @ 2017-05-03  9:31 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493803931-2837-1-git-send-email-pablo@netfilter.org>

From: Jarno Rajahalme <jarno@ovn.org>

Conntrack helpers do not check for a potentially clashing conntrack
entry when creating a new expectation.  Also, nf_conntrack_in() will
check expectations (via init_conntrack()) only if a conntrack entry
can not be found.  The expectation for a packet which also matches an
existing conntrack entry will not be removed by conntrack, and is
currently handled inconsistently by OVS, as OVS expects the
expectation to be removed when the connection tracking entry matching
that expectation is confirmed.

It should be noted that normally an IP stack would not allow reuse of
a 5-tuple of an old (possibly lingering) connection for a new data
connection, so this is a somewhat unlikely corner case.  However, it is
possible that a misbehaving source could cause conntrack entries to be
created that could then interfere with new related connections.

Fix this in the OVS module by deleting the clashing conntrack entry
after an expectation has been matched.  This causes the following
nf_conntrack_in() call to also find the expectation and remove it when
creating the new conntrack entry, and causes the forthcoming reply
direction packets to match the new related connection instead of the
old clashing conntrack entry.

Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Reported-by: Yang Song <yangsong@vmware.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/openvswitch/conntrack.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 7b2c2fce408a..c92a5795dcda 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -514,10 +514,38 @@ ovs_ct_expect_find(struct net *net, const struct nf_conntrack_zone *zone,
 		   u16 proto, const struct sk_buff *skb)
 {
 	struct nf_conntrack_tuple tuple;
+	struct nf_conntrack_expect *exp;

 	if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb), proto, net, &tuple))
 		return NULL;
-	return __nf_ct_expect_find(net, zone, &tuple);
+
+	exp = __nf_ct_expect_find(net, zone, &tuple);
+	if (exp) {
+		struct nf_conntrack_tuple_hash *h;
+
+		/* Delete existing conntrack entry, if it clashes with the
+		 * expectation.  This can happen since conntrack ALGs do not
+		 * check for clashes between (new) expectations and existing
+		 * conntrack entries.  nf_conntrack_in() will check the
+		 * expectations only if a conntrack entry can not be found,
+		 * which can lead to OVS finding the expectation (here) in the
+		 * init direction, but which will not be removed by the
+		 * nf_conntrack_in() call, if a matching conntrack entry is
+		 * found instead.  In this case all init direction packets
+		 * would be reported as new related packets, while reply
+		 * direction packets would be reported as un-related
+		 * established packets.
+		 */
+		h = nf_conntrack_find_get(net, zone, &tuple);
+		if (h) {
+			struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
+
+			nf_ct_delete(ct, 0, 0);
+			nf_conntrack_put(&ct->ct_general);
+		}
+	}
+
+	return exp;
 }

 /* This replicates logic from nf_conntrack_core.c that is not exported. */
-- 
2.1.4
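
For illustration only, the lookup order described in the commit message
can be modelled in userspace. This is a toy sketch under stated
assumptions: struct toy_tables, toy_find(), toy_conntrack_in() and
toy_expect_find() are hypothetical names, a tuple is reduced to a
non-zero int, and none of this is the kernel or OVS code. It only shows
why the expectation lingers when a clashing entry exists, and why
deleting that entry first lets the next lookup consume it.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 8

/* Toy state: a tuple is a non-zero int; 0 marks an empty slot. */
struct toy_tables {
	int conns[TOY_SLOTS];	/* existing conntrack entries */
	int expects[TOY_SLOTS];	/* pending expectations */
};

static int *toy_find(int *slots, int tuple)
{
	for (int i = 0; i < TOY_SLOTS; i++)
		if (slots[i] == tuple)
			return &slots[i];
	return NULL;
}

/* Models the lookup order: expectations are consulted (and consumed)
 * only when no existing entry matches.  Returns true if the packet
 * consumed an expectation, i.e. created a new related connection.
 */
static bool toy_conntrack_in(struct toy_tables *t, int tuple)
{
	int *exp, *slot;

	if (toy_find(t->conns, tuple))
		return false;	/* existing entry wins, expectation stays */

	exp = toy_find(t->expects, tuple);
	slot = toy_find(t->conns, 0);
	if (slot)
		*slot = tuple;	/* create the new entry */
	if (!exp)
		return false;
	*exp = 0;		/* expectation consumed */
	return true;
}

/* Models the expectation lookup; with delete_clash set it also drops
 * a clashing entry, as the patch does.
 */
static bool toy_expect_find(struct toy_tables *t, int tuple, bool delete_clash)
{
	int *conn;

	if (!toy_find(t->expects, tuple))
		return false;
	conn = toy_find(t->conns, tuple);
	if (delete_clash && conn)
		*conn = 0;
	return true;
}
```

With delete_clash false, toy_conntrack_in() keeps matching the old entry
and the expectation is never removed; with delete_clash true the same
call consumes the expectation and creates the new entry.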


* [PATCH 00/16] Netfilter/IPVS/OVS fixes for net
From: Pablo Neira Ayuso @ 2017-05-03  9:31 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains a rather large batch of Netfilter, IPVS
and OVS fixes for your net tree. This includes fixes for ctnetlink, the
userspace conntrack helper infrastructure, conntrack OVS support,
ebtables DNAT target, several leaks in error paths, among others. More
specifically, they are:

1) Fix reference count leak in the CT target error path, from Gao Feng.

2) Remove conntrack entry clashing with a matching expectation, patch
   from Jarno Rajahalme.

3) Fix bogus EEXIST when registering two different userspace helpers,
   from Liping Zhang.

4) Don't leak dummy elements in the new bitmap set type in nf_tables,
   from Liping Zhang.

5) Get rid of module autoload from conntrack update path in ctnetlink,
   we don't need autoload at this late stage and it is happening with
   rcu read lock held which is not good. From Liping Zhang.

6) Fix deadlock due to double-acquire of the expect_lock from conntrack
   update path, this fixes a bug that was introduced when the central
   spinlock got removed. Again from Liping Zhang.

7) Safe ct->status update from ctnetlink path, from Liping. The expect_lock
   protection that was selected when the central spinlock was removed was
   not really protecting anything at all.

8) Protect sequence adjustment under ct->lock.

9) Missing socket match with IPv6, from Peter Tirsek.

10) Adjust skb->pkt_type of DNAT'ed frames from ebtables, from
    Linus Luessing.

11) Don't give up on evaluating the expression on new entries added via
    dynset expression in nf_tables, from Liping Zhang.

12) Use skb_checksum() when mangling icmpv6 in IPv6 NAT as this deals
    with non-linear skbuffs.

13) Don't allow IPv6 service in IPVS if no IPv6 support is available,
    from Paolo Abeni.

14) Missing mutex release in error path of xt_find_table_lock(), from
    Dan Carpenter.

15) Update maintainers files, Netfilter section. Add Florian to the
    file, refer to nftables.org and change project status from Supported
    to Maintained.

16) Bail out on mismatching extensions in element updates in nf_tables.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit 94836ecf1e7378b64d37624fbb81fe48fbd4c772:

  Merge tag 'nfsd-4.11-2' of git://linux-nfs.org/~bfields/linux (2017-04-21 16:37:48 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 9744a6fcefcb4d56501d69adb04c24559d353cad:

  netfilter: nf_tables: check if same extensions are set when adding elements (2017-05-03 10:58:00 +0200)

----------------------------------------------------------------
Dan Carpenter (1):
      netfilter: x_tables: unlock on error in xt_find_table_lock()

Dave Johnson (1):
      netfilter: Wrong icmp6 checksum for ICMPV6_TIME_EXCEED in reverse SNATv6 path

Gao Feng (1):
      netfilter: xt_CT: fix refcnt leak on error path

Jarno Rajahalme (1):
      openvswitch: Delete conntrack entry clashing with an expectation.

Linus Lüssing (1):
      bridge: ebtables: fix reception of frames DNAT-ed to bridge device/port

Liping Zhang (7):
      netfilter: nf_ct_helper: permit cthelpers with different names via nfnetlink
      netfilter: nft_set_bitmap: free dummy elements when destroy the set
      netfilter: ctnetlink: drop the incorrect cthelper module request
      netfilter: ctnetlink: fix deadlock due to acquire _expect_lock twice
      netfilter: ctnetlink: make it safer when updating ct->status
      netfilter: ctnetlink: acquire ct->lock before operating nf_ct_seqadj
      netfilter: nft_dynset: continue to next expr if _OP_ADD succeeded

Pablo Neira Ayuso (3):
      Merge tag 'ipvs-fixes-for-v4.11' of http://git.kernel.org/.../horms/ipvs
      netfilter: update MAINTAINERS file
      netfilter: nf_tables: check if same extensions are set when adding elements

Paolo Abeni (1):
      ipvs: explicitly forbid ipv6 service/dest creation if ipv6 mod is disabled

Peter Tirsek (1):
      netfilter: xt_socket: Fix broken IPv6 handling

 MAINTAINERS                                        |  4 +-
 include/uapi/linux/netfilter/nf_conntrack_common.h | 13 +++-
 net/bridge/netfilter/ebt_dnat.c                    | 20 +++++
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c           |  2 +-
 net/netfilter/ipvs/ip_vs_ctl.c                     | 22 ++++--
 net/netfilter/nf_conntrack_helper.c                | 26 +++++--
 net/netfilter/nf_conntrack_netlink.c               | 89 ++++++++++++----------
 net/netfilter/nf_tables_api.c                      |  5 ++
 net/netfilter/nft_dynset.c                         |  5 +-
 net/netfilter/nft_set_bitmap.c                     |  5 ++
 net/netfilter/x_tables.c                           |  4 +-
 net/netfilter/xt_CT.c                              | 11 ++-
 net/netfilter/xt_socket.c                          |  2 +-
 net/openvswitch/conntrack.c                        | 30 +++++++-
 14 files changed, 174 insertions(+), 64 deletions(-)


* RE: [PATCH] Fix for new version of realtek r8153
From: Hayes Wang @ 2017-05-03  9:18 UTC (permalink / raw)
  To: jake Briggs, mario_limonciello-8PEkshWhKlo@public.gmane.org,
	linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
  Cc: jake, nic_swsd
In-Reply-To: <20170502232048.9153-1-nexussix-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

jake Briggs [mailto:nexussix-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org]
> Sent: Wednesday, May 03, 2017 7:21 AM
[...]
> diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
> index 07f788c49d57..2a55459fdfac 100644
> --- a/drivers/net/usb/r8152.c
> +++ b/drivers/net/usb/r8152.c
> @@ -4277,6 +4277,7 @@ static void r8152b_get_version(struct r8152 *tp)
>  		tp->mii.supports_gmii = 1;
>  		break;
>  	case 0x5c30:
> +	case 0x6010:

The two chips are different. I don't think it is a good idea.
Maybe you could use the driver from the Realtek website first.

>  		tp->version = RTL_VER_06;
>  		tp->mii.supports_gmii = 1;
>  		break;


* [iproute PATCH] man: ip.8: Document -brief flag
From: Phil Sutter @ 2017-05-03  9:07 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Brief output is especially useful for new users, so at least mention
its existence in the ip man page.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 man/man8/ip.8 | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/man/man8/ip.8 b/man/man8/ip.8
index 1c5a7419e4fc2..ae018fdf11ac9 100644
--- a/man/man8/ip.8
+++ b/man/man8/ip.8
@@ -48,7 +48,8 @@ ip \- show / manipulate routing, devices, policy routing and tunnels
 \fB\-ts\fR[\fIhort\fR] |
 \fB\-n\fR[\fIetns\fR] name |
 \fB\-a\fR[\fIll\fR] |
-\fB\-c\fR[\fIolor\fR] }
+\fB\-c\fR[\fIolor\fR]
+\fB\-br\fR[\fIief\fR] }
 
 
 .SH OPTIONS
@@ -206,6 +207,11 @@ Set the netlink socket receive buffer size, defaults to 1MB.
 .BR "\-iec"
 print human readable rates in IEC units (e.g. 1Ki = 1024).
 
+.TP
+.BR "\-br" , "\-brief"
+Print only basic information in a tabular format for better readability. This option is currently only supported by
+.BR "ip addr show " and " ip link show " commands.
+
 .SH IP - COMMAND SYNTAX
 
 .SS
-- 
2.11.0


* [PATCH RESEND 4.4-only] netlink: Allow direct reclaim for fallback allocation
From: Ross Lagerwall @ 2017-05-03  8:44 UTC (permalink / raw)
  To: stable
  Cc: Ross Lagerwall, David S. Miller, Greg Kroah-Hartman, Eric Dumazet,
	netdev, linux-kernel

The backport of d35c99ff77ec ("netlink: do not enter direct reclaim from
netlink_dump()") to the 4.4 branch (first in 4.4.32) mistakenly removed
direct reclaim from the initial large allocation _and_ the fallback
allocation, which means that allocations can spuriously fail.
Fix the issue by adding back the direct reclaim flag to the fallback
allocation.

Fixes: 6d123f1d396b ("netlink: do not enter direct reclaim from netlink_dump()")
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
---

Note that this is only for the 4.4 branch as the regression is only in
this branch. Consequently, there is no corresponding upstream commit.

I'm resending this to the linux-stable list since I now understand the
netdev maintainer only handles backports for the last couple of versions
of Linux.

 net/netlink/af_netlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 8e33019..acfb16f 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2107,7 +2107,7 @@ static int netlink_dump(struct sock *sk)
 	if (!skb) {
 		alloc_size = alloc_min_size;
 		skb = netlink_alloc_skb(sk, alloc_size, nlk->portid,
-					(GFP_KERNEL & ~__GFP_DIRECT_RECLAIM));
+					GFP_KERNEL);
 	}
 	if (!skb)
 		goto errout_skb;
-- 
2.7.4
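
For illustration only, the effect of the flag mask can be modelled in
userspace. The sketch below is a toy under stated assumptions: the
TOY_* flag values, the buffer sizes and toy_alloc() are hypothetical and
are not the kernel's GFP flags or allocator. Only the shape of the
two-step allocation (large attempt without direct reclaim, then a
fallback) follows what the patch describes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits only; these are not the kernel's GFP values. */
#define TOY_DIRECT_RECLAIM	0x1u
#define TOY_KSWAPD_RECLAIM	0x2u
#define TOY_IO			0x4u
#define TOY_FS			0x8u
#define TOY_GFP_KERNEL \
	(TOY_DIRECT_RECLAIM | TOY_KSWAPD_RECLAIM | TOY_IO | TOY_FS)

#define TOY_MAX_SIZE	16384u	/* hypothetical large dump buffer */
#define TOY_MIN_SIZE	4096u	/* hypothetical fallback buffer */

/* Hypothetical allocator model: under memory pressure an allocation
 * only succeeds if it is small and may enter direct reclaim.
 */
static bool toy_alloc(size_t size, unsigned int flags, bool pressure)
{
	if (!pressure)
		return true;
	return (flags & TOY_DIRECT_RECLAIM) && size <= TOY_MIN_SIZE;
}

/* Two-step allocation: opportunistic large attempt without direct
 * reclaim, then a fallback using fallback_flags.  Returns the size
 * obtained, or 0 if both attempts failed.
 */
static size_t toy_dump_alloc(bool pressure, unsigned int fallback_flags)
{
	if (toy_alloc(TOY_MAX_SIZE, TOY_GFP_KERNEL & ~TOY_DIRECT_RECLAIM,
		      pressure))
		return TOY_MAX_SIZE;
	if (toy_alloc(TOY_MIN_SIZE, fallback_flags, pressure))
		return TOY_MIN_SIZE;
	return 0;
}
```

In this model, passing the masked flags as fallback_flags under pressure
yields 0 (the spurious failure), while passing TOY_GFP_KERNEL yields
TOY_MIN_SIZE.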


* [PATCH 1/1] net: usb: qmi_wwan: add Telit ME910 support
From: Daniele Palmas @ 2017-05-03  8:30 UTC (permalink / raw)
  To: Bjørn Mork; +Cc: netdev, Daniele Palmas

This patch adds support for Telit ME910 PID 0x1100.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
---

0x1100 composition is:

tty + qdss + tty + rmnet

Following lsusb output:

Bus 003 Device 018: ID 1bc7:1100 Telit Wireless Solutions 
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x1bc7 Telit Wireless Solutions
  idProduct          0x1100 
  bcdDevice            0.00
  iManufacturer           3 Telit
  iProduct                2 Telit ME910
  iSerial                 4 1f5fec
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength          108
    bNumInterfaces          4
    bConfigurationValue     1
    iConfiguration          1 Telit Configuration
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower              500mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        1
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        2
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               5
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x84  EP 4 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        3
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x85  EP 5 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               5
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x86  EP 6 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x03  EP 3 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0000
  (Bus Powered)

---
 drivers/net/usb/qmi_wwan.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index a3ed811..d716576 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1201,6 +1201,7 @@ static const struct usb_device_id products[] = {
 	{QMI_FIXED_INTF(0x2357, 0x0201, 4)},	/* TP-LINK HSUPA Modem MA180 */
 	{QMI_FIXED_INTF(0x2357, 0x9000, 4)},	/* TP-LINK MA260 */
 	{QMI_QUIRK_SET_DTR(0x1bc7, 0x1040, 2)},	/* Telit LE922A */
+	{QMI_FIXED_INTF(0x1bc7, 0x1100, 3)},	/* Telit ME910 */
 	{QMI_FIXED_INTF(0x1bc7, 0x1200, 5)},	/* Telit LE920 */
 	{QMI_QUIRK_SET_DTR(0x1bc7, 0x1201, 2)},	/* Telit LE920, LE920A4 */
 	{QMI_FIXED_INTF(0x1c9e, 0x9b01, 3)},	/* XS Stick W100-2 from 4G Systems */
-- 
2.7.4
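
As a rough userspace illustration of what a fixed-interface entry
expresses: the driver should bind only to the interface number that
carries rmnet, which in the composition above (tty + qdss + tty + rmnet)
is interface 3. The names struct toy_qmi_id, toy_products and
toy_qmi_match() are hypothetical and exist only for this sketch; the
real matching is done by the USB core, not by code like this.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a fixed-interface table entry. */
struct toy_qmi_id {
	uint16_t vendor;
	uint16_t product;
	uint8_t intf;	/* bInterfaceNumber expected to carry rmnet */
};

static const struct toy_qmi_id toy_products[] = {
	{ 0x1bc7, 0x1100, 3 },	/* Telit ME910: tty + qdss + tty + rmnet */
	{ 0x1bc7, 0x1200, 5 },	/* Telit LE920 */
};

/* Returns true only if vendor, product and interface number all match
 * one table entry.
 */
static bool toy_qmi_match(uint16_t vendor, uint16_t product, uint8_t intf)
{
	size_t n = sizeof(toy_products) / sizeof(toy_products[0]);

	for (size_t i = 0; i < n; i++) {
		const struct toy_qmi_id *id = &toy_products[i];

		if (id->vendor == vendor && id->product == product &&
		    id->intf == intf)
			return true;
	}
	return false;
}
```

Here toy_qmi_match(0x1bc7, 0x1100, 3) matches, while the same device
with interface 0 (a tty) does not.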


* Re: ipsec doesn't route TCP with 4.11 kernel
From: Steffen Klassert @ 2017-05-03  8:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Don Bowman, Cong Wang, linux-kernel@vger.kernel.org, Herbert Xu,
	Linux Kernel Network Developers
In-Reply-To: <1493398002.31837.12.camel@edumazet-glaptop3.roam.corp.google.com>

On Fri, Apr 28, 2017 at 09:46:42AM -0700, Eric Dumazet wrote:
> On Fri, 2017-04-28 at 09:13 +0200, Steffen Klassert wrote:
> >          encap type espinudp sport 4500 dport 4500 addr 0.0.0.0
> > 
> > Ok, this is espinudp. This information was important.
> 
> > This is not a GRO issue as I thought, the TX side is already broken.
> > 
> > Could you please try the patch below?
> > 
> > Subject: [PATCH] esp4: Fix udpencap for local TCP packets.
> > 
> > Locally generated TCP packets are usually cloned, so we
> > do skb_cow_data() on these packets. After that we need to
> > reload the pointer to the esp header. On udpencap this
> > header has an offset to skb_transport_header, so take this
> > offset into account.
> 
> 
> It looks like locally generated TCP packets could avoid the
> skb_cow_data(), if you were using skb_header_cloned() instead of
> skb_cloned()  ?

Yes, should be possible in the codepath where we do crypto
with separate src and dst buffers. Would require some
rearrangements to make sure we don't do inplace crypto
in this case.

Thanks for the hint!


* Re: ipsec doesn't route TCP with 4.11 kernel
From: Steffen Klassert @ 2017-05-03  8:14 UTC (permalink / raw)
  To: Don Bowman
  Cc: Cong Wang, linux-kernel@vger.kernel.org, Herbert Xu,
	Linux Kernel Network Developers
In-Reply-To: <CADJev7_Tc0aRsPs0Q7Wijd-YBM39ZshitJpSo2yEqPVwag2X_Q@mail.gmail.com>

On Sat, Apr 29, 2017 at 08:39:34PM -0400, Don Bowman wrote:
> On 28 April 2017 at 03:13, Steffen Klassert
> <steffen.klassert@secunet.com> wrote:
> > On Thu, Apr 27, 2017 at 06:13:38PM -0400, Don Bowman wrote:
> >> On 27 April 2017 at 04:42, Steffen Klassert <steffen.klassert@secunet.com>
> >> wrote:
> >> > On Wed, Apr 26, 2017 at 10:01:34PM -0700, Cong Wang wrote:
> >> >> (Cc'ing netdev and IPSec maintainers)
> >> >>
> >> >> On Tue, Apr 25, 2017 at 6:08 PM, Don Bowman <db@donbowman.ca> wrote:
> >>
> 
> <snip>
> 
> Confirmed: with this patch in place, TCP functions properly.

Thanks for testing!

I'll make sure to get this fix into the mainline soon.


* Re: [PATCH] net: ethernet: stmmac: properly set PS bit in MII configurations during reset
From: Giuseppe CAVALLARO @ 2017-05-03  8:13 UTC (permalink / raw)
  To: Thomas Petazzoni, Alexandre Torgue; +Cc: netdev, stable
In-Reply-To: <1493286329-24448-1-git-send-email-thomas.petazzoni@free-electrons.com>

Hello Thomas,

This was initially set by using hw->link.port; both the core_init
and adjust callbacks should invoke the hook and tune the PS bit
according to the speed and mode.
So maybe ->set_ps is superfluous and you could reuse the existing hook.

Let me know.

Regards
peppe

On 4/27/2017 11:45 AM, Thomas Petazzoni wrote:
> On the SPEAr600 SoC, which has the dwmac1000 variant of the IP block,
> the DMA reset never succeeds when a MII PHY is used (no problem with a
> GMII PHY). The dwmac_dma_reset() function sets the
> DMA_BUS_MODE_SFT_RESET bit in the DMA_BUS_MODE register, and then
> polls until this bit clears. When a MII PHY is used, with the current
> driver, this bit never clears and the driver therefore doesn't work.
>
> The reason is that the PS bit of the GMAC_CONTROL register should be
> correctly configured for the DMA reset to work. When the PS bit is 0,
> it tells the MAC we have a GMII PHY, when the PS bit is 1, it tells
> the MAC we have a MII PHY.
>
> Doing a DMA reset clears all registers, so the PS bit is cleared as
> well. This makes the DMA reset work fine with a GMII PHY. However,
> with MII PHY, the PS bit should be set.
>
> We have identified this issue thanks to two SPEAr600 platforms:
>
>   - One equipped with a GMII PHY, with which the existing driver was
>     working fine.
>
>   - One equipped with a MII PHY, where the current driver fails because
>     the DMA reset times out.
>
> This patch fixes the problem for the MII PHY configuration, and has
> been tested with a GMII PHY configuration as well.
>
> In terms of implementation, since the ->reset() hook is implemented in
> the DMA related code, we do not want to touch the MAC registers
> directly from this function. Therefore, a ->set_ps() hook has been
> added to stmmac_ops, which gets called between the moment the reset is
> asserted and the polling loop waiting for the reset bit to clear.
>
> In order for this ->set_ps() hook to decide what to do, we pass it the
> "struct mac_device_info" so it can access the MAC registers, and the
> PHY interface type so it knows if we're using a MII PHY or not.
>
> The ->set_ps() hook is only implemented for the dwmac1000 case.
>
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> Cc: <stable@vger.kernel.org>
> ---
> Do not hesitate to suggest ideas for alternative implementations, I'm
> not sure if the current proposal is the one that fits best with the
> current design of the driver.
> ---
>   drivers/net/ethernet/stmicro/stmmac/common.h         | 12 +++++++++---
>   drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c | 16 ++++++++++++++++
>   drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h     |  3 ++-
>   drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c     |  7 ++++++-
>   drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h      |  3 ++-
>   drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c      |  6 +++++-
>   drivers/net/ethernet/stmicro/stmmac/stmmac_main.c    |  3 ++-
>   7 files changed, 42 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
> index 04d9245..d576f95 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/common.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
> @@ -407,10 +407,13 @@ struct stmmac_desc_ops {
>   extern const struct stmmac_desc_ops enh_desc_ops;
>   extern const struct stmmac_desc_ops ndesc_ops;
>   
> +struct mac_device_info;
> +
>   /* Specific DMA helpers */
>   struct stmmac_dma_ops {
>   	/* DMA core initialization */
> -	int (*reset)(void __iomem *ioaddr);
> +	int (*reset)(void __iomem *ioaddr, struct mac_device_info *hw,
> +		     phy_interface_t interface);
>   	void (*init)(void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg,
>   		     u32 dma_tx, u32 dma_rx, int atds);
>   	/* Configure the AXI Bus Mode Register */
> @@ -445,12 +448,15 @@ struct stmmac_dma_ops {
>   	void (*enable_tso)(void __iomem *ioaddr, bool en, u32 chan);
>   };
>   
> -struct mac_device_info;
> -
>   /* Helpers to program the MAC core */
>   struct stmmac_ops {
>   	/* MAC core initialization */
>   	void (*core_init)(struct mac_device_info *hw, int mtu);
> +	/* Set port select. Called between asserting DMA reset and
> +	 * waiting for the reset bit to clear.
> +	 */
> +	void (*set_ps)(struct mac_device_info *hw,
> +		       phy_interface_t interface);
>   	/* Enable and verify that the IPC module is supported */
>   	int (*rx_ipc)(struct mac_device_info *hw);
>   	/* Enable RX Queues */
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
> index 19b9b308..dfcbb5b 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
> @@ -75,6 +75,21 @@ static void dwmac1000_core_init(struct mac_device_info *hw, int mtu)
>   #endif
>   }
>   
> +static void dwmac1000_set_ps(struct mac_device_info *hw,
> +			     phy_interface_t interface)
> +{
> +	void __iomem *ioaddr = hw->pcsr;
> +	u32 value = readl(ioaddr + GMAC_CONTROL);
> +
> +	/* When a MII PHY is used, we must set the PS bit for the DMA
> +	 * reset to succeed.
> +	 */
> +	if (interface == PHY_INTERFACE_MODE_MII)
> +		value |= GMAC_CONTROL_PS;
> +
> +	writel(value, ioaddr + GMAC_CONTROL);
> +}
> +
>   static int dwmac1000_rx_ipc_enable(struct mac_device_info *hw)
>   {
>   	void __iomem *ioaddr = hw->pcsr;
> @@ -488,6 +503,7 @@ static void dwmac1000_debug(void __iomem *ioaddr, struct stmmac_extra_stats *x)
>   
>   static const struct stmmac_ops dwmac1000_ops = {
>   	.core_init = dwmac1000_core_init,
> +	.set_ps = dwmac1000_set_ps,
>   	.rx_ipc = dwmac1000_rx_ipc_enable,
>   	.dump_regs = dwmac1000_dump_regs,
>   	.host_irq_status = dwmac1000_irq_status,
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> index 1b06df7..e9c6c49 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> @@ -183,7 +183,8 @@
>   #define DMA_CHAN0_DBG_STAT_RPS		GENMASK(11, 8)
>   #define DMA_CHAN0_DBG_STAT_RPS_SHIFT	8
>   
> -int dwmac4_dma_reset(void __iomem *ioaddr);
> +int dwmac4_dma_reset(void __iomem *ioaddr, struct mac_device_info *hw,
> +		     phy_interface_t interface);
>   void dwmac4_enable_dma_transmission(void __iomem *ioaddr, u32 tail_ptr);
>   void dwmac4_enable_dma_irq(void __iomem *ioaddr);
>   void dwmac410_enable_dma_irq(void __iomem *ioaddr);
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
> index c7326d5..485eecb 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
> @@ -14,7 +14,8 @@
>   #include "dwmac4_dma.h"
>   #include "dwmac4.h"
>   
> -int dwmac4_dma_reset(void __iomem *ioaddr)
> +int dwmac4_dma_reset(void __iomem *ioaddr, struct mac_device_info *hw,
> +		     phy_interface_t interface)
>   {
>   	u32 value = readl(ioaddr + DMA_BUS_MODE);
>   	int limit;
> @@ -22,6 +23,10 @@ int dwmac4_dma_reset(void __iomem *ioaddr)
>   	/* DMA SW reset */
>   	value |= DMA_BUS_MODE_SFT_RESET;
>   	writel(value, ioaddr + DMA_BUS_MODE);
> +
> +	if (hw->mac->set_ps)
> +		hw->mac->set_ps(hw, interface);
> +
>   	limit = 10;
>   	while (limit--) {
>   		if (!(readl(ioaddr + DMA_BUS_MODE) & DMA_BUS_MODE_SFT_RESET))
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
> index 56e485f..25ae028 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
> @@ -144,6 +144,7 @@ void dwmac_dma_stop_tx(void __iomem *ioaddr);
>   void dwmac_dma_start_rx(void __iomem *ioaddr);
>   void dwmac_dma_stop_rx(void __iomem *ioaddr);
>   int dwmac_dma_interrupt(void __iomem *ioaddr, struct stmmac_extra_stats *x);
> -int dwmac_dma_reset(void __iomem *ioaddr);
> +int dwmac_dma_reset(void __iomem *ioaddr, struct mac_device_info *hw,
> +		    phy_interface_t interface);
>   
>   #endif /* __DWMAC_DMA_H__ */
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
> index e60bfca..1a17df5 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
> @@ -23,7 +23,8 @@
>   
>   #define GMAC_HI_REG_AE		0x80000000
>   
> -int dwmac_dma_reset(void __iomem *ioaddr)
> +int dwmac_dma_reset(void __iomem *ioaddr, struct mac_device_info *hw,
> +		    phy_interface_t interface)
>   {
>   	u32 value = readl(ioaddr + DMA_BUS_MODE);
>   	int err;
> @@ -32,6 +33,9 @@ int dwmac_dma_reset(void __iomem *ioaddr)
>   	value |= DMA_BUS_MODE_SFT_RESET;
>   	writel(value, ioaddr + DMA_BUS_MODE);
>   
> +	if (hw->mac->set_ps)
> +		hw->mac->set_ps(hw, interface);
> +
>   	err = readl_poll_timeout(ioaddr + DMA_BUS_MODE, value,
>   				 !(value & DMA_BUS_MODE_SFT_RESET),
>   				 100000, 10000);
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 4498a38..66bc218 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -1585,7 +1585,8 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
>   	if (priv->extend_desc && (priv->mode == STMMAC_RING_MODE))
>   		atds = 1;
>   
> -	ret = priv->hw->dma->reset(priv->ioaddr);
> +	ret = priv->hw->dma->reset(priv->ioaddr, priv->hw,
> +				   priv->plat->interface);
>   	if (ret) {
>   		dev_err(priv->device, "Failed to reset the dma\n");
>   		return ret;


* Re: [net-next PATCH 1/4] samples/bpf: adjust rlimit RLIMIT_MEMLOCK for traceex2, tracex3 and tracex4
From: Jesper Dangaard Brouer @ 2017-05-03  8:12 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: kafai, netdev, eric, Daniel Borkmann, brouer
In-Reply-To: <20170503005314.7oovr764r3e4elzd@ast-mbp>

On Tue, 2 May 2017 17:53:16 -0700
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Tue, May 02, 2017 at 02:31:50PM +0200, Jesper Dangaard Brouer wrote:
> > Needed to adjust max locked memory RLIMIT_MEMLOCK for testing these bpf samples
> > as these are using more and larger maps than can fit in distro default 64Kbytes limit.
> > 
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>  
> ...
> > +	struct rlimit r = {1024*1024, RLIM_INFINITY};  
> ...
> > +	struct rlimit r = {1024*1024, RLIM_INFINITY};  
> 
> why magic numbers?
> All other samples do
> struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};

I just wanted to provide some examples showing that it is possible to
set some reasonable limit.

The RLIM_INFINITY setting basically just disables the kernel's
memory limit checks, and it is sort of a bad coding pattern (that
people will copy) as the two example programs do not need much.

> > +	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
> > +		perror("setrlimit(RLIMIT_MEMLOCK)");  
> 
> ip_tunnel.c test does:
> perror("setrlimit(RLIMIT_MEMLOCK, RLIM_INFINITY)");
> Few others do:
> assert(!setrlimit(RLIMIT_MEMLOCK, &r));
> and the rest just:
> setrlimit(RLIMIT_MEMLOCK, &r);
> 
> We probalby need to move this to a helper.
> 
> > +	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};  
> 
> here it's consistent :)
> 
> > +	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
> > +		perror("setrlimit(RLIMIT_MEMLOCK, RLIM_INFINITY)");  
> 
> but with different perror ?
> Let's do a common helper for all?

Sure, it makes sense to streamline this into a helper, just not in this
patchset ;-)  Let's do that later...

And I would argue that this helper should allow users to specify some
expected/reasonable memory usage size, as the kernel side checks would
then provide some value, instead of being effectively disabled.  I can
easily imagine someone increasing a _kern.c hash map max size to
100 million, without realizing that this can OOM the machine.
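
To illustrate the helper idea (not part of this patchset), here is a
minimal sketch. The names memlock_rlimit() and bump_memlock_rlimit() are
hypothetical. The caller passes the expected locked-memory size as the
soft limit, and the hard limit stays at RLIM_INFINITY as in the tracex
hunks quoted above.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: build an rlimit with a bounded soft limit.
 * The hard limit is left at RLIM_INFINITY, matching the quoted hunks.
 */
static struct rlimit memlock_rlimit(rlim_t soft_bytes)
{
	struct rlimit r = {
		.rlim_cur = soft_bytes,
		.rlim_max = RLIM_INFINITY,
	};

	return r;
}

/* Hypothetical helper: apply the limit to the calling process.
 * Returns 0 on success, -1 on failure (after printing the reason).
 */
static int bump_memlock_rlimit(rlim_t soft_bytes)
{
	struct rlimit r = memlock_rlimit(soft_bytes);

	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
		perror("setrlimit(RLIMIT_MEMLOCK)");
		return -1;
	}
	return 0;
}
```

A sample would then call something like bump_memlock_rlimit(1024 * 1024)
before creating its maps. Note that a process without CAP_SYS_RESOURCE
cannot raise its hard limit, so the call may fail if the current hard
limit is below RLIM_INFINITY.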

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [PATCH] brcmfmac: btcoex: replace init_timer with setup_timer
From: Arend van Spriel @ 2017-05-03  8:05 UTC (permalink / raw)
  To: Xie Qirong, Franky Lin, Hante Meuleman, Kalle Valo
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	brcm80211-dev-list.pdl-dY08KVG/lbpWk0Htik3J/w,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Piotr Haber
In-Reply-To: <20170503073555.3922-1-cheerx1994-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

On 5/3/2017 9:35 AM, Xie Qirong wrote:
> Signed-off-by: Xie Qirong <cheerx1994-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
> 
>   setup_timer.cocci suggested the following improvement:
>   drivers/net/wireless/broadcom/brcm80211/brcmfmac/btcoex.c:383:1-11: Use
>   setup_timer function for function on line 384.

Move the text above before your sign-off so it will end up in the git 
commit message.

When done you may also add my acknowledgement, i.e.:

Acked-by: Arend van Spriel <arend.vanspriel-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

Regards,
Arend


* Re: [PATCH net-next v2] net: ipv6: make sure multicast packets are not forwarded beyond the different scopes
From: Donatas Abraitis @ 2017-05-03  7:53 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, stable
In-Reply-To: <20170502.145923.66844914584656456.davem@davemloft.net>

Looks like there is this test already:

                if (IPV6_ADDR_MC_SCOPE(&ipv6_hdr(skb)->daddr) <=
                    IPV6_ADDR_SCOPE_NODELOCAL &&
                    !(dev->flags & IFF_LOOPBACK)) {
                        kfree_skb(skb);
                        return 0;
                }
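
For reference, a minimal userspace sketch of what that test compares.
The helper names and the enum are hypothetical. It assumes the scope is
the low four bits of the second address byte (the scop field from
RFC 4291 2.7) and that interface-local scope has the value 1.

```c
#include <netinet/in.h>

/* Scope values from RFC 4291 section 2.7 (enum names are illustrative). */
enum mc_scope {
	MC_SCOPE_IFACE_LOCAL = 0x1,
	MC_SCOPE_LINK_LOCAL  = 0x2,
	MC_SCOPE_SITE_LOCAL  = 0x5,
	MC_SCOPE_GLOBAL      = 0xe,
};

/* The low nibble of the second address byte is the 4-bit scop field. */
static int mc_scope_of(const struct in6_addr *addr)
{
	return addr->s6_addr[1] & 0x0f;
}

/* Mirrors the quoted condition: drop when the scope is at most
 * interface-local and the receiving device is not loopback.
 */
static int should_drop(const struct in6_addr *daddr, int dev_is_loopback)
{
	return mc_scope_of(daddr) <= MC_SCOPE_IFACE_LOCAL && !dev_is_loopback;
}
```

With this sketch, ff01::1 arriving on a non-loopback device is dropped,
while ff02::1 is not.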

On Tue, May 2, 2017 at 9:59 PM, David Miller <davem@davemloft.net> wrote:
> From: Donatas Abraitis <donatas.abraitis@gmail.com>
> Date: Thu, 27 Apr 2017 10:12:02 +0300
>
>>           RFC4291 2.7 Routers must not forward any multicast packets
>>           beyond of the scope indicated by the scop field in the
>>           destination multicast address.
>>
>> Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
>
> I think it's a ">=" test which is needed here, not pure equality.
> Scopes are subsets of other scopes and are therefore allowed within
> each other.
>
> Did you actually see misbehavior due to this issue, or see a real
> bona fide conformance test fail?
>
> If you're just reading the RFC and sticking tests here and there based
> upon what you read, without any testing or real life verification of
> the issue, this is _strongly_ discouraged.
>
> It would even be ok if you merely showed how another open source
> networking stack makes this test.



-- 
Donatas


* [PATCH net] tg3: don't clear stats while tg3_close
From: YueHaibing @ 2017-05-03  7:51 UTC (permalink / raw)
  To: davem, netdev; +Cc: weiyongjun1

Currently the tg3 NIC's stats are cleared after ifdown/ifup. bond_get_stats
traverses its slaves to collect statistics and accumulates the increments. If a
tg3 NIC is added to a bond as a slave, ifdown/ifup makes the bonding stats jump
to a tremendous value (e.g. 1638.3 PiB) because of the negative increment.
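
The arithmetic behind such a jump can be sketched as follows. This is an
illustration only, not the bonding code: counter_delta() is a hypothetical
helper, and the sketch assumes the increment is computed with unsigned
64-bit counters.

```c
#include <stdint.h>

/* Hypothetical helper: increment between two samples of a counter.
 * With unsigned arithmetic, a counter that went backwards (cur < prev)
 * does not produce a negative number; the result wraps around modulo
 * 2^64 and becomes a very large positive value.
 */
static uint64_t counter_delta(uint64_t cur, uint64_t prev)
{
	return cur - prev;
}
```

For example, if the previous sample was 1000 and the counter reads 0
after being cleared, the computed increment is 2^64 - 1000 rather
than -1000.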

Fixes: 92feeabf3f67 ("tg3: Save stats across chip resets")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/net/ethernet/broadcom/tg3.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 30d1eb9..29beba1 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -11722,10 +11722,6 @@ static int tg3_close(struct net_device *dev)
 
 	tg3_stop(tp);
 
-	/* Clear stats across close / open calls */
-	memset(&tp->net_stats_prev, 0, sizeof(tp->net_stats_prev));
-	memset(&tp->estats_prev, 0, sizeof(tp->estats_prev));
-
 	if (pci_device_is_present(tp->pdev)) {
 		tg3_power_down_prepare(tp);
 
-- 
2.5.0

