Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH net-next v3 0/6] rtnetlink: preparation patches for further rtnl lock pushdown/removal
From: Florian Westphal @ 2017-09-23 19:26 UTC (permalink / raw)
  To: netdev

First patch adds a rudimentary vrf test case.

Remaining patches split large rtnl_fill_ifinfo into smaller chunks
to better see which parts

1. require rtnl
2. do not require it at all
3. rely on rtnl locking now but could be converted

I removed all the ASSERT_RTNL spots that v1 and v2 added,
i will keep that back in my working branch since those
are just 'todo' markers for myself.

Eric Dumazet pointed out that qdiscs are now freed directly without
call_rcu so I dropped the patch that made this assumption from the series.

As the remaining patches did not see major changes vs. v2 I retained
all reviewed/acked-by tags from David Ahern.

^ permalink raw reply

* Re: [PATCH net-next v2 3/6] rtnetlink: add helper to dump qdisc name
From: Florian Westphal @ 2017-09-23 18:38 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Florian Westphal, netdev
In-Reply-To: <1506187888.29839.181.camel@edumazet-glaptop3.roam.corp.google.com>

Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2017-09-22 at 08:10 +0200, Florian Westphal wrote:
> > We can use rcu here to make this safe even if we would not hold rtnl:
> > qdisc_destroy uses call_rcu to free the Qdisc struct.
> 
> 
> Where do you see call_rcu() called from qdisc_destroy() ?
> 
> You missed this commit I guess
> 
> 752fbcc33405d6f8249465e4b2c4e420091bb825
> ("net_sched: no need to free qdisc in RCU callback")

Indeed, I did, patch dropped, thanks for the heads-up.

^ permalink raw reply

* [PATCH net-next] sch_netem: faster rb tree removal
From: Eric Dumazet @ 2017-09-23 18:07 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Stephen Hemminger

From: Eric Dumazet <edumazet@google.com>

While running TCP tests involving netem storing millions of packets,
I had the idea to speed up tfifo_reset() and did experiments.

I tried the rbtree_postorder_for_each_entry_safe() method that is
used in skb_rbtree_purge() but discovered it was slower than the
current tfifo_reset() method.

I measured time taken to release skbs with three occupation levels :
10^4, 10^5 and 10^6 skbs with three methods :

1) (current 'naive' method)

	while ((p = rb_first(&q->t_root))) {
		struct sk_buff *skb = netem_rb_to_skb(p);
 
		rb_erase(p, &q->t_root);
		rtnl_kfree_skbs(skb, skb);
	}

2) Use rb_next() instead of rb_first() in the loop :

	p = rb_first(&q->t_root);
	while (p) {
		struct sk_buff *skb = netem_rb_to_skb(p);

		p = rb_next(p);
		rb_erase(&skb->rbnode, &q->t_root);
		rtnl_kfree_skbs(skb, skb);
	}

3) "optimized" method using rbtree_postorder_for_each_entry_safe()

	struct sk_buff *skb, *next;

	rbtree_postorder_for_each_entry_safe(skb, next,
					     &q->t_root, rbnode) {
               rtnl_kfree_skbs(skb, skb);
	}
	q->t_root = RB_ROOT;

Results :

method_1:while (rb_first()) rb_erase() 10000 skbs in 690378 ns (69 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...}  10000 skbs in 541846 ns (54 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 10000 skbs in 868307 ns (86 ns per skb)

method_1:while (rb_first()) rb_erase() 99996 skbs in 7804021 ns (78 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...}  100000 skbs in 5942456 ns (59 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 100000 skbs in 11584940 ns (115 ns per skb)

method_1:while (rb_first()) rb_erase() 1000000 skbs in 108577838 ns (108 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...}  1000000 skbs in 82619635 ns (82 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 1000000 skbs in 127328743 ns (127 ns per skb)

Method 2) is simply faster, probably because it maintains a smaller
working size set.

Note that this is the method we use in tcp_ofo_queue() already.

I will also change skb_rbtree_purge() in a second patch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/sched/sch_netem.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 063a4bdb9ee6f26b01387959e8f6ccd15ec16191..5a4f1008029068372019a965186e7a3c0a18aac3 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -361,12 +361,13 @@ static psched_time_t packet_len_2_sched_time(unsigned int len, struct netem_sche
 static void tfifo_reset(struct Qdisc *sch)
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
-	struct rb_node *p;
+	struct rb_node *p = rb_first(&q->t_root);
 
-	while ((p = rb_first(&q->t_root))) {
+	while (p) {
 		struct sk_buff *skb = netem_rb_to_skb(p);
 
-		rb_erase(p, &q->t_root);
+		p = rb_next(p);
+		rb_erase(&skb->rbnode, &q->t_root);
 		rtnl_kfree_skbs(skb, skb);
 	}
 }

^ permalink raw reply related

* Re: [PATCH net-next v2 3/6] rtnetlink: add helper to dump qdisc name
From: Eric Dumazet @ 2017-09-23 17:31 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev
In-Reply-To: <20170922061008.14723-4-fw@strlen.de>

On Fri, 2017-09-22 at 08:10 +0200, Florian Westphal wrote:
> We can use rcu here to make this safe even if we would not hold rtnl:
> qdisc_destroy uses call_rcu to free the Qdisc struct.


Where do you see call_rcu() called from qdisc_destroy() ?

You missed this commit I guess

752fbcc33405d6f8249465e4b2c4e420091bb825
("net_sched: no need to free qdisc in RCU callback")

^ permalink raw reply

* Re: [PATCH] net: dsa: avoid null pointer dereference on p->phy
From: Florian Fainelli @ 2017-09-23 17:21 UTC (permalink / raw)
  To: Colin King, Andrew Lunn, Vivien Didelot, David S . Miller, netdev
  Cc: kernel-janitors, linux-kernel
In-Reply-To: <20170923165720.18560-1-colin.king@canonical.com>

On 09/23/2017 09:57 AM, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
> 
> Currently p->phy is being null checked in several places to avoid
> null pointer dereferences on p->phy, however, the final call
> to phy_attached_info on p->phy when p->phy will perform a null
> pointer dereference. Fix this by simply moving the call into
> the previous code block that is only executed if p->phy is
> not null.
> 
> Detected by CoverityScan, CID#1457034 ("Dereference after null check")

The code flow is not exactly easy to read, but I don't see how we can
actually wind up in that situation because we check the return values of
of_phy_connect() and dsa_slave_phy_connect() earlier on.

> 
> Fixes: 2220943a21e2 ("phy: Centralise print about attached phy")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---
>  net/dsa/slave.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 02ace7d462c4..29ab4e98639b 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -1115,10 +1115,9 @@ static int dsa_slave_phy_setup(struct net_device *slave_dev)
>  				of_phy_deregister_fixed_link(port_dn);
>  			return ret;
>  		}
> +		phy_attached_info(p->phy);
>  	}
>  
> -	phy_attached_info(p->phy);
> -
>  	return 0;
>  }
>  
> 

-- 
Florian

^ permalink raw reply

* Re: [PATCH net-next v2 6/6] rtnetlink: rtnl_have_link_slave_info doesn't need rtnl
From: David Ahern @ 2017-09-23 17:17 UTC (permalink / raw)
  To: Florian Westphal, netdev
In-Reply-To: <20170922061008.14723-7-fw@strlen.de>

On 9/22/17 12:10 AM, Florian Westphal wrote:
> Switch it to rcu.
> 
> rtnl_link_slave_info_fill on to other hand does need rtnl mutex for now so
> add an annotation to its caller as a reminder.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  Change since v1:
>   - don't add ASSERT_RTNL to rtnl_link_slave_info_fill and
>   rtnl_link_info_fill they are only called via rtnl_link_fill.
> 
>  net/core/rtnetlink.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 590823f70cc3..e6f9e36d9d5a 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -522,11 +522,15 @@ static size_t rtnl_link_get_af_size(const struct net_device *dev,
>  static bool rtnl_have_link_slave_info(const struct net_device *dev)
>  {
>  	struct net_device *master_dev;
> +	bool ret = false;
>  
> -	master_dev = netdev_master_upper_dev_get((struct net_device *) dev);
> +	rcu_read_lock();
> +
> +	master_dev = netdev_master_upper_dev_get_rcu((struct net_device *)dev);
>  	if (master_dev && master_dev->rtnl_link_ops)
> -		return true;
> -	return false;
> +		ret = true;
> +	rcu_read_unlock();
> +	return ret;
>  }
>  
>  static int rtnl_link_slave_info_fill(struct sk_buff *skb,
> @@ -598,6 +602,8 @@ static int rtnl_link_fill(struct sk_buff *skb, const struct net_device *dev)
>  	struct nlattr *linkinfo;
>  	int err = -EMSGSIZE;
>  
> +	ASSERT_RTNL();

Rather than sprinkling the ASSERT_RTNL why not just add a comment above
the function which is done in many places. Since this is really meant as
your notes as you remove rtnl usage a comment serves the same purpose.

> +
>  	linkinfo = nla_nest_start(skb, IFLA_LINKINFO);
>  	if (linkinfo == NULL)
>  		goto out;
> 


Reviewed-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next v2 4/6] rtnetlink: add helper to dump ifalias
From: David Ahern @ 2017-09-23 17:14 UTC (permalink / raw)
  To: Florian Westphal, netdev
In-Reply-To: <20170922061008.14723-5-fw@strlen.de>

On 9/22/17 12:10 AM, Florian Westphal wrote:
> ifalias is currently protected by rtnl mutex, add assertion
> as a reminder.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  net/core/rtnetlink.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index ad3f27da37a8..42ff582a010e 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -1345,6 +1345,16 @@ static int nla_put_qdisc(struct sk_buff *skb, struct net_device *dev)
>  	return ret;
>  }
>  
> +static noinline int nla_put_ifalias(struct sk_buff *skb, struct net_device *dev)
> +{
> +	ASSERT_RTNL();

The assert is not needed given the code path.

> +
> +	if (dev->ifalias)
> +		return nla_put_string(skb, IFLA_IFALIAS, dev->ifalias);
> +
> +	return 0;
> +}
> +
>  static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
>  			    int type, u32 pid, u32 seq, u32 change,
>  			    unsigned int flags, u32 ext_filter_mask,
> @@ -1386,8 +1396,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
>  	    put_master_ifindex(skb, dev) ||
>  	    nla_put_u8(skb, IFLA_CARRIER, netif_carrier_ok(dev)) ||
>  	    nla_put_qdisc(skb, dev) ||
> -	    (dev->ifalias &&
> -	     nla_put_string(skb, IFLA_IFALIAS, dev->ifalias)) ||
> +	    nla_put_ifalias(skb, dev) ||
>  	    nla_put_u32(skb, IFLA_CARRIER_CHANGES,
>  			atomic_read(&dev->carrier_changes)) ||
>  	    nla_put_u8(skb, IFLA_PROTO_DOWN, dev->proto_down))
> 

Reviewed-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply

* Re: [PATCH] net: dsa: avoid null pointer dereference on p->phy
From: Joe Perches @ 2017-09-23 17:13 UTC (permalink / raw)
  To: Colin King, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S . Miller, netdev
  Cc: kernel-janitors, linux-kernel
In-Reply-To: <20170923165720.18560-1-colin.king@canonical.com>

On Sat, 2017-09-23 at 17:57 +0100, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
> 
> Currently p->phy is being null checked in several places to avoid
> null pointer dereferences on p->phy, however, the final call
> to phy_attached_info on p->phy when p->phy will perform a null
> pointer dereference. Fix this by simply moving the call into
> the previous code block that is only executed if p->phy is
> not null.
> 
> Detected by CoverityScan, CID#1457034 ("Dereference after null check")
> 
> Fixes: 2220943a21e2 ("phy: Centralise print about attached phy")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---
>  net/dsa/slave.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 02ace7d462c4..29ab4e98639b 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -1115,10 +1115,9 @@ static int dsa_slave_phy_setup(struct net_device *slave_dev)
>  				of_phy_deregister_fixed_link(port_dn);
>  			return ret;
>  		}
> +		phy_attached_info(p->phy);
>  	}
>  
> -	phy_attached_info(p->phy);
> -
>  	return 0;
>  }

Huh?  Why move this into the test?

The test of the block above this change is

	if (!p->phy) {

Perhaps this should be
'
	if (p->phy)
		phy_attached_info(p->phy);

or simpler

	} else {
		phy_attached_info(p->phy);
	}

or maybe reverse the block

	if (p->phy) {
		phy_attached_info(p->phy);
	} else {
		ret = dsa_slave_phy_connect(slave_dev, p->dp->index);
		if (ret) {
			netdev_err(slave_dev, "failed to connect to port %d: %d\n",
				   p->dp->index, ret);
			if (phy_is_fixed)
				of_phy_deregister_fixed_link(port_dn);
			return ret;
		}
	}

	return 0;
}

^ permalink raw reply

* Re: [PATCH net-next v2 5/6] rtnetlink: add helpers to dump vf and netnsid information
From: David Ahern @ 2017-09-23 17:12 UTC (permalink / raw)
  To: Florian Westphal, netdev
In-Reply-To: <20170922061008.14723-6-fw@strlen.de>

On 9/22/17 12:10 AM, Florian Westphal wrote:
> +static noinline_for_stack int rtnl_fill_vf(struct sk_buff *skb,
> +					   struct net_device *dev,
> +					   u32 ext_filter_mask)
> +{
> +	struct nlattr *vfinfo;
> +	int i, num_vfs;
> +
> +	if (!dev->dev.parent || ((ext_filter_mask & RTEXT_FILTER_VF) == 0))
> +		return 0;
> +
> +	num_vfs = dev_num_vf(dev->dev.parent);
> +	if (nla_put_u32(skb, IFLA_NUM_VF, num_vfs))
> +		return -EMSGSIZE;
> +
> +	if (!dev->netdev_ops->ndo_get_vf_config)
> +		return 0;
> +
> +	vfinfo = nla_nest_start(skb, IFLA_VFINFO_LIST);
> +	if (!vfinfo)
> +		return -EMSGSIZE;
> +
> +	for (i = 0; i < num_vfs; i++) {
> +		if (rtnl_fill_vfinfo(skb, dev, i, vfinfo))
> +			return -EMSGSIZE;
> +	}
> +
> +	nla_nest_end(skb, vfinfo);
> +	return 0;
> +}
> +
>  static int rtnl_fill_link_ifmap(struct sk_buff *skb, struct net_device *dev)
>  {
>  	struct rtnl_link_ifmap map;
> @@ -1355,6 +1385,23 @@ static noinline int nla_put_ifalias(struct sk_buff *skb, struct net_device *dev)
>  	return 0;
>  }
>  
> +static noinline int rtnl_fill_link_netnsid(struct sk_buff *skb,
> +					   const struct net_device *dev)
> +{
> +	if (dev->rtnl_link_ops && dev->rtnl_link_ops->get_link_net) {
> +		struct net *link_net = dev->rtnl_link_ops->get_link_net(dev);
> +
> +		if (!net_eq(dev_net(dev), link_net)) {
> +			int id = peernet2id_alloc(dev_net(dev), link_net);
> +
> +			if (nla_put_s32(skb, IFLA_LINK_NETNSID, id))
> +				return -EMSGSIZE;
> +		}
> +	}
> +
> +	return 0;
> +}
> +

No reason to combine vf and netnsid into 1 patch; completely separate topics


>  static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
>  			    int type, u32 pid, u32 seq, u32 change,
>  			    unsigned int flags, u32 ext_filter_mask,
> @@ -1428,27 +1475,9 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
>  	if (rtnl_fill_stats(skb, dev))
>  		goto nla_put_failure;
>  
> -	if (dev->dev.parent && (ext_filter_mask & RTEXT_FILTER_VF) &&
> -	    nla_put_u32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent)))
> +	if (rtnl_fill_vf(skb, dev, ext_filter_mask))
>  		goto nla_put_failure;
>  
> -	if (dev->netdev_ops->ndo_get_vf_config && dev->dev.parent &&
> -	    ext_filter_mask & RTEXT_FILTER_VF) {
> -		int i;
> -		struct nlattr *vfinfo;
> -		int num_vfs = dev_num_vf(dev->dev.parent);
> -
> -		vfinfo = nla_nest_start(skb, IFLA_VFINFO_LIST);
> -		if (!vfinfo)
> -			goto nla_put_failure;
> -		for (i = 0; i < num_vfs; i++) {
> -			if (rtnl_fill_vfinfo(skb, dev, i, vfinfo))
> -				goto nla_put_failure;
> -		}
> -
> -		nla_nest_end(skb, vfinfo);
> -	}
> -
>  	if (rtnl_port_fill(skb, dev, ext_filter_mask))
>  		goto nla_put_failure;
>  
> @@ -1460,17 +1489,8 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
>  			goto nla_put_failure;
>  	}
>  
> -	if (dev->rtnl_link_ops &&
> -	    dev->rtnl_link_ops->get_link_net) {
> -		struct net *link_net = dev->rtnl_link_ops->get_link_net(dev);
> -
> -		if (!net_eq(dev_net(dev), link_net)) {
> -			int id = peernet2id_alloc(dev_net(dev), link_net);
> -
> -			if (nla_put_s32(skb, IFLA_LINK_NETNSID, id))
> -				goto nla_put_failure;
> -		}
> -	}
> +	if (rtnl_fill_link_netnsid(skb, dev))
> +		goto nla_put_failure;
>  
>  	if (!(af_spec = nla_nest_start(skb, IFLA_AF_SPEC)))
>  		goto nla_put_failure;
> 

Reviewed-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next v2 1/6] selftests: rtnetlink.sh: add rudimentary vrf test
From: David Ahern @ 2017-09-23 17:07 UTC (permalink / raw)
  To: Florian Westphal, netdev
In-Reply-To: <20170922061008.14723-2-fw@strlen.de>

On 9/22/17 12:10 AM, Florian Westphal wrote:
> +kci_test_vrf()
> +{
> +	vrfname="test-vrf"
> +	ret=0
> +
> +	ip link show type vrf 2>/dev/null
> +	if [ $? -ne 0 ]; then
> +		echo "SKIP: vrf: iproute2 too old"
> +		return 0
> +	fi
> +
> +	ip link add "$vrfname" type vrf table 10
> +	check_err $?
> +	if [ $ret -ne 0 ];then
> +		echo "FAIL: can't add vrf interface, skipping test"
> +		return 0
> +	fi
> +
> +	ip -br link show type vrf | grep -q "$vrfname"
> +	check_err $?
> +	if [ $ret -ne 0 ];then
> +		echo "FAIL: created vrf device not found"
> +		return 1
> +	fi
> +
> +        ip link set dev "$vrfname" up

BTW, if there is a v3 of this set, that ip command is shifted - uses
spaces instead of tab.

^ permalink raw reply

* Re: [PATCH net-next v2 3/6] rtnetlink: add helper to dump qdisc name
From: David Ahern @ 2017-09-23 17:03 UTC (permalink / raw)
  To: Florian Westphal, netdev
In-Reply-To: <20170922061008.14723-4-fw@strlen.de>

On 9/22/17 12:10 AM, Florian Westphal wrote:
> We can use rcu here to make this safe even if we would not hold rtnl:
> qdisc_destroy uses call_rcu to free the Qdisc struct.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  net/core/rtnetlink.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)

Reviewed-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next v2 2/6] rtnetlink: add helper to put master ifindex
From: David Ahern @ 2017-09-23 17:01 UTC (permalink / raw)
  To: Florian Westphal, netdev
In-Reply-To: <20170922061008.14723-3-fw@strlen.de>

On 9/22/17 12:10 AM, Florian Westphal wrote:
> rtnl_fill_ifinfo currently requires caller to hold the rtnl mutex.
> Unfortunately the function is quite large which makes it harder to see
> which spots need the lock, which spots assume it and which ones could do
> without.
> 
> Add helpers to factor out the ifindex dumping.
> 
> One helper can use rcu to remove rtnl dependency.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  net/core/rtnetlink.c | 32 +++++++++++++++++++++++++++-----
>  1 file changed, 27 insertions(+), 5 deletions(-)

Reviewed-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next v2 1/6] selftests: rtnetlink.sh: add rudimentary vrf test
From: David Ahern @ 2017-09-23 16:59 UTC (permalink / raw)
  To: Florian Westphal, netdev; +Cc: David Ahern
In-Reply-To: <20170922061008.14723-2-fw@strlen.de>

On 9/22/17 12:10 AM, Florian Westphal wrote:
> Cc: David Ahern <dsahern@gmail.com>
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  tools/testing/selftests/net/rtnetlink.sh | 42 ++++++++++++++++++++++++++++++++
>  1 file changed, 42 insertions(+)

Acked-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply

* Re: tools: selftests: psock_tpacket: skip un-supported tpacket_v3 test
From: David Miller @ 2017-09-23 16:58 UTC (permalink / raw)
  To: fathi.boudra
  Cc: orson.zhai, shuah, milosz.wasilewski, sumit.semwal, netdev,
	linux-kselftest
In-Reply-To: <CAGNsrLBe2uhX_EhUgK+nKrRYSF54tktDNg5+ejQc6ia0T7F=ew@mail.gmail.com>

From: Fathi Boudra <fathi.boudra@linaro.org>
Date: Sat, 23 Sep 2017 14:27:15 +0300

> On 23 September 2017 at 04:20, David Miller <davem@davemloft.net> wrote:
>> From: Orson Zhai <orson.zhai@linaro.org>
>> Date: Fri, 22 Sep 2017 18:17:17 +0800
>>
>>> The TPACKET_V3 test of PACKET_TX_RING will fail with kernel version
>>> lower than v4.11. Supported code of tx ring was add with commit id
>>> <7f953ab2ba46: af_packet: TX_RING support for TPACKET_V3> at Jan. 3
>>> of 2017.
>>>
>>> So skip this item test instead of reporting failing for old kernels.
>>>
>>> Signed-off-by: Orson Zhai <orson.zhai@linaro.org>
>>
>> The whole point is to make sure the kernel in which the selftest
>> code is present functions properly.
>>
>> There are many tests in selftests that only work on recent kernels.
> 
> For the background, a similar discussion happened on this thread:
> https://lkml.org/lkml/2017/6/22/802
> 
> There's cases where we'd like to run latest selftests on stable kernels.
> You're right, there are many tests in selftests that only work on
> recent kernels and we intend to fix it.
> Skipping gracefully a test because the feature is missing on the
> kernel under test is preferred to fail.

This approach is extremely ill advised.

It is hard enough to get developers to add new tests in the first
place.

Having the extra burdon of needing to make the test work on older
kernels is going to discourage test writing even more.

If you want to "backport" tests, handle them the same way -stable
backports are done.  With extreme care and making sure they get
backported to the kernel they actually would work on.

^ permalink raw reply

* [PATCH] net: dsa: avoid null pointer dereference on p->phy
From: Colin King @ 2017-09-23 16:57 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S . Miller,
	netdev
  Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

Currently p->phy is being null checked in several places to avoid
null pointer dereferences on p->phy, however, the final call
to phy_attached_info on p->phy when p->phy will perform a null
pointer dereference. Fix this by simply moving the call into
the previous code block that is only executed if p->phy is
not null.

Detected by CoverityScan, CID#1457034 ("Dereference after null check")

Fixes: 2220943a21e2 ("phy: Centralise print about attached phy")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 net/dsa/slave.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 02ace7d462c4..29ab4e98639b 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1115,10 +1115,9 @@ static int dsa_slave_phy_setup(struct net_device *slave_dev)
 				of_phy_deregister_fixed_link(port_dn);
 			return ret;
 		}
+		phy_attached_info(p->phy);
 	}
 
-	phy_attached_info(p->phy);
-
 	return 0;
 }
 
-- 
2.14.1

^ permalink raw reply related

* Re: [PATCH net-next] liquidio: pass date and time info to NIC firmware
From: Andrew Lunn @ 2017-09-23 15:16 UTC (permalink / raw)
  To: Felix Manlunas
  Cc: davem, netdev, raghu.vatsavayi, derek.chickles, satananda.burla,
	manish.awasthi, veerasenareddy.burru
In-Reply-To: <20170923003518.GA1583@felix-thinkpad.cavium.com>

On Fri, Sep 22, 2017 at 05:35:18PM -0700, Felix Manlunas wrote:
> From: Veerasenareddy Burru <veerasenareddy.burru@cavium.com>

This is kind of interesting. So you do this once. It could be before
the RTC driver has probed, so it is 1970. It could be before the NTP
daemon has started, and so the host clock will later jump, or stretch
time to gain synchronisation.

It seems like you should be periodically giving the time to the
firmware, not just once.

   Andrew

^ permalink raw reply

* Re: [patch net-next 07/12] mlxsw: spectrum: Add the multicast routing offloading logic
From: Andrew Lunn @ 2017-09-23 14:41 UTC (permalink / raw)
  To: Yotam Gigi; +Cc: Jiri Pirko, netdev, davem, idosch, mlxsw
In-Reply-To: <a4193a32-7ba3-a13f-fa21-da247228be74@mellanox.com>

> > So when i look at these patches, i try to make sure the general use
> > cases works, not just the plain boring Ethernet switch box use cases
> > :-)
> 
> So when doing it, we did think about multi-ASIC situations, so I think it should
> fit :)

Maybe, maybe not. DSA is multi switch. That is what the D means,
Distributed. All switches in a cluster are presented to the software
stack as a single switch.

If your multi-ASIC architecture is the same, a distributed switch,
then you don't cover the general case. If you represent it as two
switches, with frames going between switches going via CPU, then yes,
you probably cover the general case.

    Andrew

^ permalink raw reply

* [PATCH net-next] tun: delete original tun_get() and rename __tun_get() to tun_get()
From: yuan linyu @ 2017-09-23 14:36 UTC (permalink / raw)
  To: netdev; +Cc: David S . Miller, yuan linyu

From: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>

it seems no need to keep tun_get() and __tun_get() at same time.

Signed-off-by: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>
---
 drivers/net/tun.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 3c9985f..206bc6c 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -692,7 +692,7 @@ static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filte
 	return err;
 }
 
-static struct tun_struct *__tun_get(struct tun_file *tfile)
+static struct tun_struct *tun_get(struct tun_file *tfile)
 {
 	struct tun_struct *tun;
 
@@ -705,11 +705,6 @@ static struct tun_struct *__tun_get(struct tun_file *tfile)
 	return tun;
 }
 
-static struct tun_struct *tun_get(struct file *file)
-{
-	return __tun_get(file->private_data);
-}
-
 static void tun_put(struct tun_struct *tun)
 {
 	dev_put(tun->dev);
@@ -1149,7 +1144,7 @@ static void tun_net_init(struct net_device *dev)
 static unsigned int tun_chr_poll(struct file *file, poll_table *wait)
 {
 	struct tun_file *tfile = file->private_data;
-	struct tun_struct *tun = __tun_get(tfile);
+	struct tun_struct *tun = tun_get(tfile);
 	struct sock *sk;
 	unsigned int mask = 0;
 
@@ -1569,8 +1564,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 static ssize_t tun_chr_write_iter(struct kiocb *iocb, struct iov_iter *from)
 {
 	struct file *file = iocb->ki_filp;
-	struct tun_struct *tun = tun_get(file);
 	struct tun_file *tfile = file->private_data;
+	struct tun_struct *tun = tun_get(tfile);
 	ssize_t result;
 
 	if (!tun)
@@ -1754,7 +1749,7 @@ static ssize_t tun_chr_read_iter(struct kiocb *iocb, struct iov_iter *to)
 {
 	struct file *file = iocb->ki_filp;
 	struct tun_file *tfile = file->private_data;
-	struct tun_struct *tun = __tun_get(tfile);
+	struct tun_struct *tun = tun_get(tfile);
 	ssize_t len = iov_iter_count(to), ret;
 
 	if (!tun)
@@ -1831,7 +1826,7 @@ static int tun_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
 {
 	int ret;
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
-	struct tun_struct *tun = __tun_get(tfile);
+	struct tun_struct *tun = tun_get(tfile);
 
 	if (!tun)
 		return -EBADFD;
@@ -1847,7 +1842,7 @@ static int tun_recvmsg(struct socket *sock, struct msghdr *m, size_t total_len,
 		       int flags)
 {
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
-	struct tun_struct *tun = __tun_get(tfile);
+	struct tun_struct *tun = tun_get(tfile);
 	int ret;
 
 	if (!tun)
@@ -1879,7 +1874,7 @@ static int tun_peek_len(struct socket *sock)
 	struct tun_struct *tun;
 	int ret = 0;
 
-	tun = __tun_get(tfile);
+	tun = tun_get(tfile);
 	if (!tun)
 		return 0;
 
@@ -2265,7 +2260,7 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 	ret = 0;
 	rtnl_lock();
 
-	tun = __tun_get(tfile);
+	tun = tun_get(tfile);
 	if (cmd == TUNSETIFF) {
 		ret = -EEXIST;
 		if (tun)
@@ -2612,15 +2607,16 @@ static int tun_chr_close(struct inode *inode, struct file *file)
 }
 
 #ifdef CONFIG_PROC_FS
-static void tun_chr_show_fdinfo(struct seq_file *m, struct file *f)
+static void tun_chr_show_fdinfo(struct seq_file *m, struct file *file)
 {
+	struct tun_file *tfile = file->private_data;
 	struct tun_struct *tun;
 	struct ifreq ifr;
 
 	memset(&ifr, 0, sizeof(ifr));
 
 	rtnl_lock();
-	tun = tun_get(f);
+	tun = tun_get(tfile);
 	if (tun)
 		tun_get_iff(current->nsproxy->net_ns, tun, &ifr);
 	rtnl_unlock();
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH net-next 2/2] net: dsa: lan9303: Add basic offloading of unicast traffic
From: Andrew Lunn @ 2017-09-23 14:31 UTC (permalink / raw)
  To: Egil Hjelmeland; +Cc: vivien.didelot, f.fainelli, netdev, linux-kernel
In-Reply-To: <2c5fec6d-18b3-97e9-dd64-85560382d2f7@egil-hjelmeland.no>

> The point is: Once both external ports are in "forwarding", I see no way
> to prevent traffic flowing directly between the external ports.

Generally, there are port vectors. Port X can send frames only to Port
Y.

If you don't have that, there are possibilities with VLANs. Each port
is given a unique VLAN. All incoming untagged traffic is tagged with
the VLAN. You just need to keep the VLAN separated and add/remove the
VLAN tag in the dsa tag driver.

     Andrew

^ permalink raw reply

* Re: [PATCH net] sctp: Fix a big endian bug in sctp_for_each_transport()
From: Xin Long @ 2017-09-23 11:37 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Vlad Yasevich, Neil Horman, David S. Miller, linux-sctp,
	network dev, kernel-janitors, Marcelo Ricardo Leitner
In-Reply-To: <20170923102537.wra55nqyqsfoduxo@mwanda>

On Sat, Sep 23, 2017 at 6:25 PM, Dan Carpenter <dan.carpenter@oracle.com> wrote:
> Fundamentally, the "pos" pointer points to "cb->args[2]" which is a long.
> In the current code, we only use the high 32 bits and cast it as an
> int.  That works on little endian systems but will fail on big endian
> systems.
>
> Fixes: d25adbeb0cdb ("sctp: fix an use-after-free issue in sctp_sock_dump")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
>
> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
> index d7d8cba01469..7d87439f299a 100644
> --- a/include/net/sctp/sctp.h
> +++ b/include/net/sctp/sctp.h
> @@ -121,14 +121,14 @@ void sctp_transport_walk_stop(struct rhashtable_iter *iter);
>  struct sctp_transport *sctp_transport_get_next(struct net *net,
>                         struct rhashtable_iter *iter);
>  struct sctp_transport *sctp_transport_get_idx(struct net *net,
> -                       struct rhashtable_iter *iter, int pos);
> +                       struct rhashtable_iter *iter, long pos);
>  int sctp_transport_lookup_process(int (*cb)(struct sctp_transport *, void *),
>                                   struct net *net,
>                                   const union sctp_addr *laddr,
>                                   const union sctp_addr *paddr, void *p);
>  int sctp_for_each_transport(int (*cb)(struct sctp_transport *, void *),
>                             int (*cb_done)(struct sctp_transport *, void *),
> -                           struct net *net, int *pos, void *p);
> +                           struct net *net, long *pos, void *p);
>  int sctp_for_each_endpoint(int (*cb)(struct sctp_endpoint *, void *), void *p);
>  int sctp_get_sctp_info(struct sock *sk, struct sctp_association *asoc,
>                        struct sctp_info *info);
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index d4730ada7f32..0222743b3aa8 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -4603,7 +4603,7 @@ struct sctp_transport *sctp_transport_get_next(struct net *net,
>
>  struct sctp_transport *sctp_transport_get_idx(struct net *net,
>                                               struct rhashtable_iter *iter,
> -                                             int pos)
> +                                             long pos)
>  {
>         void *obj = SEQ_START_TOKEN;
>
> @@ -4659,7 +4659,7 @@ EXPORT_SYMBOL_GPL(sctp_transport_lookup_process);
>
>  int sctp_for_each_transport(int (*cb)(struct sctp_transport *, void *),
>                             int (*cb_done)(struct sctp_transport *, void *),
> -                           struct net *net, int *pos, void *p) {
> +                           struct net *net, long *pos, void *p) {
>         struct rhashtable_iter hti;
>         struct sctp_transport *tsp;
>         int ret;
> diff --git a/net/sctp/sctp_diag.c b/net/sctp/sctp_diag.c
> index 22ed01a76b19..e9d5405aa6ac 100644
> --- a/net/sctp/sctp_diag.c
> +++ b/net/sctp/sctp_diag.c
> @@ -493,7 +493,7 @@ static void sctp_diag_dump(struct sk_buff *skb, struct netlink_callback *cb,
>                 goto done;
>
>         sctp_for_each_transport(sctp_sock_filter, sctp_sock_dump,
> -                               net, (int *)&cb->args[2], &commp);
> +                               net, &cb->args[2], &commp);
>
>  done:
>         cb->args[1] = cb->args[4];
Better not to touch sctp_for_each_transport and sctp_transport_get_idx
which are supposed to be also used elsewhere as common apis.

Can you pls try to fix this in sctp_diag_dump(), like:
@@ -463,6 +463,7 @@ static void sctp_diag_dump(struct sk_buff *skb,
struct netlink_callback *cb,
                .r = r,
                .net_admin = netlink_net_capable(cb->skb, CAP_NET_ADMIN),
        };
+       int pos = cb->args[2];

        /* eps hashtable dumps
         * args:
@@ -493,7 +494,8 @@ static void sctp_diag_dump(struct sk_buff *skb,
struct netlink_callback *cb,
                goto done;

        sctp_for_each_transport(sctp_sock_filter, sctp_sock_dump,
-                               net, (int *)&cb->args[2], &commp);
+                               net, &pos, &commp);
+       cb->args[2] = pos;

^ permalink raw reply

* Re: tools: selftests: psock_tpacket: skip un-supported tpacket_v3 test
From: Fathi Boudra @ 2017-09-23 11:27 UTC (permalink / raw)
  To: David Miller
  Cc: orson.zhai, Shuah Khan, Milosz Wasilewski,
	sumit.semwal@linaro.org, netdev, linux-kselftest
In-Reply-To: <20170922.182017.17747017677768533.davem@davemloft.net>

On 23 September 2017 at 04:20, David Miller <davem@davemloft.net> wrote:
> From: Orson Zhai <orson.zhai@linaro.org>
> Date: Fri, 22 Sep 2017 18:17:17 +0800
>
>> The TPACKET_V3 test of PACKET_TX_RING will fail with kernel version
>> lower than v4.11. Supported code of tx ring was add with commit id
>> <7f953ab2ba46: af_packet: TX_RING support for TPACKET_V3> at Jan. 3
>> of 2017.
>>
>> So skip this item test instead of reporting failing for old kernels.
>>
>> Signed-off-by: Orson Zhai <orson.zhai@linaro.org>
>
> The whole point is to make sure the kernel in which the selftest
> code is present functions properly.
>
> There are many tests in selftests that only work on recent kernels.

For the background, a similar discussion happened on this thread:
https://lkml.org/lkml/2017/6/22/802

There's cases where we'd like to run latest selftests on stable kernels.
You're right, there are many tests in selftests that only work on
recent kernels and we intend to fix it.
Skipping gracefully a test because the feature is missing on the
kernel under test is preferred to fail.

> I'm not applying this, sorry.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kselftest" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH net-next] cxgb4: do DCB state reset in couple of places
From: Ganesh Goudar @ 2017-09-23 10:37 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, indranil, venkatesh, Ganesh Goudar, Casey Leedom

reset the driver's DCB state in couple of places
where it was missing.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c  | 15 +++++++++++----
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h  |  1 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 10 ++++++++--
 3 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
index 6ee2ed3..4e7f72b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
@@ -40,8 +40,7 @@ static inline bool cxgb4_dcb_state_synced(enum cxgb4_dcb_state state)
 		return false;
 }
 
-/* Initialize a port's Data Center Bridging state.  Typically used after a
- * Link Down event.
+/* Initialize a port's Data Center Bridging state.
  */
 void cxgb4_dcb_state_init(struct net_device *dev)
 {
@@ -106,6 +105,15 @@ static void cxgb4_dcb_cleanup_apps(struct net_device *dev)
 	}
 }
 
+/* Reset a port's Data Center Bridging state.  Typically used after a
+ * Link Down event.
+ */
+void cxgb4_dcb_reset(struct net_device *dev)
+{
+	cxgb4_dcb_cleanup_apps(dev);
+	cxgb4_dcb_state_init(dev);
+}
+
 /* Finite State machine for Data Center Bridging.
  */
 void cxgb4_dcb_state_fsm(struct net_device *dev,
@@ -194,8 +202,7 @@ void cxgb4_dcb_state_fsm(struct net_device *dev,
 			 * state.  We need to reset back to a ground state
 			 * of incomplete.
 			 */
-			cxgb4_dcb_cleanup_apps(dev);
-			cxgb4_dcb_state_init(dev);
+			cxgb4_dcb_reset(dev);
 			dcb->state = CXGB4_DCB_STATE_FW_INCOMPLETE;
 			dcb->supported = CXGB4_DCBX_FW_SUPPORT;
 			linkwatch_fire_event(dev);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h
index ccf24d3..02040b9 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h
@@ -131,6 +131,7 @@ struct port_dcb_info {
 
 void cxgb4_dcb_state_init(struct net_device *);
 void cxgb4_dcb_version_init(struct net_device *);
+void cxgb4_dcb_reset(struct net_device *dev);
 void cxgb4_dcb_state_fsm(struct net_device *, enum cxgb4_dcb_state_input);
 void cxgb4_dcb_handle_fw_update(struct adapter *, const struct fw_port_cmd *);
 void cxgb4_dcb_set_caps(struct adapter *, const struct fw_port_cmd *);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index aa93ae9..13b636b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -281,7 +281,7 @@ void t4_os_link_changed(struct adapter *adapter, int port_id, int link_stat)
 		else {
 #ifdef CONFIG_CHELSIO_T4_DCB
 			if (cxgb4_dcb_enabled(dev)) {
-				cxgb4_dcb_state_init(dev);
+				cxgb4_dcb_reset(dev);
 				dcb_tx_queue_prio_enable(dev, false);
 			}
 #endif /* CONFIG_CHELSIO_T4_DCB */
@@ -2304,10 +2304,16 @@ static int cxgb_close(struct net_device *dev)
 {
 	struct port_info *pi = netdev_priv(dev);
 	struct adapter *adapter = pi->adapter;
+	int ret;
 
 	netif_tx_stop_all_queues(dev);
 	netif_carrier_off(dev);
-	return t4_enable_vi(adapter, adapter->pf, pi->viid, false, false);
+	ret = t4_enable_vi(adapter, adapter->pf, pi->viid, false, false);
+#ifdef CONFIG_CHELSIO_T4_DCB
+	cxgb4_dcb_reset(dev);
+	dcb_tx_queue_prio_enable(dev, false);
+#endif
+	return ret;
 }
 
 int cxgb4_create_server_filter(const struct net_device *dev, unsigned int stid,
-- 
2.1.0

^ permalink raw reply related

* [PATCH net] sctp: Fix a big endian bug in sctp_for_each_transport()
From: Dan Carpenter @ 2017-09-23 10:25 UTC (permalink / raw)
  To: Vlad Yasevich, Xin Long
  Cc: Neil Horman, David S. Miller, linux-sctp, netdev, kernel-janitors

Fundamentally, the "pos" pointer points to "cb->args[2]" which is a long.
In the current code, we only use the high 32 bits and cast it as an
int.  That works on little endian systems but will fail on big endian
systems.

Fixes: d25adbeb0cdb ("sctp: fix an use-after-free issue in sctp_sock_dump")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index d7d8cba01469..7d87439f299a 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -121,14 +121,14 @@ void sctp_transport_walk_stop(struct rhashtable_iter *iter);
 struct sctp_transport *sctp_transport_get_next(struct net *net,
 			struct rhashtable_iter *iter);
 struct sctp_transport *sctp_transport_get_idx(struct net *net,
-			struct rhashtable_iter *iter, int pos);
+			struct rhashtable_iter *iter, long pos);
 int sctp_transport_lookup_process(int (*cb)(struct sctp_transport *, void *),
 				  struct net *net,
 				  const union sctp_addr *laddr,
 				  const union sctp_addr *paddr, void *p);
 int sctp_for_each_transport(int (*cb)(struct sctp_transport *, void *),
 			    int (*cb_done)(struct sctp_transport *, void *),
-			    struct net *net, int *pos, void *p);
+			    struct net *net, long *pos, void *p);
 int sctp_for_each_endpoint(int (*cb)(struct sctp_endpoint *, void *), void *p);
 int sctp_get_sctp_info(struct sock *sk, struct sctp_association *asoc,
 		       struct sctp_info *info);
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index d4730ada7f32..0222743b3aa8 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -4603,7 +4603,7 @@ struct sctp_transport *sctp_transport_get_next(struct net *net,
 
 struct sctp_transport *sctp_transport_get_idx(struct net *net,
 					      struct rhashtable_iter *iter,
-					      int pos)
+					      long pos)
 {
 	void *obj = SEQ_START_TOKEN;
 
@@ -4659,7 +4659,7 @@ EXPORT_SYMBOL_GPL(sctp_transport_lookup_process);
 
 int sctp_for_each_transport(int (*cb)(struct sctp_transport *, void *),
 			    int (*cb_done)(struct sctp_transport *, void *),
-			    struct net *net, int *pos, void *p) {
+			    struct net *net, long *pos, void *p) {
 	struct rhashtable_iter hti;
 	struct sctp_transport *tsp;
 	int ret;
diff --git a/net/sctp/sctp_diag.c b/net/sctp/sctp_diag.c
index 22ed01a76b19..e9d5405aa6ac 100644
--- a/net/sctp/sctp_diag.c
+++ b/net/sctp/sctp_diag.c
@@ -493,7 +493,7 @@ static void sctp_diag_dump(struct sk_buff *skb, struct netlink_callback *cb,
 		goto done;
 
 	sctp_for_each_transport(sctp_sock_filter, sctp_sock_dump,
-				net, (int *)&cb->args[2], &commp);
+				net, &cb->args[2], &commp);
 
 done:
 	cb->args[1] = cb->args[4];

^ permalink raw reply related

* Re: [PATCH net-next 2/2] net: dsa: lan9303: Add basic offloading of unicast traffic
From: Egil Hjelmeland @ 2017-09-23  9:58 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: vivien.didelot, f.fainelli, netdev, linux-kernel
In-Reply-To: <20170922200810.GJ3470@lunn.ch>

Den 22. sep. 2017 22:08, skrev Andrew Lunn:
>>> I'm wondering how this is supposed to work. Please add a good comment
>>> here, since the hardware is forcing you to do something odd.
>>>
>>> Maybe it would be a good idea to save the STP state in chip.  And then
>>> when chip->is_bridged is set true, change the state in the hardware to
>>> the saved value?
>>>
>>> What happens when port 0 is added to the bridge, there is then a
>>> minute pause and then port 1 is added? I would expect that as soon as
>>> port 0 is added, the STP state machine for port 0 will start and move
>>> into listening and then forwarding. Due to hardware limitations it
>>> looks like you cannot do this. So what state is the hardware
>>> effectively in? Blocking? Forwarding?
>>>
>>> Then port 1 is added. You can then can respect the states. port 1 will
>>> do blocking->listening->forwarding, but what about port 0? The calls
>>> won't get repeated? How does it transition to forwarding?
>>>
>>>    Andrew
>>>
>>
>> I see your point with the "minute pause" argument. Although a bit
>> contrived use case, it is easy to fix by caching the STP state, as
>> you suggest. So I can do that.
> 
> I don't think it is contrived. I've done bridge configuration by hand
> for testing purposes. I've also set the forwarding delay to very small
> values, so there is a clear race condition here.
> 
>> How does other DSA HW chips handle port separation? Knowing that
>> could perhaps help me know what to look for.
> 
> They have better hardware :-)
> 
> Generally each port is totally independent. You can change the STP
> state per port without restrictions.
> 
We can indeed change the STP state per lan9303 port "without
restrictions".

The point is: Once both external ports are in "forwarding", I see no way
to prevent traffic flowing directly between the external ports.


>        Andrew
> 

Egil

^ permalink raw reply

* Re: [patch net-next 07/12] mlxsw: spectrum: Add the multicast routing offloading logic
From: Yotam Gigi @ 2017-09-23  9:32 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Jiri Pirko, netdev, davem, idosch, mlxsw
In-Reply-To: <20170922132152.GB31634@lunn.ch>

On 09/22/2017 04:21 PM, Andrew Lunn wrote:
> On Fri, Sep 22, 2017 at 11:36:59AM +0300, Yotam Gigi wrote:
>> On 09/21/2017 06:26 PM, Andrew Lunn wrote:
>>>> +static void mlxsw_sp_mr_route_stats_update(struct mlxsw_sp *mlxsw_sp,
>>>> +					   struct mlxsw_sp_mr_route *mr_route)
>>>> +{
>>>> +	struct mlxsw_sp_mr *mr = mlxsw_sp->mr;
>>>> +	u64 packets, bytes;
>>>> +
>>>> +	if (mr_route->route_action == MLXSW_SP_MR_ROUTE_ACTION_TRAP)
>>>> +		return;
>>>> +
>>>> +	mr->mr_ops->route_stats(mlxsw_sp, mr_route->route_priv, &packets,
>>>> +				&bytes);
>>>> +
>>>> +	switch (mr_route->mr_table->proto) {
>>>> +	case MLXSW_SP_L3_PROTO_IPV4:
>>>> +		mr_route->mfc4->mfc_un.res.pkt = packets;
>>>> +		mr_route->mfc4->mfc_un.res.bytes = bytes;
>>> What about wrong_if and lastuse? 
> Hi Yotam
>
>> wronf_if is updated by ipmr as it is trapped to the CPU.
> Great.
>
>> We did not address lastuse currently, though it can be easily
>> addressed here.
> Please do. I've written multicast routing daemons, where i use it to
> flush out MFCs which are no longer in use. Having it always 0 is going
> to break daemons.

I will. Thanks for the feedback!

>  
>>> Is an mfc with iif on the host, not the switch, not offloaded?
>>
>> I am not sure I followed. What do you mean MFC with iif on the host? you mean
>> MFC with iif that is an external NIC which is not part of the spectrum ASIC?
> Yes. We probably have different perspectives on the world. To
> Mellanox, everything is a switch in a box. In the DSA world, we tend
> to think of having a general purpose machine which also has a switch
> connected. Think of a wireless access point, set top box, passenger
> entertainment system. We have applications on the general purpose
> computer, we have wifi interfaces, cable modems, etc. Think about all
> the different packages in LEDE. We might have a multicast video
> stream, coming from the cable modem being sent over ports of the
> switch to clients.
>
> So when i look at these patches, i try to make sure the general use
> cases works, not just the plain boring Ethernet switch box use cases
> :-)

So when doing it, we did think about multi-ASIC situations, so I think it should
fit :)

>
>> in this case, the route will not be offloaded and all traffic will
>> pass in slowpath.
> O.K. I was just thinking if those counters need to be +=, not =.  But
> either the iif is on the host, or it is in the switch. It cannot be
> both. So = is O.K.
>
> Thanks
> 	Andrew

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox