Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v3 net-next 1/2] net: ethernet: slicoss: add slicoss gigabit ethernet driver
From: Markus Böhme @ 2016-11-29 12:15 UTC (permalink / raw)
  To: Lino Sanfilippo, davem, charrer, liodot, gregkh, andrew
  Cc: devel, netdev, linux-kernel
In-Reply-To: <9a789bda-7d98-fa2c-8759-e046c6dd5145@gmx.de>

On 11/28/2016 09:46 PM, Lino Sanfilippo wrote:
> Hi Markus,
> 
> On 27.11.2016 18:59, Markus Böhme wrote:
>> Hello Lino,
>>
>> just some things barely worth mentioning:
>>
> 
>>
>> I found a bunch of unused #defines in slic.h. I cannot judge if they are
>> worth keeping:
>>
>> 	SLIC_VRHSTATB_LONGE
>> 	SLIC_VRHSTATB_PREA
>> 	SLIC_ISR_IO
>> 	SLIC_ISR_PING_MASK
>> 	SLIC_GIG_SPEED_MASK
>> 	SLIC_GMCR_RESET
>> 	SLIC_XCR_RESET
>> 	SLIC_XCR_XMTEN
>> 	SLIC_XCR_PAUSEEN
>> 	SLIC_XCR_LOADRNG
>> 	SLIC_REG_DBAR
>> 	SLIC_REG_PING
>> 	SLIC_REG_DUMP_CMD
>> 	SLIC_REG_DUMP_DATA
>> 	SLIC_REG_WRHOSTID
>> 	SLIC_REG_LOW_POWER
>> 	SLIC_REG_RESET_IFACE
>> 	SLIC_REG_ADDR_UPPER
>> 	SLIC_REG_HBAR64
>> 	SLIC_REG_DBAR64
>> 	SLIC_REG_CBAR64
>> 	SLIC_REG_RBAR64
>> 	SLIC_REG_WRVLANID
>> 	SLIC_REG_READ_XF_INFO
>> 	SLIC_REG_WRITE_XF_INFO
>> 	SLIC_REG_TICKS_PER_SEC
>>
>> These device IDs are not used, either, but maybe it's good to keep them
>> for documentation purposes:
>>
>> 	PCI_SUBDEVICE_ID_ALACRITECH_1000X1_2
>> 	PCI_SUBDEVICE_ID_ALACRITECH_SES1001T
>> 	PCI_SUBDEVICE_ID_ALACRITECH_SEN2002XT
>> 	PCI_SUBDEVICE_ID_ALACRITECH_SEN2001XT
>> 	PCI_SUBDEVICE_ID_ALACRITECH_SEN2104ET
>> 	PCI_SUBDEVICE_ID_ALACRITECH_SEN2102ET
>>
> 
> I left these defines in for both documentation and to avoid gaps in
> register ranges. I would like to keep this as it is.

Seems reasonable.

[...]

>>> +static int slic_init_tx_queue(struct slic_device *sdev)
>>> +{
>>> +	struct slic_tx_queue *txq = &sdev->txq;
>>> +	struct slic_tx_buffer *buff;
>>> +	struct slic_tx_desc *desc;
>>> +	int err;
>>> +	int i;
>>
>> You could make i unsigned...
>>
> 
>>> +
>>> +	txq->len = SLIC_NUM_TX_DESCS;
>>> +	txq->put_idx = 0;
>>> +	txq->done_idx = 0;
>>> +
>>> +	txq->txbuffs = kcalloc(txq->len, sizeof(*buff), GFP_KERNEL);
>>> +	if (!txq->txbuffs)
>>> +		return -ENOMEM;
>>> +
>>> +	txq->dma_pool = dma_pool_create("slic_pool", &sdev->pdev->dev,
>>> +					sizeof(*desc), SLIC_TX_DESC_ALIGN,
>>> +					4096);
>>> +	if (!txq->dma_pool) {
>>> +		err = -ENOMEM;
>>> +		netdev_err(sdev->netdev, "failed to create dma pool\n");
>>> +		goto free_buffs;
>>> +	}
>>> +
>>> +	for (i = 0; i < txq->len; i++) {
>>
>> ...to fix a signed/unsigned comparison warning here, but...
>>
>>> +		buff = &txq->txbuffs[i];
>>> +		desc = dma_pool_zalloc(txq->dma_pool, GFP_KERNEL,
>>> +				       &buff->desc_paddr);
>>> +		if (!desc) {
>>> +			netdev_err(sdev->netdev,
>>> +				   "failed to alloc pool chunk (%i)\n", i);
>>> +			err = -ENOMEM;
>>> +			goto free_descs;
>>> +		}
>>> +
>>> +		desc->hnd = cpu_to_le32((u32)(i + 1));
>>> +		desc->cmd = SLIC_CMD_XMT_REQ;
>>> +		desc->flags = 0;
>>> +		desc->type = cpu_to_le32(SLIC_CMD_TYPE_DUMB);
>>> +		buff->desc = desc;
>>> +	}
>>> +
>>> +	return 0;
>>> +
>>> +free_descs:
>>> +	while (i--) {
>>
>> ...this would require reworking this logic to prevent an endless loop,
>> so probably not worth bothering, considering that txq->len is well
>> within the positive signed range.
> 
> AFAICS the logic does not have to be changed. The while loop will also work
> fine if "i" is unsigned.
> 

My bad! Of course you are right.


Regards,
Markus

^ permalink raw reply

* Re: [PATCH net 3/5] l2tp: fix racy socket lookup in l2tp_ip and l2tp_ip6 bind()
From: Guillaume Nault @ 2016-11-29 12:12 UTC (permalink / raw)
  To: kbuild test robot; +Cc: kbuild-all, netdev, James Chapman, Chris Elston
In-Reply-To: <201611290515.BXajf2gX%fengguang.wu@intel.com>

On Tue, Nov 29, 2016 at 05:10:40AM +0800, kbuild test robot wrote:
> Hi Guillaume,
> 
> [auto build test WARNING on net/master]
> 
> url:    https://github.com/0day-ci/linux/commits/Guillaume-Nault/l2tp-fixes-for-l2tp_ip-and-l2tp_ip6-socket-handling/20161129-043208
> config: x86_64-rhel (attached as .config)
> compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=x86_64 
> 
> Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
> http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings
> 
> All warnings (new ones prefixed by >>):
> 
>    net/l2tp/l2tp_ip.c: In function 'l2tp_ip_bind':
> >> net/l2tp/l2tp_ip.c:299:9: warning: 'ret' may be used uninitialized in this function [-Wmaybe-uninitialized]
>      return ret;
>             ^~~
> 
I should have seen this earlier, sorry.
This is now fixed in v2.

^ permalink raw reply

* [PATCH v2 net 2/5] l2tp: hold socket before dropping lock in l2tp_ip{,6}_recv()
From: Guillaume Nault @ 2016-11-29 12:09 UTC (permalink / raw)
  To: netdev; +Cc: James Chapman, Chris Elston
In-Reply-To: <cover.1480419658.git.g.nault@alphalink.fr>

Socket must be held while under the protection of the l2tp lock; there
is no guarantee that sk remains valid after the read_unlock_bh() call.

Same issue for l2tp_ip and l2tp_ip6.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
---
 net/l2tp/l2tp_ip.c  | 11 ++++++-----
 net/l2tp/l2tp_ip6.c | 11 ++++++-----
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 1f57094..4d1c942 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -183,14 +183,15 @@ static int l2tp_ip_recv(struct sk_buff *skb)
 
 		read_lock_bh(&l2tp_ip_lock);
 		sk = __l2tp_ip_bind_lookup(net, iph->daddr, 0, tunnel_id);
+		if (!sk) {
+			read_unlock_bh(&l2tp_ip_lock);
+			goto discard;
+		}
+
+		sock_hold(sk);
 		read_unlock_bh(&l2tp_ip_lock);
 	}
 
-	if (sk == NULL)
-		goto discard;
-
-	sock_hold(sk);
-
 	if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
 		goto discard_put;
 
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index af9abff..e3fc778 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -198,14 +198,15 @@ static int l2tp_ip6_recv(struct sk_buff *skb)
 		read_lock_bh(&l2tp_ip6_lock);
 		sk = __l2tp_ip6_bind_lookup(net, &iph->daddr,
 					    0, tunnel_id);
+		if (!sk) {
+			read_unlock_bh(&l2tp_ip6_lock);
+			goto discard;
+		}
+
+		sock_hold(sk);
 		read_unlock_bh(&l2tp_ip6_lock);
 	}
 
-	if (sk == NULL)
-		goto discard;
-
-	sock_hold(sk);
-
 	if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb))
 		goto discard_put;
 
-- 
2.10.2

^ permalink raw reply related

* [PATCH v2 net 5/5] l2tp: fix address test in __l2tp_ip6_bind_lookup()
From: Guillaume Nault @ 2016-11-29 12:09 UTC (permalink / raw)
  To: netdev; +Cc: James Chapman, Chris Elston
In-Reply-To: <cover.1480419658.git.g.nault@alphalink.fr>

The '!(addr && ipv6_addr_equal(addr, laddr))' part of the conditional
matches if addr is NULL or if addr != laddr.
But the intend of __l2tp_ip6_bind_lookup() is to find a sockets with
the same address, so the ipv6_addr_equal() condition needs to be
inverted.

For better clarity and consistency with the rest of the expression, the
(!X || X == Y) notation is used instead of !(X && X != Y).

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
---
 net/l2tp/l2tp_ip6.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 4a86440..aa821cb 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -72,7 +72,7 @@ static struct sock *__l2tp_ip6_bind_lookup(struct net *net,
 
 		if ((l2tp->conn_id == tunnel_id) &&
 		    net_eq(sock_net(sk), net) &&
-		    !(addr && ipv6_addr_equal(addr, laddr)) &&
+		    (!addr || ipv6_addr_equal(addr, laddr)) &&
 		    (!sk->sk_bound_dev_if || !dif ||
 		     sk->sk_bound_dev_if == dif))
 			goto found;
-- 
2.10.2

^ permalink raw reply related

* [PATCH v2 net 3/5] l2tp: fix racy socket lookup in l2tp_ip and l2tp_ip6 bind()
From: Guillaume Nault @ 2016-11-29 12:09 UTC (permalink / raw)
  To: netdev; +Cc: James Chapman, Chris Elston
In-Reply-To: <cover.1480419658.git.g.nault@alphalink.fr>

It's not enough to check for sockets bound to same address at the
beginning of l2tp_ip{,6}_bind(): even if no socket is found at that
time, a socket with the same address could be bound before we take
the l2tp lock again.

This patch moves the lookup right before inserting the new socket, so
that no change can ever happen to the list between address lookup and
socket insertion.

Care is taken to avoid side effects on the socket in case of failure.
That is, modifications of the socket are done after the lookup, when
binding is guaranteed to succeed, and before releasing the l2tp lock,
so that concurrent lookups will always see fully initialised sockets.

For l2tp_ip, 'ret' is set to -EINVAL before checking the SOCK_ZAPPED
bit. Error code was mistakenly set to -EADDRINUSE on error by commit
32c231164b76 ("l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()").
Using -EINVAL restores original behaviour.

For l2tp_ip6, the lookup is now always done with the correct bound
device. Before this patch, when binding to a link-local address, the
lookup was done with the original sk->sk_bound_dev_if, which was later
overwritten with addr->l2tp_scope_id. Lookup is now performed with the
final sk->sk_bound_dev_if value.

Finally, the (addr_len >= sizeof(struct sockaddr_in6)) check has been
dropped: addr is a sockaddr_l2tpip6 not sockaddr_in6 and addr_len has
already been checked at this point (this part of the code seems to have
been copy-pasted from net/ipv6/raw.c).

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
---
 net/l2tp/l2tp_ip.c  | 27 ++++++++++++---------------
 net/l2tp/l2tp_ip6.c | 43 ++++++++++++++++++++-----------------------
 2 files changed, 32 insertions(+), 38 deletions(-)

diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 4d1c942..b517c33 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -257,15 +257,9 @@ static int l2tp_ip_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	if (addr->l2tp_family != AF_INET)
 		return -EINVAL;
 
-	ret = -EADDRINUSE;
-	read_lock_bh(&l2tp_ip_lock);
-	if (__l2tp_ip_bind_lookup(net, addr->l2tp_addr.s_addr,
-				  sk->sk_bound_dev_if, addr->l2tp_conn_id))
-		goto out_in_use;
-
-	read_unlock_bh(&l2tp_ip_lock);
-
 	lock_sock(sk);
+
+	ret = -EINVAL;
 	if (!sock_flag(sk, SOCK_ZAPPED))
 		goto out;
 
@@ -282,14 +276,22 @@ static int l2tp_ip_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 		inet->inet_rcv_saddr = inet->inet_saddr = addr->l2tp_addr.s_addr;
 	if (chk_addr_ret == RTN_MULTICAST || chk_addr_ret == RTN_BROADCAST)
 		inet->inet_saddr = 0;  /* Use device */
-	sk_dst_reset(sk);
 
+	write_lock_bh(&l2tp_ip_lock);
+	if (__l2tp_ip_bind_lookup(net, addr->l2tp_addr.s_addr,
+				  sk->sk_bound_dev_if, addr->l2tp_conn_id)) {
+		write_unlock_bh(&l2tp_ip_lock);
+		ret = -EADDRINUSE;
+		goto out;
+	}
+
+	sk_dst_reset(sk);
 	l2tp_ip_sk(sk)->conn_id = addr->l2tp_conn_id;
 
-	write_lock_bh(&l2tp_ip_lock);
 	sk_add_bind_node(sk, &l2tp_ip_bind_table);
 	sk_del_node_init(sk);
 	write_unlock_bh(&l2tp_ip_lock);
+
 	ret = 0;
 	sock_reset_flag(sk, SOCK_ZAPPED);
 
@@ -297,11 +299,6 @@ static int l2tp_ip_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	release_sock(sk);
 
 	return ret;
-
-out_in_use:
-	read_unlock_bh(&l2tp_ip_lock);
-
-	return ret;
 }
 
 static int l2tp_ip_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index e3fc778..5f2ae61 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -267,6 +267,7 @@ static int l2tp_ip6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	struct sockaddr_l2tpip6 *addr = (struct sockaddr_l2tpip6 *) uaddr;
 	struct net *net = sock_net(sk);
 	__be32 v4addr = 0;
+	int bound_dev_if;
 	int addr_type;
 	int err;
 
@@ -285,13 +286,6 @@ static int l2tp_ip6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	if (addr_type & IPV6_ADDR_MULTICAST)
 		return -EADDRNOTAVAIL;
 
-	err = -EADDRINUSE;
-	read_lock_bh(&l2tp_ip6_lock);
-	if (__l2tp_ip6_bind_lookup(net, &addr->l2tp_addr,
-				   sk->sk_bound_dev_if, addr->l2tp_conn_id))
-		goto out_in_use;
-	read_unlock_bh(&l2tp_ip6_lock);
-
 	lock_sock(sk);
 
 	err = -EINVAL;
@@ -301,28 +295,25 @@ static int l2tp_ip6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	if (sk->sk_state != TCP_CLOSE)
 		goto out_unlock;
 
+	bound_dev_if = sk->sk_bound_dev_if;
+
 	/* Check if the address belongs to the host. */
 	rcu_read_lock();
 	if (addr_type != IPV6_ADDR_ANY) {
 		struct net_device *dev = NULL;
 
 		if (addr_type & IPV6_ADDR_LINKLOCAL) {
-			if (addr_len >= sizeof(struct sockaddr_in6) &&
-			    addr->l2tp_scope_id) {
-				/* Override any existing binding, if another
-				 * one is supplied by user.
-				 */
-				sk->sk_bound_dev_if = addr->l2tp_scope_id;
-			}
+			if (addr->l2tp_scope_id)
+				bound_dev_if = addr->l2tp_scope_id;
 
 			/* Binding to link-local address requires an
-			   interface */
-			if (!sk->sk_bound_dev_if)
+			 * interface.
+			 */
+			if (!bound_dev_if)
 				goto out_unlock_rcu;
 
 			err = -ENODEV;
-			dev = dev_get_by_index_rcu(sock_net(sk),
-						   sk->sk_bound_dev_if);
+			dev = dev_get_by_index_rcu(sock_net(sk), bound_dev_if);
 			if (!dev)
 				goto out_unlock_rcu;
 		}
@@ -337,13 +328,22 @@ static int l2tp_ip6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	}
 	rcu_read_unlock();
 
-	inet->inet_rcv_saddr = inet->inet_saddr = v4addr;
+	write_lock_bh(&l2tp_ip6_lock);
+	if (__l2tp_ip6_bind_lookup(net, &addr->l2tp_addr, bound_dev_if,
+				   addr->l2tp_conn_id)) {
+		write_unlock_bh(&l2tp_ip6_lock);
+		err = -EADDRINUSE;
+		goto out_unlock;
+	}
+
+	inet->inet_saddr = v4addr;
+	inet->inet_rcv_saddr = v4addr;
+	sk->sk_bound_dev_if = bound_dev_if;
 	sk->sk_v6_rcv_saddr = addr->l2tp_addr;
 	np->saddr = addr->l2tp_addr;
 
 	l2tp_ip6_sk(sk)->conn_id = addr->l2tp_conn_id;
 
-	write_lock_bh(&l2tp_ip6_lock);
 	sk_add_bind_node(sk, &l2tp_ip6_bind_table);
 	sk_del_node_init(sk);
 	write_unlock_bh(&l2tp_ip6_lock);
@@ -356,10 +356,7 @@ static int l2tp_ip6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	rcu_read_unlock();
 out_unlock:
 	release_sock(sk);
-	return err;
 
-out_in_use:
-	read_unlock_bh(&l2tp_ip6_lock);
 	return err;
 }
 
-- 
2.10.2

^ permalink raw reply related

* [PATCH v2 net 4/5] l2tp: fix lookup for sockets not bound to a device in l2tp_ip
From: Guillaume Nault @ 2016-11-29 12:09 UTC (permalink / raw)
  To: netdev; +Cc: James Chapman, Chris Elston
In-Reply-To: <cover.1480419658.git.g.nault@alphalink.fr>

When looking up an l2tp socket, we must consider a null netdevice id as
wild card. There are currently two problems caused by
__l2tp_ip_bind_lookup() not considering 'dif' as wild card when set to 0:

  * A socket bound to a device (i.e. with sk->sk_bound_dev_if != 0)
    never receives any packet. Since __l2tp_ip_bind_lookup() is called
    with dif == 0 in l2tp_ip_recv(), sk->sk_bound_dev_if is always
    different from 'dif' so the socket doesn't match.

  * Two sockets, one bound to a device but not the other, can be bound
    to the same address. If the first socket binding to the address is
    the one that is also bound to a device, the second socket can bind
    to the same address without __l2tp_ip_bind_lookup() noticing the
    overlap.

To fix this issue, we need to consider that any null device index, be
it 'sk->sk_bound_dev_if' or 'dif', matches with any other value.
We also need to pass the input device index to __l2tp_ip_bind_lookup()
on reception so that sockets bound to a device never receive packets
from other devices.

This patch fixes l2tp_ip6 in the same way.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
---
 net/l2tp/l2tp_ip.c  | 6 ++++--
 net/l2tp/l2tp_ip6.c | 7 ++++---
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index b517c33..8938b6b 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -61,7 +61,8 @@ static struct sock *__l2tp_ip_bind_lookup(struct net *net, __be32 laddr, int dif
 		if ((l2tp->conn_id == tunnel_id) &&
 		    net_eq(sock_net(sk), net) &&
 		    !(inet->inet_rcv_saddr && inet->inet_rcv_saddr != laddr) &&
-		    !(sk->sk_bound_dev_if && sk->sk_bound_dev_if != dif))
+		    (!sk->sk_bound_dev_if || !dif ||
+		     sk->sk_bound_dev_if == dif))
 			goto found;
 	}
 
@@ -182,7 +183,8 @@ static int l2tp_ip_recv(struct sk_buff *skb)
 		struct iphdr *iph = (struct iphdr *) skb_network_header(skb);
 
 		read_lock_bh(&l2tp_ip_lock);
-		sk = __l2tp_ip_bind_lookup(net, iph->daddr, 0, tunnel_id);
+		sk = __l2tp_ip_bind_lookup(net, iph->daddr, inet_iif(skb),
+					   tunnel_id);
 		if (!sk) {
 			read_unlock_bh(&l2tp_ip_lock);
 			goto discard;
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 5f2ae61..4a86440 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -73,7 +73,8 @@ static struct sock *__l2tp_ip6_bind_lookup(struct net *net,
 		if ((l2tp->conn_id == tunnel_id) &&
 		    net_eq(sock_net(sk), net) &&
 		    !(addr && ipv6_addr_equal(addr, laddr)) &&
-		    !(sk->sk_bound_dev_if && sk->sk_bound_dev_if != dif))
+		    (!sk->sk_bound_dev_if || !dif ||
+		     sk->sk_bound_dev_if == dif))
 			goto found;
 	}
 
@@ -196,8 +197,8 @@ static int l2tp_ip6_recv(struct sk_buff *skb)
 		struct ipv6hdr *iph = ipv6_hdr(skb);
 
 		read_lock_bh(&l2tp_ip6_lock);
-		sk = __l2tp_ip6_bind_lookup(net, &iph->daddr,
-					    0, tunnel_id);
+		sk = __l2tp_ip6_bind_lookup(net, &iph->daddr, inet6_iif(skb),
+					    tunnel_id);
 		if (!sk) {
 			read_unlock_bh(&l2tp_ip6_lock);
 			goto discard;
-- 
2.10.2

^ permalink raw reply related

* [PATCH v2 net 1/5] l2tp: lock socket before checking flags in connect()
From: Guillaume Nault @ 2016-11-29 12:09 UTC (permalink / raw)
  To: netdev; +Cc: James Chapman, Chris Elston
In-Reply-To: <cover.1480419658.git.g.nault@alphalink.fr>

Socket flags aren't updated atomically, so the socket must be locked
while reading the SOCK_ZAPPED flag.

This issue exists for both l2tp_ip and l2tp_ip6. For IPv6, this patch
also brings error handling for __ip6_datagram_connect() failures.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
---
 include/net/ipv6.h  |  2 ++
 net/ipv6/datagram.c |  4 +++-
 net/l2tp/l2tp_ip.c  | 19 ++++++++++++-------
 net/l2tp/l2tp_ip6.c | 16 +++++++++++-----
 4 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 8fed1cd..f11ca83 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -970,6 +970,8 @@ int compat_ipv6_setsockopt(struct sock *sk, int level, int optname,
 int compat_ipv6_getsockopt(struct sock *sk, int level, int optname,
 			   char __user *optval, int __user *optlen);
 
+int __ip6_datagram_connect(struct sock *sk, struct sockaddr *addr,
+			   int addr_len);
 int ip6_datagram_connect(struct sock *sk, struct sockaddr *addr, int addr_len);
 int ip6_datagram_connect_v6_only(struct sock *sk, struct sockaddr *addr,
 				 int addr_len);
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 37874e2..ccf4055 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -139,7 +139,8 @@ void ip6_datagram_release_cb(struct sock *sk)
 }
 EXPORT_SYMBOL_GPL(ip6_datagram_release_cb);
 
-static int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
+int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr,
+			   int addr_len)
 {
 	struct sockaddr_in6	*usin = (struct sockaddr_in6 *) uaddr;
 	struct inet_sock	*inet = inet_sk(sk);
@@ -252,6 +253,7 @@ static int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int a
 out:
 	return err;
 }
+EXPORT_SYMBOL_GPL(__ip6_datagram_connect);
 
 int ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 {
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 982f6c4..1f57094 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -308,21 +308,24 @@ static int l2tp_ip_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
 	struct sockaddr_l2tpip *lsa = (struct sockaddr_l2tpip *) uaddr;
 	int rc;
 
-	if (sock_flag(sk, SOCK_ZAPPED)) /* Must bind first - autobinding does not work */
-		return -EINVAL;
-
 	if (addr_len < sizeof(*lsa))
 		return -EINVAL;
 
 	if (ipv4_is_multicast(lsa->l2tp_addr.s_addr))
 		return -EINVAL;
 
-	rc = ip4_datagram_connect(sk, uaddr, addr_len);
-	if (rc < 0)
-		return rc;
-
 	lock_sock(sk);
 
+	/* Must bind first - autobinding does not work */
+	if (sock_flag(sk, SOCK_ZAPPED)) {
+		rc = -EINVAL;
+		goto out_sk;
+	}
+
+	rc = __ip4_datagram_connect(sk, uaddr, addr_len);
+	if (rc < 0)
+		goto out_sk;
+
 	l2tp_ip_sk(sk)->peer_conn_id = lsa->l2tp_conn_id;
 
 	write_lock_bh(&l2tp_ip_lock);
@@ -330,7 +333,9 @@ static int l2tp_ip_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
 	sk_add_bind_node(sk, &l2tp_ip_bind_table);
 	write_unlock_bh(&l2tp_ip_lock);
 
+out_sk:
 	release_sock(sk);
+
 	return rc;
 }
 
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 9978d01..af9abff 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -371,9 +371,6 @@ static int l2tp_ip6_connect(struct sock *sk, struct sockaddr *uaddr,
 	int	addr_type;
 	int rc;
 
-	if (sock_flag(sk, SOCK_ZAPPED)) /* Must bind first - autobinding does not work */
-		return -EINVAL;
-
 	if (addr_len < sizeof(*lsa))
 		return -EINVAL;
 
@@ -390,10 +387,18 @@ static int l2tp_ip6_connect(struct sock *sk, struct sockaddr *uaddr,
 			return -EINVAL;
 	}
 
-	rc = ip6_datagram_connect(sk, uaddr, addr_len);
-
 	lock_sock(sk);
 
+	 /* Must bind first - autobinding does not work */
+	if (sock_flag(sk, SOCK_ZAPPED)) {
+		rc = -EINVAL;
+		goto out_sk;
+	}
+
+	rc = __ip6_datagram_connect(sk, uaddr, addr_len);
+	if (rc < 0)
+		goto out_sk;
+
 	l2tp_ip6_sk(sk)->peer_conn_id = lsa->l2tp_conn_id;
 
 	write_lock_bh(&l2tp_ip6_lock);
@@ -401,6 +406,7 @@ static int l2tp_ip6_connect(struct sock *sk, struct sockaddr *uaddr,
 	sk_add_bind_node(sk, &l2tp_ip6_bind_table);
 	write_unlock_bh(&l2tp_ip6_lock);
 
+out_sk:
 	release_sock(sk);
 
 	return rc;
-- 
2.10.2

^ permalink raw reply related

* [PATCH v2 net 0/5] l2tp: fixes for l2tp_ip and l2tp_ip6 socket handling
From: Guillaume Nault @ 2016-11-29 12:09 UTC (permalink / raw)
  To: netdev; +Cc: James Chapman, Chris Elston

This series addresses problems found while working on commit 32c231164b76
("l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()").

The first three patches fix races in socket's connect, recv and bind
operations. The last two ones fix scenarios where l2tp fails to
correctly lookup its userspace sockets.

Apart from the last patch, which is l2tp_ip6 specific, every patch
fixes the same problem in the L2TP IPv4 and IPv6 code.

All problems fixed by this series exist since the creation of the
l2tp_ip and l2tp_ip6 modules.

Changes since v1:
  * Patch #3: fix possible uninitialised use of 'ret' in l2tp_ip_bind().


Guillaume Nault (5):
  l2tp: lock socket before checking flags in connect()
  l2tp: hold socket before dropping lock in l2tp_ip{,6}_recv()
  l2tp: fix racy socket lookup in l2tp_ip and l2tp_ip6 bind()
  l2tp: fix lookup for sockets not bound to a device in l2tp_ip
  l2tp: fix address test in __l2tp_ip6_bind_lookup()

 include/net/ipv6.h  |  2 ++
 net/ipv6/datagram.c |  4 ++-
 net/l2tp/l2tp_ip.c  | 63 ++++++++++++++++++++++--------------------
 net/l2tp/l2tp_ip6.c | 79 ++++++++++++++++++++++++++++-------------------------
 4 files changed, 81 insertions(+), 67 deletions(-)

-- 
2.10.2

^ permalink raw reply

* re
From: Mrs. Maria-Elisabeth Schaeffler @ 2016-11-29 11:42 UTC (permalink / raw)




Did you get my request?

^ permalink raw reply

* [PATCH net-next] net: thunderx: Fix transmit queue timeout issue
From: sunil.kovvuri @ 2016-11-29 11:40 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, linux-arm-kernel, Sunil Goutham

From: Sunil Goutham <sgoutham@cavium.com>

Transmit queue timeout issue is seen in two cases
- Due to a race condition btw setting stop_queue at xmit()
  and checking for stopped_queue in NAPI poll routine, at times
  transmission from a SQ comes to a halt. This is fixed
  by using barriers and also added a check for SQ free descriptors,
  incase SQ is stopped and there are only CQE_RX i.e no CQE_TX.
- Contrary to an assumption, a HW errata where HW doesn't stop transmission
  even though there are not enough CQEs available for a CQE_TX is
  not fixed in T88 pass 2.x. This results in a Qset error with
  'CQ_WR_FULL' stalling transmission. This is fixed by adjusting
  RXQ's  RED levels for CQ level such that there is always enough
  space left for CQE_TXs.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
---
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   | 52 ++++++++++++++++++----
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 24 ++--------
 drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 15 ++++---
 3 files changed, 54 insertions(+), 37 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 1eacec8..ced1802 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -644,6 +644,7 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 	struct cmp_queue *cq = &qs->cq[cq_idx];
 	struct cqe_rx_t *cq_desc;
 	struct netdev_queue *txq;
+	struct snd_queue *sq;
 	unsigned int tx_pkts = 0, tx_bytes = 0;
 
 	spin_lock_bh(&cq->lock);
@@ -709,16 +710,20 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 
 done:
 	/* Wakeup TXQ if its stopped earlier due to SQ full */
-	if (tx_done) {
+	sq = &nic->qs->sq[cq_idx];
+	if (tx_done ||
+	    (atomic_read(&sq->free_cnt) >= MIN_SQ_DESC_PER_PKT_XMIT)) {
 		netdev = nic->pnicvf->netdev;
 		txq = netdev_get_tx_queue(netdev,
 					  nicvf_netdev_qidx(nic, cq_idx));
 		if (tx_pkts)
 			netdev_tx_completed_queue(txq, tx_pkts, tx_bytes);
 
-		nic = nic->pnicvf;
+		/* To read updated queue and carrier status */
+		smp_mb();
 		if (netif_tx_queue_stopped(txq) && netif_carrier_ok(netdev)) {
-			netif_tx_start_queue(txq);
+			netif_tx_wake_queue(txq);
+			nic = nic->pnicvf;
 			this_cpu_inc(nic->drv_stats->txq_wake);
 			if (netif_msg_tx_err(nic))
 				netdev_warn(netdev,
@@ -1054,6 +1059,9 @@ static netdev_tx_t nicvf_xmit(struct sk_buff *skb, struct net_device *netdev)
 	struct nicvf *nic = netdev_priv(netdev);
 	int qid = skb_get_queue_mapping(skb);
 	struct netdev_queue *txq = netdev_get_tx_queue(netdev, qid);
+	struct nicvf *snic;
+	struct snd_queue *sq;
+	int tmp;
 
 	/* Check for minimum packet length */
 	if (skb->len <= ETH_HLEN) {
@@ -1061,13 +1069,39 @@ static netdev_tx_t nicvf_xmit(struct sk_buff *skb, struct net_device *netdev)
 		return NETDEV_TX_OK;
 	}
 
-	if (!netif_tx_queue_stopped(txq) && !nicvf_sq_append_skb(nic, skb)) {
+	snic = nic;
+	/* Get secondary Qset's SQ structure */
+	if (qid >= MAX_SND_QUEUES_PER_QS) {
+		tmp = qid / MAX_SND_QUEUES_PER_QS;
+		snic = (struct nicvf *)nic->snicvf[tmp - 1];
+		if (!snic) {
+			netdev_warn(nic->netdev,
+				    "Secondary Qset#%d's ptr not initialized\n",
+				    tmp - 1);
+			dev_kfree_skb(skb);
+			return NETDEV_TX_OK;
+		}
+		qid = qid % MAX_SND_QUEUES_PER_QS;
+	}
+
+	sq = &snic->qs->sq[qid];
+	if (!netif_tx_queue_stopped(txq) &&
+	    !nicvf_sq_append_skb(snic, sq, skb, qid)) {
 		netif_tx_stop_queue(txq);
-		this_cpu_inc(nic->drv_stats->txq_stop);
-		if (netif_msg_tx_err(nic))
-			netdev_warn(netdev,
-				    "%s: Transmit ring full, stopping SQ%d\n",
-				    netdev->name, qid);
+
+		/* Barrier, so that stop_queue visible to other cpus */
+		smp_mb();
+
+		/* Check again, incase another cpu freed descriptors */
+		if (atomic_read(&sq->free_cnt) > MIN_SQ_DESC_PER_PKT_XMIT) {
+			netif_tx_start_queue(txq);
+		} else {
+			this_cpu_inc(nic->drv_stats->txq_stop);
+			if (netif_msg_tx_err(nic))
+				netdev_warn(netdev,
+					    "%s: Transmit ring full, stopping SQ%d\n",
+					    netdev->name, qid);
+		}
 		return NETDEV_TX_BUSY;
 	}
 
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 7b336cd..d2ac133 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -1190,30 +1190,12 @@ static int nicvf_sq_append_tso(struct nicvf *nic, struct snd_queue *sq,
 }
 
 /* Append an skb to a SQ for packet transfer. */
-int nicvf_sq_append_skb(struct nicvf *nic, struct sk_buff *skb)
+int nicvf_sq_append_skb(struct nicvf *nic, struct snd_queue *sq,
+			struct sk_buff *skb, u8 sq_num)
 {
 	int i, size;
 	int subdesc_cnt, tso_sqe = 0;
-	int sq_num, qentry;
-	struct queue_set *qs;
-	struct snd_queue *sq;
-
-	sq_num = skb_get_queue_mapping(skb);
-	if (sq_num >= MAX_SND_QUEUES_PER_QS) {
-		/* Get secondary Qset's SQ structure */
-		i = sq_num / MAX_SND_QUEUES_PER_QS;
-		if (!nic->snicvf[i - 1]) {
-			netdev_warn(nic->netdev,
-				    "Secondary Qset#%d's ptr not initialized\n",
-				    i - 1);
-			return 1;
-		}
-		nic = (struct nicvf *)nic->snicvf[i - 1];
-		sq_num = sq_num % MAX_SND_QUEUES_PER_QS;
-	}
-
-	qs = nic->qs;
-	sq = &qs->sq[sq_num];
+	int qentry;
 
 	subdesc_cnt = nicvf_sq_subdesc_required(nic, skb);
 	if (subdesc_cnt > atomic_read(&sq->free_cnt))
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index 20511f2..9e21046 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -88,13 +88,13 @@
 
 /* RED and Backpressure levels of CQ for pkt reception
  * For CQ, level is a measure of emptiness i.e 0x0 means full
- * eg: For CQ of size 4K, and for pass/drop levels of 128/96
- * HW accepts pkt if unused CQE >= 2048
- * RED accepts pkt if unused CQE < 2048 & >= 1536
- * DROPs pkts if unused CQE < 1536
+ * eg: For CQ of size 4K, and for pass/drop levels of 160/144
+ * HW accepts pkt if unused CQE >= 2560
+ * RED accepts pkt if unused CQE < 2304 & >= 2560
+ * DROPs pkts if unused CQE < 2304
  */
-#define RQ_PASS_CQ_LVL		128ULL
-#define RQ_DROP_CQ_LVL		96ULL
+#define RQ_PASS_CQ_LVL		160ULL
+#define RQ_DROP_CQ_LVL		144ULL
 
 /* RED and Backpressure levels of RBDR for pkt reception
  * For RBDR, level is a measure of fullness i.e 0x0 means empty
@@ -306,7 +306,8 @@ void nicvf_sq_disable(struct nicvf *nic, int qidx);
 void nicvf_put_sq_desc(struct snd_queue *sq, int desc_cnt);
 void nicvf_sq_free_used_descs(struct net_device *netdev,
 			      struct snd_queue *sq, int qidx);
-int nicvf_sq_append_skb(struct nicvf *nic, struct sk_buff *skb);
+int nicvf_sq_append_skb(struct nicvf *nic, struct snd_queue *sq,
+			struct sk_buff *skb, u8 sq_num);
 
 struct sk_buff *nicvf_get_rcv_skb(struct nicvf *nic, struct cqe_rx_t *cqe_rx);
 void nicvf_rbdr_task(unsigned long data);
-- 
2.7.4

^ permalink raw reply related

* [PATCH net] cxgb4: Add PCI device ID for new adapter
From: Hariprasad Shenai @ 2016-11-29 11:44 UTC (permalink / raw)
  To: netdev; +Cc: davem, leedom, swise, nirranjan, Hariprasad Shenai

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h b/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h
index df1573c4a659..ecf3ccc257bc 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h
@@ -168,6 +168,7 @@ CH_PCI_DEVICE_ID_TABLE_DEFINE_BEGIN
 	CH_PCI_ID_TABLE_FENTRY(0x509a),	/* Custom T520-CR */
 	CH_PCI_ID_TABLE_FENTRY(0x509b),	/* Custom T540-CR LOM */
 	CH_PCI_ID_TABLE_FENTRY(0x509c),	/* Custom T520-CR*/
+	CH_PCI_ID_TABLE_FENTRY(0x509d),	/* Custom T540-CR*/
 
 	/* T6 adapters:
 	 */
-- 
2.3.4

^ permalink raw reply related

* Re: [PATCH net V2] net/sched: pedit: make sure that offset is valid
From: Amir Vadai @ 2016-11-29 11:03 UTC (permalink / raw)
  To: zhuyj
  Cc: David S. Miller, netdev, Cong Wang, Jamal Hadi Salim, Or Gerlitz,
	Hadar Har-Zion, Jiri Pirko
In-Reply-To: <CAD=hENc11HmXdAgZ+CNk54s9vORATOJAr9Ex8yybDcwHK4dSfQ@mail.gmail.com>

On Tue, Nov 29, 2016 at 05:27:05PM +0800, zhuyj wrote:
>  Thanks a lot.
> When will offset become -1?
offset is supplied by userspace. For example iproute2 tc tool.
It is valid to supply a negative value, since offset is relative to the
networking layer, one might want to edit a MAC layer field.

Please use bottom posting when replying to mails...

> 
> On Tue, Nov 29, 2016 at 3:14 PM, Amir Vadai <amir@vadai.me> wrote:
> > On Tue, Nov 29, 2016 at 10:32:05AM +0800, zhuyj wrote:
> >>  +       if (offset > 0 && offset > skb->len)
> >>
> >> offset > skb->len is enough?
> > offset is signed and skb->len is unsigned. Therefore for example if
> > offset=-1 and skb->len=10, the actual comparison is 0xff...>10
> >
> >>
> >> On Mon, Nov 28, 2016 at 6:56 PM, Amir Vadai <amir@vadai.me> wrote:
> >> > Add a validation function to make sure offset is valid:
> >> > 1. Not below skb head (could happen when offset is negative).
> >> > 2. Validate both 'offset' and 'at'.
> >> >
> >> > Signed-off-by: Amir Vadai <amir@vadai.me>
> >> > ---
> >> > Hi Dave,
> >> >
> >> > Please pull to -stable branches.
> >> >
> >> > Changes from V0:
> >> > - Add a validation to the 'at' value (this is used as an offset too)
> >> > - Instead of validating the output of skb_header_pointer(), make sure that the
> >> >         offset is good before calling it.
> >> >
> >> > Thanks,
> >> > Amir
> >> >  net/sched/act_pedit.c | 24 ++++++++++++++++++++----
> >> >  1 file changed, 20 insertions(+), 4 deletions(-)
> >> >
> >> > diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> >> > index b54d56d4959b..cf9b2fe8eac6 100644
> >> > --- a/net/sched/act_pedit.c
> >> > +++ b/net/sched/act_pedit.c
> >> > @@ -108,6 +108,17 @@ static void tcf_pedit_cleanup(struct tc_action *a, int bind)
> >> >         kfree(keys);
> >> >  }
> >> >
> >> > +static bool offset_valid(struct sk_buff *skb, int offset)
> >> > +{
> >> > +       if (offset > 0 && offset > skb->len)
> >> > +               return false;
> >> > +
> >> > +       if  (offset < 0 && -offset > skb_headroom(skb))
> >> > +               return false;
> >> > +
> >> > +       return true;
> >> > +}
> >> > +
> >> >  static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
> >> >                      struct tcf_result *res)
> >> >  {
> >> > @@ -134,6 +145,11 @@ static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
> >> >                         if (tkey->offmask) {
> >> >                                 char *d, _d;
> >> >
> >> > +                               if (!offset_valid(skb, off + tkey->at)) {
> >> > +                                       pr_info("tc filter pedit 'at' offset %d out of bounds\n",
> >> > +                                               off + tkey->at);
> >> > +                                       goto bad;
> >> > +                               }
> >> >                                 d = skb_header_pointer(skb, off + tkey->at, 1,
> >> >                                                        &_d);
> >> >                                 if (!d)
> >> > @@ -146,10 +162,10 @@ static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
> >> >                                         " offset must be on 32 bit boundaries\n");
> >> >                                 goto bad;
> >> >                         }
> >> > -                       if (offset > 0 && offset > skb->len) {
> >> > -                               pr_info("tc filter pedit"
> >> > -                                       " offset %d can't exceed pkt length %d\n",
> >> > -                                      offset, skb->len);
> >> > +
> >> > +                       if (!offset_valid(skb, off + offset)) {
> >> > +                               pr_info("tc filter pedit offset %d out of bounds\n",
> >> > +                                       offset);
> >> >                                 goto bad;
> >> >                         }
> >> >
> >> > --
> >> > 2.10.2
> >> >

^ permalink raw reply

* Re: [PATCH v3 3/5] net: asix: Fix AX88772x resume failures
From: Jon Hunter @ 2016-11-29 10:51 UTC (permalink / raw)
  To: allan, robert.foss, freddy, Dean_Jenkins, Mark_Craske, davem,
	ivecera, john.stultz, vpalatin, stephen, grundler, changchias,
	andrew, tremyfr, colin.king, linux-usb, netdev, linux-kernel,
	vpalatin
In-Reply-To: <014301d24a1e$34513fe0$9cf3bfa0$@asix.com.tw>

Hi Allan,

On 29/11/16 08:54, ASIX_Allan [Office] wrote:
> Dear Jon ,
> 
> We can reproduce your issue on x86 Linux kernel 4.9.0-rc system in our site
> and modified the following code can fix this issue. Please let us know if
> you still have problems. Thanks a lot.
> 
> static void ax88772_suspend(struct usbnet *dev)
> {
>         struct asix_common_private *priv = dev->driver_priv;
>         u16 medium;
> 
>         /* Stop MAC operation */
> -       medium = asix_read_medium_status(dev, 0);
> +      medium = asix_read_medium_status(dev, 1);
>         medium &= ~AX_MEDIUM_RE;
> -       asix_write_medium_mode(dev, medium, 0);
> +      asix_write_medium_mode(dev, medium, 1);
> 
>         netdev_dbg(dev->net, "ax88772_suspend: medium=0x%04x\n",
> -                  asix_read_medium_status(dev, 0));
> +                 asix_read_medium_status(dev, 1));
> 
>         /* Preserve BMCR for restoring */
>         priv->presvd_phy_bmcr =
>                 asix_mdio_read_nopm(dev->net, dev->mii.phy_id, MII_BMCR);
> 
>         /* Preserve ANAR for restoring */
>         priv->presvd_phy_advertise =
>                 asix_mdio_read_nopm(dev->net, dev->mii.phy_id,
> MII_ADVERTISE);
> } 

I gave this a quick test this morning and I can confirm that with the
above change I no longer see the error messages. So feel free to add my ...

Tested-by: Jon Hunter <jonathanh@nvidia.com>

Cheers
Jon

-- 
nvpublic

^ permalink raw reply

* Re: [PATCH net-next] cgroup, bpf: remove unnecessary #include
From: Daniel Borkmann @ 2016-11-29 10:48 UTC (permalink / raw)
  To: Alexei Starovoitov, David S . Miller; +Cc: Daniel Mack, Tejun Heo, netdev
In-Reply-To: <1480145027-3524392-1-git-send-email-ast@fb.com>

On 11/26/2016 08:23 AM, Alexei Starovoitov wrote:
> this #include is unnecessary and brings whole set of
> other headers into cgroup-defs.h. Remove it.
>
> Fixes: 3007098494be ("cgroup: add support for eBPF programs")
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>

This fixes many build errors in samples/bpf/ due to wrong helper
redefinitions (originating from kernel includes conflicting with
samples' helper declarations).

I don't see it pushed out to net-next yet, so:

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

^ permalink raw reply

* Re: [PATCH v3 net-next 2/6] net: mvneta: Use cacheable memory to store the rx buffer virtual address
From: Gregory CLEMENT @ 2016-11-29 10:39 UTC (permalink / raw)
  To: Marcin Wojtas
  Cc: David S. Miller, linux-kernel, netdev, Jisheng Zhang,
	Arnd Bergmann, Jason Cooper, Andrew Lunn, Sebastian Hesselbarth,
	Thomas Petazzoni, linux-arm-kernel@lists.infradead.org,
	Nadav Haklai, Dmitri Epshtein, Yelena Krivosheev
In-Reply-To: <CAPv3WKetQf_tqva2jaF1y79EWxDo3=yuvbK3jp2uFOqQSGweyA@mail.gmail.com>

Hi Marcin,
 
 On mar., nov. 29 2016, Marcin Wojtas <mw@semihalf.com> wrote:

> Gregory,
>
> 2016-11-29 11:19 GMT+01:00 Gregory CLEMENT <gregory.clement@free-electrons.com>:
>> Hi Marcin,
>>
>>  On mar., nov. 29 2016, Marcin Wojtas <mw@semihalf.com> wrote:
>>
>>> Hi Gregory,
>>>
>>> Another remark below, sorry for noise.
>>>
>>> 2016-11-29 10:37 GMT+01:00 Gregory CLEMENT <gregory.clement@free-electrons.com>:
>>>> Until now the virtual address of the received buffer were stored in the
>>>> cookie field of the rx descriptor. However, this field is 32-bits only
>>>> which prevents to use the driver on a 64-bits architecture.
>>>>
>>>> With this patch the virtual address is stored in an array not shared with
>>>> the hardware (no more need to use the DMA API). Thanks to this, it is
>>>> possible to use cache contrary to the access of the rx descriptor member.
>>>>
>>>> The change is done in the swbm path only because the hwbm uses the cookie
>>>> field, this also means that currently the hwbm is not usable in 64-bits.
>>>>
>>>> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
>>>> ---
>>>>  drivers/net/ethernet/marvell/mvneta.c | 93 ++++++++++++++++++++++++----
>>>>  1 file changed, 81 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
>>>> index 1b84f746d748..32b142d0e44e 100644
>>>> --- a/drivers/net/ethernet/marvell/mvneta.c
>>>> +++ b/drivers/net/ethernet/marvell/mvneta.c
>>>> @@ -561,6 +561,9 @@ struct mvneta_rx_queue {
>>>>         u32 pkts_coal;
>>>>         u32 time_coal;
>>>>
>>>> +       /* Virtual address of the RX buffer */
>>>> +       void  **buf_virt_addr;
>>>> +
>>>>         /* Virtual address of the RX DMA descriptors array */
>>>>         struct mvneta_rx_desc *descs;
>>>>
>>>> @@ -1573,10 +1576,14 @@ static void mvneta_tx_done_pkts_coal_set(struct mvneta_port *pp,
>>>>
>>>>  /* Handle rx descriptor fill by setting buf_cookie and buf_phys_addr */
>>>>  static void mvneta_rx_desc_fill(struct mvneta_rx_desc *rx_desc,
>>>> -                               u32 phys_addr, u32 cookie)
>>>> +                               u32 phys_addr, void *virt_addr,
>>>> +                               struct mvneta_rx_queue *rxq)
>>>>  {
>>>> -       rx_desc->buf_cookie = cookie;
>>>> +       int i;
>>>> +
>>>>         rx_desc->buf_phys_addr = phys_addr;
>>>> +       i = rx_desc - rxq->descs;
>>>> +       rxq->buf_virt_addr[i] = virt_addr;
>>>>  }
>>>>
>>>>  /* Decrement sent descriptors counter */
>>>> @@ -1781,7 +1788,8 @@ EXPORT_SYMBOL_GPL(mvneta_frag_free);
>>>>
>>>>  /* Refill processing for SW buffer management */
>>>>  static int mvneta_rx_refill(struct mvneta_port *pp,
>>>> -                           struct mvneta_rx_desc *rx_desc)
>>>> +                           struct mvneta_rx_desc *rx_desc,
>>>> +                           struct mvneta_rx_queue *rxq)
>>>>
>>>>  {
>>>>         dma_addr_t phys_addr;
>>>> @@ -1799,7 +1807,7 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
>>>>                 return -ENOMEM;
>>>>         }
>>>>
>>>> -       mvneta_rx_desc_fill(rx_desc, phys_addr, (u32)data);
>>>> +       mvneta_rx_desc_fill(rx_desc, phys_addr, data, rxq);
>>>>         return 0;
>>>>  }
>>>>
>>>> @@ -1861,7 +1869,12 @@ static void mvneta_rxq_drop_pkts(struct mvneta_port *pp,
>>>>
>>>>         for (i = 0; i < rxq->size; i++) {
>>>>                 struct mvneta_rx_desc *rx_desc = rxq->descs + i;
>>>> -               void *data = (void *)rx_desc->buf_cookie;
>>>> +               void *data;
>>>> +
>>>> +               if (!pp->bm_priv)
>>>> +                       data = rxq->buf_virt_addr[i];
>>>> +               else
>>>> +                       data = (void *)(uintptr_t)rx_desc->buf_cookie;
>>>
>>> Dropping packets for HWBM (in fact returning dropped buffers to the
>>> pool) is done a couple of lines above. This point will never be
>>
>> indeed I changed the code at every place the buf_cookie was used and
>> missed the fact that for HWBM this code was never reached.
>>
>>> reached with HWBM enabled (and it's also incorrect).
>>
>> What is incorrect?
>>
>
> Possible dma_unmapping + mvneta_frag_free for buffers in HWBM, when
> dropping packets.

Yes sure, but as you mentioned this code is never reached when HWBM is
enabled. I thought there was other part of the code to fix.

Thanks,

Gregory

>
> Thanks,
> Marcin

-- 
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com

^ permalink raw reply

* Re: [PATCH  v2 12/13] net: ethernet: ti: cpts: calc mult and shift from refclk freq
From: Richard Cochran @ 2016-11-29 10:34 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: David S. Miller, netdev-u79uwXL29TY76Z2rM5mHXA, Mugunthan V N,
	Sekhar Nori, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Rob Herring,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Murali Karicheri, Wingman Kwok,
	John Stultz, Thomas Gleixner
In-Reply-To: <20161128230337.6731-13-grygorii.strashko-l0cyMroinI0@public.gmane.org>

On Mon, Nov 28, 2016 at 05:03:36PM -0600, Grygorii Strashko wrote:
> +static void cpts_calc_mult_shift(struct cpts *cpts)
> +{
> +	u64 frac, maxsec, ns;
> +	u32 freq, mult, shift;
> +
> +	freq = clk_get_rate(cpts->refclk);
> +
> +	/* Calc the maximum number of seconds which we can run before
> +	 * wrapping around.
> +	 */
> +	maxsec = cpts->cc.mask;
> +	do_div(maxsec, freq);
> +	if (maxsec > 600 && cpts->cc.mask > UINT_MAX)
> +		maxsec = 600;

The reason for this test is not obvious.  Why check cc.mask against
UINT_MAX?  Please use the comment to explain it.

Thanks,
Richard
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v3 net-next 2/6] net: mvneta: Use cacheable memory to store the rx buffer virtual address
From: Marcin Wojtas @ 2016-11-29 10:34 UTC (permalink / raw)
  To: Gregory CLEMENT
  Cc: David S. Miller, linux-kernel, netdev, Jisheng Zhang,
	Arnd Bergmann, Jason Cooper, Andrew Lunn, Sebastian Hesselbarth,
	Thomas Petazzoni, linux-arm-kernel@lists.infradead.org,
	Nadav Haklai, Dmitri Epshtein, Yelena Krivosheev
In-Reply-To: <87shqafv07.fsf@free-electrons.com>

Gregory,

2016-11-29 11:19 GMT+01:00 Gregory CLEMENT <gregory.clement@free-electrons.com>:
> Hi Marcin,
>
>  On mar., nov. 29 2016, Marcin Wojtas <mw@semihalf.com> wrote:
>
>> Hi Gregory,
>>
>> Another remark below, sorry for noise.
>>
>> 2016-11-29 10:37 GMT+01:00 Gregory CLEMENT <gregory.clement@free-electrons.com>:
>>> Until now the virtual address of the received buffer were stored in the
>>> cookie field of the rx descriptor. However, this field is 32-bits only
>>> which prevents to use the driver on a 64-bits architecture.
>>>
>>> With this patch the virtual address is stored in an array not shared with
>>> the hardware (no more need to use the DMA API). Thanks to this, it is
>>> possible to use cache contrary to the access of the rx descriptor member.
>>>
>>> The change is done in the swbm path only because the hwbm uses the cookie
>>> field, this also means that currently the hwbm is not usable in 64-bits.
>>>
>>> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
>>> ---
>>>  drivers/net/ethernet/marvell/mvneta.c | 93 ++++++++++++++++++++++++----
>>>  1 file changed, 81 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
>>> index 1b84f746d748..32b142d0e44e 100644
>>> --- a/drivers/net/ethernet/marvell/mvneta.c
>>> +++ b/drivers/net/ethernet/marvell/mvneta.c
>>> @@ -561,6 +561,9 @@ struct mvneta_rx_queue {
>>>         u32 pkts_coal;
>>>         u32 time_coal;
>>>
>>> +       /* Virtual address of the RX buffer */
>>> +       void  **buf_virt_addr;
>>> +
>>>         /* Virtual address of the RX DMA descriptors array */
>>>         struct mvneta_rx_desc *descs;
>>>
>>> @@ -1573,10 +1576,14 @@ static void mvneta_tx_done_pkts_coal_set(struct mvneta_port *pp,
>>>
>>>  /* Handle rx descriptor fill by setting buf_cookie and buf_phys_addr */
>>>  static void mvneta_rx_desc_fill(struct mvneta_rx_desc *rx_desc,
>>> -                               u32 phys_addr, u32 cookie)
>>> +                               u32 phys_addr, void *virt_addr,
>>> +                               struct mvneta_rx_queue *rxq)
>>>  {
>>> -       rx_desc->buf_cookie = cookie;
>>> +       int i;
>>> +
>>>         rx_desc->buf_phys_addr = phys_addr;
>>> +       i = rx_desc - rxq->descs;
>>> +       rxq->buf_virt_addr[i] = virt_addr;
>>>  }
>>>
>>>  /* Decrement sent descriptors counter */
>>> @@ -1781,7 +1788,8 @@ EXPORT_SYMBOL_GPL(mvneta_frag_free);
>>>
>>>  /* Refill processing for SW buffer management */
>>>  static int mvneta_rx_refill(struct mvneta_port *pp,
>>> -                           struct mvneta_rx_desc *rx_desc)
>>> +                           struct mvneta_rx_desc *rx_desc,
>>> +                           struct mvneta_rx_queue *rxq)
>>>
>>>  {
>>>         dma_addr_t phys_addr;
>>> @@ -1799,7 +1807,7 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
>>>                 return -ENOMEM;
>>>         }
>>>
>>> -       mvneta_rx_desc_fill(rx_desc, phys_addr, (u32)data);
>>> +       mvneta_rx_desc_fill(rx_desc, phys_addr, data, rxq);
>>>         return 0;
>>>  }
>>>
>>> @@ -1861,7 +1869,12 @@ static void mvneta_rxq_drop_pkts(struct mvneta_port *pp,
>>>
>>>         for (i = 0; i < rxq->size; i++) {
>>>                 struct mvneta_rx_desc *rx_desc = rxq->descs + i;
>>> -               void *data = (void *)rx_desc->buf_cookie;
>>> +               void *data;
>>> +
>>> +               if (!pp->bm_priv)
>>> +                       data = rxq->buf_virt_addr[i];
>>> +               else
>>> +                       data = (void *)(uintptr_t)rx_desc->buf_cookie;
>>
>> Dropping packets for HWBM (in fact returning dropped buffers to the
>> pool) is done a couple of lines above. This point will never be
>
> indeed I changed the code at every place the buf_cookie was used and
> missed the fact that for HWBM this code was never reached.
>
>> reached with HWBM enabled (and it's also incorrect).
>
> What is incorrect?
>

Possible dma_unmapping + mvneta_frag_free for buffers in HWBM, when
dropping packets.

Thanks,
Marcin

^ permalink raw reply

* Re: net: GPF in eth_header
From: Andrey Konovalov @ 2016-11-29 10:26 UTC (permalink / raw)
  To: syzkaller
  Cc: Dmitry Vyukov, David Miller, Tom Herbert, Alexander Duyck,
	Hannes Frederic Sowa, Jiri Benc, Sabrina Dubroca, netdev, LKML
In-Reply-To: <CACPEyM-4dwvmEvq3uTN_8Y-YosP8qA3cYV+oxkufFOn8M+=hew@mail.gmail.com>

On Sat, Nov 26, 2016 at 9:05 PM, Eric Dumazet <erdlkml@gmail.com> wrote:
>> I actually see multiple places where skb_network_offset() is used as
>> an argument to skb_pull().
>> So I guess every place can potentially be buggy.
>
> Well, I think the intent is to accept a negative number.

I'm not sure that was the intent since it results in a signedness
issue which leads to an out-of-bounds.

A quick grep shows that the same issue can potentially happen in
multiple places across the kernel:

net/ipv6/ip6_output.c:1655: __skb_pull(skb, skb_network_offset(skb));
net/packet/af_packet.c:2043: skb_pull(skb, skb_network_offset(skb));
net/packet/af_packet.c:2165: skb_pull(skb, skb_network_offset(skb));
net/core/neighbour.c:1301: __skb_pull(skb, skb_network_offset(skb));
net/core/neighbour.c:1331: __skb_pull(skb, skb_network_offset(skb));
net/core/dev.c:3157: __skb_pull(skb, skb_network_offset(skb));
net/sched/sch_teql.c:337: __skb_pull(skb, skb_network_offset(skb));
net/sched/sch_atm.c:479: skb_pull(skb, skb_network_offset(skb));
net/ipv4/ip_output.c:1385: __skb_pull(skb, skb_network_offset(skb));
net/ipv4/ip_fragment.c:391: if (!pskb_pull(skb, skb_network_offset(skb) + ihl))
drivers/net/vxlan.c:1440: __skb_pull(reply, skb_network_offset(reply));
drivers/net/vxlan.c:1902: __skb_pull(skb, skb_network_offset(skb));
drivers/net/vrf.c:220: __skb_pull(skb, skb_network_offset(skb));
drivers/net/vrf.c:314: __skb_pull(skb, skb_network_offset(skb));

A similar thing also happened to somebody else (on a receive path!):
https://forums.grsecurity.net/viewtopic.php?f=3&t=4550

Does it make sense to check skb_network_offset() before passing it to
skb_pull() everywhere?

>
> This definitely was assumed by commit e1f165032c8bade authors !
>
> I guess they were using a 32bit kernel for their tests.
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply

* [PATCH v2 net-next 4/4] net: phy: Fix the mdix_ctrl changes
From: Raju Lakkaraju @ 2016-11-29  9:46 UTC (permalink / raw)
  To: netdev; +Cc: f.fainelli, Allan.Nielsen, andrew, Raju Lakkaraju
In-Reply-To: <1480412809-6122-1-git-send-email-Raju.Lakkaraju@microsemi.com>

From: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>

PHY drivers to have an eth_tp_mdix_ctrl to indicate what is the configured
MDI setting, and read eth_tp_mdi to indicate what is the current status,

Add new parameter mdix_ctrl in phy_device structure and fix driver.

Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>
---
 drivers/net/phy/marvell.c   | 4 ++--
 drivers/net/phy/microchip.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index fa31f50..e269262 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -268,7 +268,7 @@ static int marvell_config_aneg(struct phy_device *phydev)
 	if (err < 0)
 		return err;
 
-	err = marvell_set_polarity(phydev, phydev->mdix);
+	err = marvell_set_polarity(phydev, phydev->mdix_ctrl);
 	if (err < 0)
 		return err;
 
@@ -311,7 +311,7 @@ static int m88e1111_config_aneg(struct phy_device *phydev)
 	 */
 	err = phy_write(phydev, MII_BMCR, BMCR_RESET);
 
-	err = marvell_set_polarity(phydev, phydev->mdix);
+	err = marvell_set_polarity(phydev, phydev->mdix_ctrl);
 	if (err < 0)
 		return err;
 
diff --git a/drivers/net/phy/microchip.c b/drivers/net/phy/microchip.c
index eb4db22..12825a5 100644
--- a/drivers/net/phy/microchip.c
+++ b/drivers/net/phy/microchip.c
@@ -111,7 +111,7 @@ static void lan88xx_set_mdix(struct phy_device *phydev)
 	int buf;
 	int val;
 
-	switch (phydev->mdix) {
+	switch (phydev->mdix_ctrl) {
 	case ETH_TP_MDI:
 		val = LAN88XX_EXT_MODE_CTRL_MDI_;
 		break;
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH v3 net-next 2/6] net: mvneta: Use cacheable memory to store the rx buffer virtual address
From: Gregory CLEMENT @ 2016-11-29 10:19 UTC (permalink / raw)
  To: Marcin Wojtas
  Cc: David S. Miller, linux-kernel, netdev, Jisheng Zhang,
	Arnd Bergmann, Jason Cooper, Andrew Lunn, Sebastian Hesselbarth,
	Thomas Petazzoni, linux-arm-kernel@lists.infradead.org,
	Nadav Haklai, Dmitri Epshtein, Yelena Krivosheev
In-Reply-To: <CAPv3WKeFXJA1RjFZ5VoRfBtU+QGUx+AT3LYuoMHamQtNWCCMVA@mail.gmail.com>

Hi Marcin,
 
 On mar., nov. 29 2016, Marcin Wojtas <mw@semihalf.com> wrote:

> Hi Gregory,
>
> Another remark below, sorry for noise.
>
> 2016-11-29 10:37 GMT+01:00 Gregory CLEMENT <gregory.clement@free-electrons.com>:
>> Until now the virtual address of the received buffer were stored in the
>> cookie field of the rx descriptor. However, this field is 32-bits only
>> which prevents to use the driver on a 64-bits architecture.
>>
>> With this patch the virtual address is stored in an array not shared with
>> the hardware (no more need to use the DMA API). Thanks to this, it is
>> possible to use cache contrary to the access of the rx descriptor member.
>>
>> The change is done in the swbm path only because the hwbm uses the cookie
>> field, this also means that currently the hwbm is not usable in 64-bits.
>>
>> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
>> ---
>>  drivers/net/ethernet/marvell/mvneta.c | 93 ++++++++++++++++++++++++----
>>  1 file changed, 81 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
>> index 1b84f746d748..32b142d0e44e 100644
>> --- a/drivers/net/ethernet/marvell/mvneta.c
>> +++ b/drivers/net/ethernet/marvell/mvneta.c
>> @@ -561,6 +561,9 @@ struct mvneta_rx_queue {
>>         u32 pkts_coal;
>>         u32 time_coal;
>>
>> +       /* Virtual address of the RX buffer */
>> +       void  **buf_virt_addr;
>> +
>>         /* Virtual address of the RX DMA descriptors array */
>>         struct mvneta_rx_desc *descs;
>>
>> @@ -1573,10 +1576,14 @@ static void mvneta_tx_done_pkts_coal_set(struct mvneta_port *pp,
>>
>>  /* Handle rx descriptor fill by setting buf_cookie and buf_phys_addr */
>>  static void mvneta_rx_desc_fill(struct mvneta_rx_desc *rx_desc,
>> -                               u32 phys_addr, u32 cookie)
>> +                               u32 phys_addr, void *virt_addr,
>> +                               struct mvneta_rx_queue *rxq)
>>  {
>> -       rx_desc->buf_cookie = cookie;
>> +       int i;
>> +
>>         rx_desc->buf_phys_addr = phys_addr;
>> +       i = rx_desc - rxq->descs;
>> +       rxq->buf_virt_addr[i] = virt_addr;
>>  }
>>
>>  /* Decrement sent descriptors counter */
>> @@ -1781,7 +1788,8 @@ EXPORT_SYMBOL_GPL(mvneta_frag_free);
>>
>>  /* Refill processing for SW buffer management */
>>  static int mvneta_rx_refill(struct mvneta_port *pp,
>> -                           struct mvneta_rx_desc *rx_desc)
>> +                           struct mvneta_rx_desc *rx_desc,
>> +                           struct mvneta_rx_queue *rxq)
>>
>>  {
>>         dma_addr_t phys_addr;
>> @@ -1799,7 +1807,7 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
>>                 return -ENOMEM;
>>         }
>>
>> -       mvneta_rx_desc_fill(rx_desc, phys_addr, (u32)data);
>> +       mvneta_rx_desc_fill(rx_desc, phys_addr, data, rxq);
>>         return 0;
>>  }
>>
>> @@ -1861,7 +1869,12 @@ static void mvneta_rxq_drop_pkts(struct mvneta_port *pp,
>>
>>         for (i = 0; i < rxq->size; i++) {
>>                 struct mvneta_rx_desc *rx_desc = rxq->descs + i;
>> -               void *data = (void *)rx_desc->buf_cookie;
>> +               void *data;
>> +
>> +               if (!pp->bm_priv)
>> +                       data = rxq->buf_virt_addr[i];
>> +               else
>> +                       data = (void *)(uintptr_t)rx_desc->buf_cookie;
>
> Dropping packets for HWBM (in fact returning dropped buffers to the
> pool) is done a couple of lines above. This point will never be

indeed I changed the code at every place the buf_cookie was used and
missed the fact that for HWBM this code was never reached.

> reached with HWBM enabled (and it's also incorrect).

What is incorrect?

Gregory


>
> Best regards,
> Marcin

-- 
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com

^ permalink raw reply

* [PATCH v2 net-next 0/4] Adding PHY MDI(X) support
From: Raju Lakkaraju @ 2016-11-29  9:46 UTC (permalink / raw)
  To: netdev; +Cc: f.fainelli, Allan.Nielsen, andrew, Raju Lakkaraju

Hi All,

I updated all review comments which were given by Andrew and Florian.

This series add support for PHY MDI(X), and implement it for MSCC phys.

Tested on Beaglebone Black with VSC 8531 PHY.

Please review.

Thanks and regards
Raju.

---
Change set:
v1:
- Initial patch submit the WoL and MDI-X in single set of patches

v2:
- Split the mdi(x) as signal set of patches.
- Remove the out_unlock as suggested by Andrew.
- Add mdix_ctrl parameter in "phy_device" to handle the user configure
  mdi(x). Proposed implementation accepted by Florian.
- phydev->mdix_ctrl initialize with ETH_TP_MDI_AUTO. Ethernet controller
  never initialize this parameter. 
- Fix the mdix changes in marvell and microchip driver.

Raju Lakkaraju (4):
  net: phy: add mdix_ctrl to hold the user configuration.
  net: phy: update the mdix_ctrl with correct value.
  net: phy: Add mdi(x) support in Microsemi PHYs driver
  net: phy: Fix the mdix_ctrl changes

 drivers/net/phy/marvell.c   |   4 +-
 drivers/net/phy/microchip.c |   2 +-
 drivers/net/phy/mscc.c      | 105 ++++++++++++++++++++++++++++++++++++++++----
 drivers/net/phy/phy.c       |  10 +++--
 include/linux/phy.h         |   1 +
 5 files changed, 107 insertions(+), 15 deletions(-)

-- 
2.7.4

^ permalink raw reply

* Re: [PATCH v3 net-next 2/6] net: mvneta: Use cacheable memory to store the rx buffer virtual address
From: Gregory CLEMENT @ 2016-11-29 10:17 UTC (permalink / raw)
  To: Marcin Wojtas
  Cc: David S. Miller, linux-kernel, netdev, Jisheng Zhang,
	Arnd Bergmann, Jason Cooper, Andrew Lunn, Sebastian Hesselbarth,
	Thomas Petazzoni, linux-arm-kernel@lists.infradead.org,
	Nadav Haklai, Dmitri Epshtein, Yelena Krivosheev
In-Reply-To: <CAPv3WKcn27QsqLamzS2QV-RF1GD3bgsYCYD-2q9tpTtqF+eiWQ@mail.gmail.com>

Hi Marcin,
 
 On mar., nov. 29 2016, Marcin Wojtas <mw@semihalf.com> wrote:

> Hi Gregory,
>
> Apparently HWBM had a mistake in implementation, please see below.
>
> 2016-11-29 10:37 GMT+01:00 Gregory CLEMENT <gregory.clement@free-electrons.com>:
>> Until now the virtual address of the received buffer were stored in the
>> cookie field of the rx descriptor. However, this field is 32-bits only
>> which prevents to use the driver on a 64-bits architecture.
>>
>> With this patch the virtual address is stored in an array not shared with
>> the hardware (no more need to use the DMA API). Thanks to this, it is
>> possible to use cache contrary to the access of the rx descriptor member.
>>
>> The change is done in the swbm path only because the hwbm uses the cookie
>> field, this also means that currently the hwbm is not usable in 64-bits.
>>
>> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
>> ---
>>  drivers/net/ethernet/marvell/mvneta.c | 93 ++++++++++++++++++++++++----
>>  1 file changed, 81 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
>> index 1b84f746d748..32b142d0e44e 100644
>> --- a/drivers/net/ethernet/marvell/mvneta.c
>> +++ b/drivers/net/ethernet/marvell/mvneta.c
>> @@ -561,6 +561,9 @@ struct mvneta_rx_queue {
>>         u32 pkts_coal;
>>         u32 time_coal;
>>
>> +       /* Virtual address of the RX buffer */
>> +       void  **buf_virt_addr;
>> +
>>         /* Virtual address of the RX DMA descriptors array */
>>         struct mvneta_rx_desc *descs;
>>
>> @@ -1573,10 +1576,14 @@ static void mvneta_tx_done_pkts_coal_set(struct mvneta_port *pp,
>>
>>  /* Handle rx descriptor fill by setting buf_cookie and buf_phys_addr */
>>  static void mvneta_rx_desc_fill(struct mvneta_rx_desc *rx_desc,
>> -                               u32 phys_addr, u32 cookie)
>> +                               u32 phys_addr, void *virt_addr,
>> +                               struct mvneta_rx_queue *rxq)
>>  {
>> -       rx_desc->buf_cookie = cookie;
>> +       int i;
>> +
>>         rx_desc->buf_phys_addr = phys_addr;
>> +       i = rx_desc - rxq->descs;
>> +       rxq->buf_virt_addr[i] = virt_addr;
>>  }
>>
>>  /* Decrement sent descriptors counter */
>> @@ -1781,7 +1788,8 @@ EXPORT_SYMBOL_GPL(mvneta_frag_free);
>>
>>  /* Refill processing for SW buffer management */
>>  static int mvneta_rx_refill(struct mvneta_port *pp,
>> -                           struct mvneta_rx_desc *rx_desc)
>> +                           struct mvneta_rx_desc *rx_desc,
>> +                           struct mvneta_rx_queue *rxq)
>>
>>  {
>>         dma_addr_t phys_addr;
>> @@ -1799,7 +1807,7 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
>>                 return -ENOMEM;
>>         }
>>
>> -       mvneta_rx_desc_fill(rx_desc, phys_addr, (u32)data);
>> +       mvneta_rx_desc_fill(rx_desc, phys_addr, data, rxq);
>>         return 0;
>>  }
>>
>> @@ -1861,7 +1869,12 @@ static void mvneta_rxq_drop_pkts(struct mvneta_port *pp,
>>
>>         for (i = 0; i < rxq->size; i++) {
>>                 struct mvneta_rx_desc *rx_desc = rxq->descs + i;
>> -               void *data = (void *)rx_desc->buf_cookie;
>> +               void *data;
>> +
>> +               if (!pp->bm_priv)
>> +                       data = rxq->buf_virt_addr[i];
>> +               else
>> +                       data = (void *)(uintptr_t)rx_desc->buf_cookie;
>>
>>                 dma_unmap_single(pp->dev->dev.parent, rx_desc->buf_phys_addr,
>>                                  MVNETA_RX_BUF_SIZE(pp->pkt_size), DMA_FROM_DEVICE);
>> @@ -1894,12 +1907,13 @@ static int mvneta_rx_swbm(struct mvneta_port *pp, int rx_todo,
>>                 unsigned char *data;
>>                 dma_addr_t phys_addr;
>>                 u32 rx_status, frag_size;
>> -               int rx_bytes, err;
>> +               int rx_bytes, err, index;
>>
>>                 rx_done++;
>>                 rx_status = rx_desc->status;
>>                 rx_bytes = rx_desc->data_size - (ETH_FCS_LEN + MVNETA_MH_SIZE);
>> -               data = (unsigned char *)rx_desc->buf_cookie;
>> +               index = rx_desc - rxq->descs;
>> +               data = (unsigned char *)rxq->buf_virt_addr[index];
>>                 phys_addr = rx_desc->buf_phys_addr;
>>
>>                 if (!mvneta_rxq_desc_is_first_last(rx_status) ||
>> @@ -1938,7 +1952,7 @@ static int mvneta_rx_swbm(struct mvneta_port *pp, int rx_todo,
>>                 }
>>
>>                 /* Refill processing */
>> -               err = mvneta_rx_refill(pp, rx_desc);
>> +               err = mvneta_rx_refill(pp, rx_desc, rxq);
>>                 if (err) {
>>                         netdev_err(dev, "Linux processing - Can't refill\n");
>>                         rxq->missed++;
>> @@ -2020,7 +2034,7 @@ static int mvneta_rx_hwbm(struct mvneta_port *pp, int rx_todo,
>>                 rx_done++;
>>                 rx_status = rx_desc->status;
>>                 rx_bytes = rx_desc->data_size - (ETH_FCS_LEN + MVNETA_MH_SIZE);
>> -               data = (unsigned char *)rx_desc->buf_cookie;
>> +               data = (u8 *)(uintptr_t)rx_desc->buf_cookie;
>>                 phys_addr = rx_desc->buf_phys_addr;
>>                 pool_id = MVNETA_RX_GET_BM_POOL_ID(rx_desc);
>>                 bm_pool = &pp->bm_priv->bm_pools[pool_id];
>> @@ -2708,6 +2722,56 @@ static int mvneta_poll(struct napi_struct *napi, int budget)
>>         return rx_done;
>>  }
>>
>> +/* Refill processing for HW buffer management */
>> +static int mvneta_rx_hwbm_refill(struct mvneta_port *pp,
>> +                                struct mvneta_rx_desc *rx_desc)
>> +
>> +{
>> +       dma_addr_t phys_addr;
>> +       void *data;
>> +
>> +       data = mvneta_frag_alloc(pp->frag_size);
>> +       if (!data)
>> +               return -ENOMEM;
>> +
>> +       phys_addr = dma_map_single(pp->dev->dev.parent, data,
>> +                                  MVNETA_RX_BUF_SIZE(pp->pkt_size),
>> +                                  DMA_FROM_DEVICE);
>> +       if (unlikely(dma_mapping_error(pp->dev->dev.parent, phys_addr))) {
>> +               mvneta_frag_free(pp->frag_size, data);
>> +               return -ENOMEM;
>> +       }
>> +
>> +       rx_desc->buf_phys_addr = phys_addr;
>> +       rx_desc->buf_cookie = (uintptr_t)data;
>> +
>> +       return 0;
>> +}
>> +
>> +/* Handle rxq fill: allocates rxq skbs; called when initializing a port */
>> +static int mvneta_rxq_bm_fill(struct mvneta_port *pp,
>> +                             struct mvneta_rx_queue *rxq,
>> +                             int num)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < num; i++) {
>> +               memset(rxq->descs + i, 0, sizeof(struct mvneta_rx_desc));
>> +               if (mvneta_rx_hwbm_refill(pp, rxq->descs + i) != 0) {
>> +                       netdev_err(pp->dev, "%s:rxq %d, %d of %d buffs  filled\n",
>> +                                  __func__, rxq->id, i, num);
>> +                       break;
>> +               }
>> +       }
>> +
>> +       /* Add this number of RX descriptors as non occupied (ready to
>> +        * get packets)
>> +        */
>> +       mvneta_rxq_non_occup_desc_add(pp, rxq, i);
>> +
>> +       return i;
>> +}
>> +
>>  /* Handle rxq fill: allocates rxq skbs; called when initializing a port */
>>  static int mvneta_rxq_fill(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
>>                            int num)
>> @@ -2716,7 +2780,7 @@ static int mvneta_rxq_fill(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
>>
>>         for (i = 0; i < num; i++) {
>>                 memset(rxq->descs + i, 0, sizeof(struct mvneta_rx_desc));
>> -               if (mvneta_rx_refill(pp, rxq->descs + i) != 0) {
>> +               if (mvneta_rx_refill(pp, rxq->descs + i, rxq) != 0) {
>>                         netdev_err(pp->dev, "%s:rxq %d, %d of %d buffs  filled\n",
>>                                 __func__, rxq->id, i, num);
>>                         break;
>> @@ -2784,14 +2848,14 @@ static int mvneta_rxq_init(struct mvneta_port *pp,
>>                 mvneta_rxq_buf_size_set(pp, rxq,
>>                                         MVNETA_RX_BUF_SIZE(pp->pkt_size));
>>                 mvneta_rxq_bm_disable(pp, rxq);
>> +               mvneta_rxq_fill(pp, rxq, rxq->size);
>>         } else {
>>                 mvneta_rxq_bm_enable(pp, rxq);
>>                 mvneta_rxq_long_pool_set(pp, rxq);
>>                 mvneta_rxq_short_pool_set(pp, rxq);
>> +               mvneta_rxq_bm_fill(pp, rxq, rxq->size);
>
> Manual filling descriptors with new buffers is redundant. For HWBM,
> all buffers are allocated in mvneta_bm_construct() and in runtime they
> are put into descriptors by hardware. I think it's enough to add here:
> mvneta_rxq_non_occup_desc_add(pp, rxq, rxq->size);
>
> And remove mvneta_rxq_bm_fill and mvneta_rx_hwbm_refill.

You're right. I will do it it will simplify the code.

Thanks,

Gregory

>
> Best regards,
> Marcin

-- 
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com

^ permalink raw reply

* Re: [PATCH v2 10/13] net: ethernet: ti: cpts: drop excessive writes to CTRL and INT_EN regs
From: Richard Cochran @ 2016-11-29 10:14 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: David S. Miller, netdev-u79uwXL29TY76Z2rM5mHXA, Mugunthan V N,
	Sekhar Nori, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Rob Herring,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Murali Karicheri, Wingman Kwok
In-Reply-To: <20161128230337.6731-11-grygorii.strashko-l0cyMroinI0@public.gmane.org>

On Mon, Nov 28, 2016 at 05:03:34PM -0600, Grygorii Strashko wrote:
> CPTS module and IRQs are always enabled when CPTS is registered,
> before starting overflow check work, and disabled during
> deregistration, when overflow check work has been canceled already.
> So, It doesn't require to (re)enable CPTS module and IRQs in
> cpts_overflow_check().
> 
> Signed-off-by: Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org>

Acked-by: Richard Cochran <richardcochran-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH  v2 09/13] net: ethernet: ti: cpts: clean up event list if event pool is empty
From: Richard Cochran @ 2016-11-29 10:13 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: David S. Miller, netdev, Mugunthan V N, Sekhar Nori, linux-kernel,
	linux-omap, Rob Herring, devicetree, Murali Karicheri,
	Wingman Kwok
In-Reply-To: <20161128230337.6731-10-grygorii.strashko@ti.com>

On Mon, Nov 28, 2016 at 05:03:33PM -0600, Grygorii Strashko wrote:
> From: WingMan Kwok <w-kwok2@ti.com>
> 
> When a CPTS user does not exit gracefully by disabling cpts
> timestamping and leaving a joined multicast group, the system
> continues to receive and timestamps the ptp packets which eventually
> occupy all the event list entries.  When this happns, the added code
> tries to remove some list entries which are expired.
> 
> Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>

This patch belongs earlier in the series, before the re-structuring.
It doesn't depend on the others, AFAICT.

Thanks,
Richard

^ permalink raw reply

* Re: [PATCH  v2 08/13] net: ethernet: ti: cpts: move dt props parsing to cpts driver
From: Richard Cochran @ 2016-11-29 10:11 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: David S. Miller, netdev, Mugunthan V N, Sekhar Nori, linux-kernel,
	linux-omap, Rob Herring, devicetree, Murali Karicheri,
	Wingman Kwok
In-Reply-To: <20161128230337.6731-9-grygorii.strashko@ti.com>

On Mon, Nov 28, 2016 at 05:03:32PM -0600, Grygorii Strashko wrote:
> +static int cpts_of_parse(struct cpts *cpts, struct device_node *node)
> +{
> +	int ret = -EINVAL;
> +	u32 prop;
> +
> +	if (of_property_read_u32(node, "cpts_clock_mult", &prop))
> +		goto  of_error;
> +	cpts->cc_mult = prop;

Why not set cc.mult here at the same time?

> +
> +	if (of_property_read_u32(node, "cpts_clock_shift", &prop))
> +		goto  of_error;
> +	cpts->cc.shift = prop;
> +
> +	return 0;
> +
> +of_error:
> +	dev_err(cpts->dev, "CPTS: Missing property in the DT.\n");
> +	return ret;
> +}

Thanks,
Richard

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox