Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 3/5 net-next] inet: don't check for bind conflicts twice when searching for a port
From: Hannes Frederic Sowa @ 2016-12-21 15:08 UTC (permalink / raw)
  To: Josef Bacik, davem, kraigatgoog, eric.dumazet, tom, netdev,
	kernel-team
In-Reply-To: <1482264424-15439-4-git-send-email-jbacik@fb.com>

On Tue, 2016-12-20 at 15:07 -0500, Josef Bacik wrote:
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -92,7 +92,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
>  {
>  	bool reuse = sk->sk_reuse && sk->sk_state != TCP_LISTEN;
>  	struct inet_hashinfo *hinfo = sk->sk_prot->h.hashinfo;
> -	int ret = 1, attempts = 5, port = snum;
> +	int ret = 1, port = snum;
>  	struct inet_bind_hashbucket *head;
>  	struct net *net = sock_net(sk);
>  	int i, low, high, attempt_half;
> @@ -100,6 +100,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
>  	kuid_t uid = sock_i_uid(sk);
>  	u32 remaining, offset;
>  	bool reuseport_ok = !!snum;
> +	bool empty_tb = true;
>  
>  	if (port) {
>  		head = &hinfo->bhash[inet_bhashfn(net, port,
> @@ -111,7 +112,6 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
>  
>  		goto tb_not_found;
>  	}
> -again:
>  	attempt_half = (sk->sk_reuse == SK_CAN_REUSE) ? 1 : 0;
>  other_half_scan:
>  	inet_get_local_port_range(net, &low, &high);
> @@ -148,8 +148,12 @@ other_parity_scan:
>  		spin_lock_bh(&head->lock);
>  		inet_bind_bucket_for_each(tb, &head->chain)
>  			if (net_eq(ib_net(tb), net) && tb->port == port) {
> -				if (!inet_csk_bind_conflict(sk, tb, false, reuseport_ok))
> -					goto tb_found;
> +				if (hlist_empty(&tb->owners))
> +					goto success;
> +				if (!inet_csk_bind_conflict(sk, tb, false, reuseport_ok)) {
> +					empty_tb = false;
> +					goto success;
> +				}
>  				goto next_port;
>  			}
>  		goto tb_not_found;
> @@ -184,23 +188,12 @@ tb_found:
>  		      !rcu_access_pointer(sk->sk_reuseport_cb) &&
>  		      sk->sk_reuseport && uid_eq(tb->fastuid, uid)))
>  			goto success;
> -		if (inet_csk_bind_conflict(sk, tb, true, reuseport_ok)) {
> -			if ((reuse ||
> -			     (tb->fastreuseport > 0 &&
> -			      sk->sk_reuseport &&
> -			      !rcu_access_pointer(sk->sk_reuseport_cb) &&
> -			      uid_eq(tb->fastuid, uid))) && !snum &&
> -			    --attempts >= 0) {
> -				spin_unlock_bh(&head->lock);
> -				goto again;
> -			}
> +		if (inet_csk_bind_conflict(sk, tb, true, reuseport_ok))
>  			goto fail_unlock;
> -		}
> -		if (!reuse)
> -			tb->fastreuse = 0;
> -		if (!sk->sk_reuseport || !uid_eq(tb->fastuid, uid))
> -			tb->fastreuseport = 0;
> -	} else {
> +		empty_tb = false;
> +	}
> +success:
> +	if (empty_tb) {


I would fine it even more simple to read, if you redo the hlist_empty
check here instead of someone has to review all the paths where this
might get set. hlist_empty is a very quick test.


Thanks,
Hannes

^ permalink raw reply

* Re: [PATCH 1/5 net-next] inet: replace ->bind_conflict with ->rcv_saddr_equal
From: Hannes Frederic Sowa @ 2016-12-21 15:06 UTC (permalink / raw)
  To: Josef Bacik, davem, kraigatgoog, eric.dumazet, tom, netdev,
	kernel-team
In-Reply-To: <1482264424-15439-2-git-send-email-jbacik@fb.com>

On Tue, 2016-12-20 at 15:07 -0500, Josef Bacik wrote:
> The only difference between inet6_csk_bind_conflict and inet_csk_bind_conflict
> is how they check the rcv_saddr.  Since we want to be able to check the saddr in
> other places just drop the protocol specific ->bind_conflict and replace it with
> ->rcv_saddr_equal, then make inet_csk_bind_conflict the one true bind conflict
> function.
> 
> Signed-off-by: Josef Bacik <jbacik@fb.com>
> 



> ---
>  include/net/inet6_connection_sock.h |  5 -----
>  include/net/inet_connection_sock.h  |  9 +++------
>  net/dccp/ipv4.c                     |  3 ++-
>  net/dccp/ipv6.c                     |  2 +-
>  net/ipv4/inet_connection_sock.c     | 22 +++++++-------------
>  net/ipv4/tcp_ipv4.c                 |  3 ++-
>  net/ipv4/udp.c                      |  1 +
>  net/ipv6/inet6_connection_sock.c    | 40 -------------------------------------
>  net/ipv6/tcp_ipv6.c                 |  4 ++--
>  9 files changed, 18 insertions(+), 71 deletions(-)
> 
> diff --git a/include/net/inet6_connection_sock.h b/include/net/inet6_connection_sock.h
> index 3212b39..8ec87b6 100644
> --- a/include/net/inet6_connection_sock.h
> +++ b/include/net/inet6_connection_sock.h
> @@ -15,16 +15,11 @@
>  
>  #include <linux/types.h>
>  
> -struct inet_bind_bucket;
>  struct request_sock;
>  struct sk_buff;
>  struct sock;
>  struct sockaddr;
>  
> -int inet6_csk_bind_conflict(const struct sock *sk,
> -			    const struct inet_bind_bucket *tb, bool relax,
> -			    bool soreuseport_ok);
> -
>  struct dst_entry *inet6_csk_route_req(const struct sock *sk, struct flowi6 *fl6,
>  				      const struct request_sock *req, u8 proto);
>  
> diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
> index ec0479a..9cd43c5 100644
> --- a/include/net/inet_connection_sock.h
> +++ b/include/net/inet_connection_sock.h
> @@ -62,9 +62,9 @@ struct inet_connection_sock_af_ops {
>  				char __user *optval, int __user *optlen);
>  #endif
>  	void	    (*addr2sockaddr)(struct sock *sk, struct sockaddr *);
> -	int	    (*bind_conflict)(const struct sock *sk,
> -				     const struct inet_bind_bucket *tb,
> -				     bool relax, bool soreuseport_ok);
> +	int         (*rcv_saddr_equal)(const struct sock *sk1,
> +				       const struct sock *sk2,
> +				       bool match_wildcard);
>  	void	    (*mtu_reduced)(struct sock *sk);
>  };
>  
> 

The patch looks as a nice code cleanup already!

Have you looked if we can simply have one rcv_saddr_equal for both ipv4
and ipv6 that e.g. uses sk->sk_family instead of function pointers?
This could give us even more possibilities to remove some indirect
functions calls and thus might relieve some cycles?

Thanks,
Hannes

^ permalink raw reply

* RE: [PATCH][V2] qed: fix memory leak of a qed_spq_entry on error failure paths
From: Mintz, Yuval @ 2016-12-21 13:29 UTC (permalink / raw)
  To: Colin King, netdev@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, Elior, Ariel, Tayar, Tomer
In-Reply-To: <20161220114424.11244-1-colin.king@canonical.com>

> From: Colin Ian King <colin.king@canonical.com>
> 
> A qed_spq_entry entry is allocated by qed_sp_init_request but is not kfree'd
> if an error occurs, causing a memory leak. Fix this by returning the previously
> allocated spq entry and also setting *pp_ent to NULL to be safe.
> 
> Thanks to Yuval Mintz for suggestions on how to improve my original fix.
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

We've given it a more thorough look, and apparently this isn't the correct fix.
So I'll start by saying sorry for making you send this V2 needlessly.

It boils down to the fact there are two kinds of SPQ entries -
Those originating from the 'free_pool' and those from the 'unlimited_pending'.
Only those originating from the free_pool should be returned
using the qed_spq_return_entry(), as only those actually point to a valid
dma-mapped memory where FW expects to find the entries;
Returning the other kind would lead to assertions later,
as driver would post a ramrod to FW which actually points to address 0.

Looking at the error flows, it seems possible this isn't the only faulty
error flow in the SPQ. I suggest you'd drop this and we'll take it from
here [although if you really have the urge to continue - please do].

Thanks,
Yuval



^ permalink raw reply

* Re: HalfSipHash Acceptable Usage
From: Jason A. Donenfeld @ 2016-12-21 14:42 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: George Spelvin, Theodore Ts'o, Andi Kleen, David Miller,
	David Laight, Daniel J . Bernstein, Eric Biggers,
	Hannes Frederic Sowa, Jean-Philippe Aumasson, kernel-hardening,
	Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
	Tom Herbert, Linus Torvalds, Vegard Nossum
In-Reply-To: <1482298164.8944.8.camel@edumazet-glaptop3.roam.corp.google.com>

Hi Eric,

I computed performance numbers for both 32-bit and 64-bit using the
actual functions in which talking about replacing MD5 with SipHash.
The basic harness is here [1] if you're curious. SipHash was a pretty
clear winner for both cases.

x86_64:
[    1.714302] secure_tcpv6_sequence_number_md5# cycles: 102373398
[    1.747685] secure_tcp_sequence_number_md5# cycles: 92042258
[    1.773522] secure_tcpv6_sequence_number_siphash# cycles: 70786533
[    1.798701] secure_tcp_sequence_number_siphash# cycles: 68941043

x86:
[    1.635749] secure_tcpv6_sequence_number_md5# cycles: 106016335
[    1.670259] secure_tcp_sequence_number_md5# cycles: 95670512
[    1.708387] secure_tcpv6_sequence_number_siphash# cycles: 105988635
[    1.740264] secure_tcp_sequence_number_siphash# cycles: 88225395

>>> 102373398 > 70786533
True
>>> 92042258 > 68941043
True
>>> 106016335 > 105988635
True
>>> 95670512 > 88225395
True

While MD5 is probably faster for some kind of large-data
cycles-per-byte, due to its 64-byte internal state, SipHash -- the
"Sip" part standing "Short Input PRF" -- is fast for shorter inputs.
In practice with the functions we're talking about replacing, there's
no need to hash 64-bytes. So, SipHash comes out faster and more
secure.

I also haven't begun to look focusedly at the assembly my SipHash
implemention is generating, which means there's still window for even
more performance improvements.

Jason


[1] https://git.zx2c4.com/linux-dev/tree/net/core/secure_seq.c?h=siphash-bench#n194

^ permalink raw reply

* Re: [PATCH 2/2] net: sfc: falcon: use new api ethtool_{get|set}_link_ksettings
From: Bert Kenward @ 2016-12-21 14:38 UTC (permalink / raw)
  To: Philippe Reynes, linux-net-drivers, ecree, davem, andrew
  Cc: netdev, linux-kernel
In-Reply-To: <1482272667-1206-2-git-send-email-tremyfr@gmail.com>

On 20/12/16 22:24, Philippe Reynes wrote:
> The ethtool api {get|set}_settings is deprecated.
> We move this driver to new api {get|set}_link_ksettings.
> 
> Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
> ---
>  drivers/net/ethernet/sfc/falcon/efx.c          |    2 +-
>  drivers/net/ethernet/sfc/falcon/ethtool.c      |   35 ++++++++++++-------
>  drivers/net/ethernet/sfc/falcon/mdio_10g.c     |   44 +++++++++++++++---------
>  drivers/net/ethernet/sfc/falcon/mdio_10g.h     |    3 +-
>  drivers/net/ethernet/sfc/falcon/net_driver.h   |   12 +++---
>  drivers/net/ethernet/sfc/falcon/qt202x_phy.c   |    9 +++--
>  drivers/net/ethernet/sfc/falcon/tenxpress.c    |   22 ++++++------
>  drivers/net/ethernet/sfc/falcon/txc43128_phy.c |    9 +++--
>  8 files changed, 80 insertions(+), 56 deletions(-)
> 
> diff --git a/drivers/net/ethernet/sfc/falcon/efx.c b/drivers/net/ethernet/sfc/falcon/efx.c
> index 5c5cb3c..438ef9e 100644
> --- a/drivers/net/ethernet/sfc/falcon/efx.c
> +++ b/drivers/net/ethernet/sfc/falcon/efx.c
> @@ -986,7 +986,7 @@ void ef4_mac_reconfigure(struct ef4_nic *efx)
>  
>  /* Push loopback/power/transmit disable settings to the PHY, and reconfigure
>   * the MAC appropriately. All other PHY configuration changes are pushed
> - * through phy_op->set_settings(), and pushed asynchronously to the MAC
> + * through phy_op->set_link_ksettings(), and pushed asynchronously to the MAC
>   * through ef4_monitor().
>   *
>   * Callers must hold the mac_lock
> diff --git a/drivers/net/ethernet/sfc/falcon/ethtool.c b/drivers/net/ethernet/sfc/falcon/ethtool.c
> index 8e1929b..659ece7 100644
> --- a/drivers/net/ethernet/sfc/falcon/ethtool.c
> +++ b/drivers/net/ethernet/sfc/falcon/ethtool.c
> @@ -115,44 +115,53 @@ static int ef4_ethtool_phys_id(struct net_device *net_dev,
>  }
>  
>  /* This must be called with rtnl_lock held. */
> -static int ef4_ethtool_get_settings(struct net_device *net_dev,
> -				    struct ethtool_cmd *ecmd)
> +static int
> +ef4_ethtool_get_link_ksettings(struct net_device *net_dev,
> +			       struct ethtool_link_ksettings *cmd)
>  {
>  	struct ef4_nic *efx = netdev_priv(net_dev);
>  	struct ef4_link_state *link_state = &efx->link_state;
> +	u32 supported;
> +
> +	ethtool_convert_link_mode_to_legacy_u32(&supported,
> +						cmd->link_modes.supported);
>  
>  	mutex_lock(&efx->mac_lock);
> -	efx->phy_op->get_settings(efx, ecmd);
> +	efx->phy_op->get_link_ksettings(efx, cmd);
>  	mutex_unlock(&efx->mac_lock);
>  
>  	/* Both MACs support pause frames (bidirectional and respond-only) */
> -	ecmd->supported |= SUPPORTED_Pause | SUPPORTED_Asym_Pause;
> +	supported |= SUPPORTED_Pause | SUPPORTED_Asym_Pause;
>  
>  	if (LOOPBACK_INTERNAL(efx)) {
> -		ethtool_cmd_speed_set(ecmd, link_state->speed);
> -		ecmd->duplex = link_state->fd ? DUPLEX_FULL : DUPLEX_HALF;
> +		cmd->base.speed = link_state->speed;
> +		cmd->base.duplex = link_state->fd ? DUPLEX_FULL : DUPLEX_HALF;
>  	}
>  
> +	ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.supported,
> +						supported);
> +
>  	return 0;
>  }

This translates cmd->link_modes.supported in to the legacy supported
bitmap before we've actually read the values from the phy. Given we're
only using the legacy bitmap so we can add the pause frame settings, I
suggest something like:

ef4_ethtool_get_link_ksettings(struct net_device *net_dev,
			       struct ethtool_link_ksettings *cmd)
{
	struct ef4_nic *efx = netdev_priv(net_dev);
	struct ef4_link_state *link_state = &efx->link_state;

	mutex_lock(&efx->mac_lock);
	efx->phy_op->get_link_ksettings(efx, cmd);
	mutex_unlock(&efx->mac_lock);

	/* Both MACs support pause frames (bidirectional and respond-only) */
	ethtool_link_ksettings_add_link_mode(cmd, supported, Pause);
	ethtool_link_ksettings_add_link_mode(cmd, supported, Asym_Pause);

	if (LOOPBACK_INTERNAL(efx)) {
		cmd->base.speed = link_state->speed;
		cmd->base.duplex = link_state->fd ? DUPLEX_FULL : DUPLEX_HALF;
	}

	return 0;
}


Thanks,

Bert.

^ permalink raw reply

* Re: [PATCH][V2] qed: fix memory leak of a qed_spq_entry on error failure paths
From: Colin Ian King @ 2016-12-21 14:28 UTC (permalink / raw)
  To: Mintz, Yuval, netdev@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, Elior, Ariel, Tayar, Tomer
In-Reply-To: <BL2PR07MB230615AAE3CF67DEBC6083778D930@BL2PR07MB2306.namprd07.prod.outlook.com>

On 21/12/16 13:29, Mintz, Yuval wrote:
>> From: Colin Ian King <colin.king@canonical.com>
>>
>> A qed_spq_entry entry is allocated by qed_sp_init_request but is not kfree'd
>> if an error occurs, causing a memory leak. Fix this by returning the previously
>> allocated spq entry and also setting *pp_ent to NULL to be safe.
>>
>> Thanks to Yuval Mintz for suggestions on how to improve my original fix.
>>
>> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> 
> We've given it a more thorough look, and apparently this isn't the correct fix.
> So I'll start by saying sorry for making you send this V2 needlessly.
> 
> It boils down to the fact there are two kinds of SPQ entries -
> Those originating from the 'free_pool' and those from the 'unlimited_pending'.
> Only those originating from the free_pool should be returned
> using the qed_spq_return_entry(), as only those actually point to a valid
> dma-mapped memory where FW expects to find the entries;
> Returning the other kind would lead to assertions later,
> as driver would post a ramrod to FW which actually points to address 0.
> 
> Looking at the error flows, it seems possible this isn't the only faulty
> error flow in the SPQ. I suggest you'd drop this and we'll take it from
> here [although if you really have the urge to continue - please do].
> 
> Thanks,
> Yuval
> 
> 
Sure, lets drop my fixes, I'm out of time on this for 2016 anyhow.

Colin

^ permalink raw reply

* Re: HalfSipHash Acceptable Usage
From: Jason A. Donenfeld @ 2016-12-21 14:24 UTC (permalink / raw)
  To: George Spelvin
  Cc: Eric Dumazet, Andi Kleen, David Miller, David Laight,
	Daniel J . Bernstein, Eric Biggers, Hannes Frederic Sowa,
	Jean-Philippe Aumasson, kernel-hardening,
	Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
	Tom Herbert, Linus Torvalds, Theodore Ts'o, Vegard Nossum
In-Reply-To: <20161221063412.6425.qmail@ns.sciencehorizons.net>

Hi George,

On Wed, Dec 21, 2016 at 7:34 AM, George Spelvin
<linux@sciencehorizons.net> wrote:
> In fact, I have an idea.  Allow me to make the following concrete
> suggestion for using HalfSipHash with 128 bits of key material:
>
> - 64 bits are used as the key.
> - The other 64 bits are used as an IV which is prepended to
>   the message to be hashed.
>
> As a matter of practical implementation, we precompute the effect
> of hashing the IV and store the 128-bit HalfSipHash state, which
> is used just like a 128-bit key.
>
> Because of the way it is constructed, it is obviously no weaker than
> standard HalfSipHash's 64-bit security claim.
>
> I don't know the security of this, and it's almost certainly weaker than
> 128 bits, but I *hope* it's at least a few bits stronger than 64 bits.
> 80 would be enough to dissuade any attacker without a six-figure budget
> (that's per attack, not a one-time capital investment).  96 would be
> ample for our purposes.
>
> What I do know is that it makes a brute-force attack without
> significant cryptanalytic effort impossible.

Depends who's doing the cryptanalytic effort I guess.

Please don't roll your own crypto. It's a dangerous road. Putting
homebrew crypto into the kernel would be an error. Let's stick with
the constructions and security margins that the cryptographers give
us. JP made that fairly clear, I thought.

There are already people working on this problem who undergo peer
review and a career devoted to solving these problems. One result for
small systems that need 128-bit security is Chaskey, which you can go
read about if you're curious.

Jason

^ permalink raw reply

* Re: [PATCHv2 net 1/2] sctp: reduce indent level in sctp_copy_local_addr_list
From: Neil Horman @ 2016-12-21 14:16 UTC (permalink / raw)
  To: Xin Long; +Cc: network dev, linux-sctp, davem, Marcelo Ricardo Leitner
In-Reply-To: <b4d6428c565955e9e81e3b4623a6458e7567e593.1482212764.git.lucien.xin@gmail.com>

On Tue, Dec 20, 2016 at 01:49:49PM +0800, Xin Long wrote:
> This patch is to reduce indent level by using continue when the addr
> is not allowed, and also drop end_copy by using break.
> 
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---
>  net/sctp/protocol.c | 37 +++++++++++++++++++------------------
>  1 file changed, 19 insertions(+), 18 deletions(-)
> 
> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> index 7b523e3..da5d82b 100644
> --- a/net/sctp/protocol.c
> +++ b/net/sctp/protocol.c
> @@ -205,26 +205,27 @@ int sctp_copy_local_addr_list(struct net *net, struct sctp_bind_addr *bp,
>  	list_for_each_entry_rcu(addr, &net->sctp.local_addr_list, list) {
>  		if (!addr->valid)
>  			continue;
> -		if (sctp_in_scope(net, &addr->a, scope)) {
> -			/* Now that the address is in scope, check to see if
> -			 * the address type is really supported by the local
> -			 * sock as well as the remote peer.
> -			 */
> -			if ((((AF_INET == addr->a.sa.sa_family) &&
> -			      (copy_flags & SCTP_ADDR4_PEERSUPP))) ||
> -			    (((AF_INET6 == addr->a.sa.sa_family) &&
> -			      (copy_flags & SCTP_ADDR6_ALLOWED) &&
> -			      (copy_flags & SCTP_ADDR6_PEERSUPP)))) {
> -				error = sctp_add_bind_addr(bp, &addr->a,
> -						    sizeof(addr->a),
> -						    SCTP_ADDR_SRC, GFP_ATOMIC);
> -				if (error)
> -					goto end_copy;
> -			}
> -		}
> +		if (!sctp_in_scope(net, &addr->a, scope))
> +			continue;
> +
> +		/* Now that the address is in scope, check to see if
> +		 * the address type is really supported by the local
> +		 * sock as well as the remote peer.
> +		 */
> +		if (addr->a.sa.sa_family == AF_INET &&
> +		    !(copy_flags & SCTP_ADDR4_PEERSUPP))
> +			continue;
> +		if (addr->a.sa.sa_family == AF_INET6 &&
> +		    (!(copy_flags & SCTP_ADDR6_ALLOWED) ||
> +		     !(copy_flags & SCTP_ADDR6_PEERSUPP)))
> +			continue;
> +
> +		error = sctp_add_bind_addr(bp, &addr->a, sizeof(addr->a),
> +					   SCTP_ADDR_SRC, GFP_ATOMIC);
> +		if (error)
> +			break;
>  	}
>  
> -end_copy:
>  	rcu_read_unlock();
>  	return error;
>  }
> -- 
> 2.1.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
Series
Acked-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply

* Re: [PATCH RFC net-next 3/7] sctp: add dst_pending_confirm flag
From: Neil Horman @ 2016-12-21 14:11 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: netdev, linux-sctp, YueHaibing
In-Reply-To: <1482094596-26046-4-git-send-email-ja@ssi.bg>

On Sun, Dec 18, 2016 at 10:56:32PM +0200, Julian Anastasov wrote:
> Add new transport flag to allow sockets to confirm neighbour.
> When same struct dst_entry can be used for many different
> neighbours we can not use it for pending confirmations.
> The flag is propagated from transport to every packet.
> It is reset when cached dst is reset.
> 
> Reported-by: YueHaibing <yuehaibing@huawei.com>
> Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.")
> Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.")
> Signed-off-by: Julian Anastasov <ja@ssi.bg>
> ---
>  include/net/sctp/sctp.h    |  6 ++----
>  include/net/sctp/structs.h |  4 ++++
>  net/sctp/associola.c       |  3 +--
>  net/sctp/output.c          | 10 +++++++++-
>  net/sctp/outqueue.c        |  2 +-
>  net/sctp/sm_make_chunk.c   |  6 ++----
>  net/sctp/sm_sideeffect.c   |  2 +-
>  net/sctp/socket.c          |  4 ++--
>  net/sctp/transport.c       | 18 +++++++++++++++++-
>  9 files changed, 39 insertions(+), 16 deletions(-)
> 
> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
> index f0dcaeb..85d9083 100644
> --- a/include/net/sctp/sctp.h
> +++ b/include/net/sctp/sctp.h
> @@ -586,10 +586,8 @@ static inline void sctp_v4_map_v6(union sctp_addr *addr)
>   */
>  static inline struct dst_entry *sctp_transport_dst_check(struct sctp_transport *t)
>  {
> -	if (t->dst && !dst_check(t->dst, t->dst_cookie)) {
> -		dst_release(t->dst);
> -		t->dst = NULL;
> -	}
> +	if (t->dst && !dst_check(t->dst, t->dst_cookie))
> +		sctp_transport_dst_release(t);
>  
>  	return t->dst;
>  }
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index 92daabd..e842e84 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -838,6 +838,8 @@ struct sctp_transport {
>  
>  	__u32 burst_limited;	/* Holds old cwnd when max.burst is applied */
>  
> +	__u32 dst_pending_confirm;	/* need to confirm neighbour */
> +
>  	/* Destination */
>  	struct dst_entry *dst;
>  	/* Source address. */
> @@ -980,6 +982,8 @@ void sctp_transport_route(struct sctp_transport *, union sctp_addr *,
>  void sctp_transport_reset(struct sctp_transport *);
>  void sctp_transport_update_pmtu(struct sock *, struct sctp_transport *, u32);
>  void sctp_transport_immediate_rtx(struct sctp_transport *);
> +void sctp_transport_dst_release(struct sctp_transport *t);
> +void sctp_transport_dst_confirm(struct sctp_transport *t);
>  
>  
>  /* This is the structure we use to queue packets as they come into
> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> index 68428e1..7bd26e0 100644
> --- a/net/sctp/associola.c
> +++ b/net/sctp/associola.c
> @@ -820,8 +820,7 @@ void sctp_assoc_control_transport(struct sctp_association *asoc,
>  		if (transport->state != SCTP_UNCONFIRMED)
>  			transport->state = SCTP_INACTIVE;
>  		else {
> -			dst_release(transport->dst);
> -			transport->dst = NULL;
> +			sctp_transport_dst_release(transport);
>  			ulp_notify = false;
>  		}
>  
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index f5320a8..4684a00 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -550,6 +550,7 @@ int sctp_packet_transmit(struct sctp_packet *packet, gfp_t gfp)
>  	struct sctp_association *asoc = tp->asoc;
>  	struct sctp_chunk *chunk, *tmp;
>  	int pkt_count, gso = 0;
> +	int confirm;
>  	struct dst_entry *dst;
>  	struct sk_buff *head;
>  	struct sctphdr *sh;
> @@ -628,7 +629,14 @@ int sctp_packet_transmit(struct sctp_packet *packet, gfp_t gfp)
>  			asoc->peer.last_sent_to = tp;
>  	}
>  	head->ignore_df = packet->ipfragok;
> -	tp->af_specific->sctp_xmit(head, tp);
> +	confirm = tp->dst_pending_confirm;
> +	if (confirm)
> +		head->dst_pending_confirm = 1;
Why not use your confirm helper function here?

> +	/* neighbour should be confirmed on successful transmission or
> +	 * positive error
> +	 */
> +	if (tp->af_specific->sctp_xmit(head, tp) >= 0 && confirm)
> +		tp->dst_pending_confirm = 0;
>  
>  out:
>  	list_for_each_entry_safe(chunk, tmp, &packet->chunk_list, list) {
> diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
> index e540826..8234799 100644
> --- a/net/sctp/outqueue.c
> +++ b/net/sctp/outqueue.c
> @@ -1641,7 +1641,7 @@ static void sctp_check_transmitted(struct sctp_outq *q,
>  
>  		if (forward_progress) {
>  			if (transport->dst)
> -				dst_confirm(transport->dst);
> +				sctp_transport_dst_confirm(transport);
>  		}
>  	}
>  
> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
> index 9e9690b..6fb15bf 100644
> --- a/net/sctp/sm_make_chunk.c
> +++ b/net/sctp/sm_make_chunk.c
> @@ -3317,8 +3317,7 @@ static void sctp_asconf_param_success(struct sctp_association *asoc,
>  		local_bh_enable();
>  		list_for_each_entry(transport, &asoc->peer.transport_addr_list,
>  				transports) {
> -			dst_release(transport->dst);
> -			transport->dst = NULL;
> +			sctp_transport_dst_release(transport);
>  		}
>  		break;
>  	case SCTP_PARAM_DEL_IP:
> @@ -3332,8 +3331,7 @@ static void sctp_asconf_param_success(struct sctp_association *asoc,
>  		local_bh_enable();
>  		list_for_each_entry(transport, &asoc->peer.transport_addr_list,
>  				transports) {
> -			dst_release(transport->dst);
> -			transport->dst = NULL;
> +			sctp_transport_dst_release(transport);
>  		}
>  		break;
>  	default:
> diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
> index c345bf1..9255b29 100644
> --- a/net/sctp/sm_sideeffect.c
> +++ b/net/sctp/sm_sideeffect.c
> @@ -723,7 +723,7 @@ static void sctp_cmd_transport_on(sctp_cmd_seq_t *cmds,
>  	 * forward progress.
>  	 */
>  	if (t->dst)
> -		dst_confirm(t->dst);
> +		sctp_transport_dst_confirm(t);
>  
>  	/* The receiver of the HEARTBEAT ACK should also perform an
>  	 * RTT measurement for that destination transport address
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 318c678..9ee676a 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -588,7 +588,7 @@ static int sctp_send_asconf_add_ip(struct sock		*sk,
>  			list_for_each_entry(trans,
>  			    &asoc->peer.transport_addr_list, transports) {
>  				/* Clear the source and route cache */
> -				dst_release(trans->dst);
> +				sctp_transport_dst_release(trans);
>  				trans->cwnd = min(4*asoc->pathmtu, max_t(__u32,
>  				    2*asoc->pathmtu, 4380));
>  				trans->ssthresh = asoc->peer.i.a_rwnd;
> @@ -839,7 +839,7 @@ static int sctp_send_asconf_del_ip(struct sock		*sk,
>  		 */
>  		list_for_each_entry(transport, &asoc->peer.transport_addr_list,
>  					transports) {
> -			dst_release(transport->dst);
> +			sctp_transport_dst_release(transport);
>  			sctp_transport_route(transport, NULL,
>  					     sctp_sk(asoc->base.sk));
>  		}
> diff --git a/net/sctp/transport.c b/net/sctp/transport.c
> index ce54dce..db66514 100644
> --- a/net/sctp/transport.c
> +++ b/net/sctp/transport.c
> @@ -227,7 +227,7 @@ void sctp_transport_pmtu(struct sctp_transport *transport, struct sock *sk)
>  {
>  	/* If we don't have a fresh route, look one up */
>  	if (!transport->dst || transport->dst->obsolete) {
> -		dst_release(transport->dst);
> +		sctp_transport_dst_release(transport);
>  		transport->af_specific->get_dst(transport, &transport->saddr,
>  						&transport->fl, sk);
>  	}
> @@ -659,3 +659,19 @@ void sctp_transport_immediate_rtx(struct sctp_transport *t)
>  			sctp_transport_hold(t);
>  	}
>  }
> +
> +/* Drop dst */
> +void sctp_transport_dst_release(struct sctp_transport *t)
> +{
> +	dst_release(t->dst);
> +	t->dst = NULL;
> +	t->dst_pending_confirm = 0;
> +}
> +
> +/* Schedule neighbour confirm */
> +void sctp_transport_dst_confirm(struct sctp_transport *t)
> +{
> +	if (!t->dst_pending_confirm)
> +		t->dst_pending_confirm = 1;
I'm not sure why you check it for being zero before setting it here, just set it
unilaterally instead, or are you trying to avoid dirtying a cacheline?

Neil

> +}
> +
> -- 
> 1.9.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

* [PATCH net] tcp: add a missing barrier in tcp_tasklet_func()
From: Eric Dumazet @ 2016-12-21 13:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Eric Dumazet, Madalin Bucur

From: Eric Dumazet <edumazet@google.com>

Madalin reported crashes happening in tcp_tasklet_func() on powerpc64

Before TSQ_QUEUED bit is cleared, we must ensure the changes done
by list_del(&tp->tsq_node); are committed to memory, otherwise
corruption might happen, as an other cpu could catch TSQ_QUEUED
clearance too soon.

We can notice that old kernels were immune to this bug, because
TSQ_QUEUED was cleared after a bh_lock_sock(sk)/bh_unlock_sock(sk)
section, but they could have missed a kick to write additional bytes,
when NIC interrupts for a given flow are spread to multiple cpus.

Affected TCP flows would need an incoming ACK or RTO timer to add more
packets to the pipe. So overall situation should be better now.

Fixes: b223feb9de2a ("tcp: tsq: add shortcut in tcp_tasklet_func()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Madalin Bucur <madalin.bucur@nxp.com>
Tested-by: Madalin Bucur <madalin.bucur@nxp.com>
---
 net/ipv4/tcp_output.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index b45101f3d2bd2e0f0077305a061add4f7ea0de27..31a255b555ad86a3537c077862e3ea38f9b44284 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -769,6 +769,7 @@ static void tcp_tasklet_func(unsigned long data)
 		list_del(&tp->tsq_node);
 
 		sk = (struct sock *)tp;
+		smp_mb__before_atomic();
 		clear_bit(TSQ_QUEUED, &sk->sk_tsq_flags);
 
 		if (!sk->sk_lock.owned &&

^ permalink raw reply related

* RE: [PATCH v3] net: macb: Added PCI wrapper for Platform Driver.
From: Bartosz Folta @ 2016-12-21 13:42 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Nicolas Ferre, David S. Miller, Niklas Cassel, Alexandre Torgue,
	Satanand Burla, Raghu Vatsavayi, Simon Horman,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	Rafal Ozieblo
In-Reply-To: <CAHp75Vf7VBWVWE-jcM0bv9t7QSnpAG46izpkDpqx7VaSdPgUCw@mail.gmail.com>

> And one more
>
> +       plat_info.dma_mask = DMA_BIT_MASK(32);
>
> Perhaps
>  plat_info.dma_mask = pdev->dma_mask;
> ?

Thank you.

I tested your suggestion. I have sent second version of patch.
https://patchwork.ozlabs.org/patch/707800/

Regards,
Bartosz Folta


^ permalink raw reply

* Re: Soft lockup in tc_classify
From: Shahar Klein @ 2016-12-21 12:58 UTC (permalink / raw)
  To: Daniel Borkmann, Cong Wang
  Cc: shahark, Or Gerlitz, Linux Netdev List, Roi Dayan, David Miller,
	Jiri Pirko, John Fastabend, Hadar Hen Zion
In-Reply-To: <585A564D.4030702@iogearbox.net>



On 12/21/2016 12:15 PM, Daniel Borkmann wrote:
> On 12/21/2016 08:03 AM, Cong Wang wrote:
>> On Tue, Dec 20, 2016 at 10:44 PM, Shahar Klein <shahark@mellanox.com>
>> wrote:
> [...]
>> Looks like you added a debug printk inside tcf_destroy() too,
>> which seems racy with filter creation, it should not happen since
>> in both cases we take RTNL lock.
>>
>> Don't know if changing all RCU_INIT_POINTER in that file to
>> rcu_assign_pointer could help anything or not. Mind to try?
>
> I don't think at this point that it's RCU related at all.
>
> I have a theory on what is happening. Quoting the piece in question from
> Shahar's log:
>
>  1: thread-2845[cpu-1] setting tp_created to 1 tp=ffff94b5b0280780
> back=ffff94b9ea932060
>  2: thread-2856[cpu-1] setting tp_created to 1 tp=ffff94b9ea9322a0
> back=ffff94b9ea932060
>  3: thread-2843[cpu-1] setting tp_created to 1 tp=ffff94b5b402c960
> back=ffff94b9ea932060
>  4: destroy ffff94b5b669fea0 tcf_destroy:1905
>  5: thread-2853[cpu-1] setting tp_created to 1 tp=ffff94b5b02805a0
> back=ffff94b9ea932060
>  6: thread-2853[cpu-1] add/change filter by: fl_get [cls_flower]
> tp=ffff94b5b02805a0 tp->next=ffff94b9ea932060
>  7: destroy ffff94b5b0280780 tcf_destroy:1905
>  8: thread-2845[cpu-1] add/change filter by: fl_get [cls_flower]
> tp=ffff94b5b02805a0 tp->next=ffff94b5b02805a0
>
> The interesting thing is that all this happens on CPU1, so as you say
> we're under rtnl.
> In 1), thread-2845 creates tp=ffff94b5b0280780, which is destroyed in
> 7), presumably also
> by thread-2845, and the weird part is why suddenly in 8) thread-2845
> adds a created filter
> without actually creating it. Plus, thread-2845 got interrupted, which
> means it must have
> dropped rntl in the middle. We drop it in tc_ctl_tfilter() when we do
> tcf_proto_lookup_ops()
> and need to pull in a module, but here this doesn't make sense at all
> since i) at this
> point we haven't created the tp yet and 2) flower was already there.
> Thus the only explanation
> where this must have happened is where we called tp->ops->change(). So
> here the return
> code must have been -EAGAIN, which makes sense because in 7) we
> destroyed that specific
> tp instance. Which means we goto replay but *do not* clear tp_created. I
> think that is
> the bug in question. So, while we dropped rtnl in the meantime, some
> other tp instance
> was added (tp=ffff94b5b02805a0) that we had a match on in round 2, but
> we still think it
> was newly created which wasn't the actual case. So we'd need to deal
> with the fact that
> ->change() callback could return -EAGAIN as well. Now looking at flower,
> I think the call
> chain must have been fl_change() -> fl_set_parms() ->
> tcf_exts_validate() -> tcf_action_init()
> -> tcf_action_init_1(). And here one possibility I see is that
> tc_lookup_action_n()
> failed, therefore we shortly dropped rtnl for the request_module() where
> the module
> got loaded successfully and thus error code from there is -EAGAIN that
> got propagated
> all the way through ->change() from tc_ctl_tfilter(). So it looks like a
> generic issue
> not specifically tied to flower.
>
> Shahar, can you test the following? Thanks!
>
>  net/sched/cls_api.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 3fbba79..1ecdf80 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -148,13 +148,15 @@ static int tc_ctl_tfilter(struct sk_buff *skb,
> struct nlmsghdr *n)
>      unsigned long cl;
>      unsigned long fh;
>      int err;
> -    int tp_created = 0;
> +    int tp_created;
>
>      if ((n->nlmsg_type != RTM_GETTFILTER) &&
>          !netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN))
>          return -EPERM;
>
>  replay:
> +    tp_created = 0;
> +
>      err = nlmsg_parse(n, sizeof(*t), tca, TCA_MAX, NULL);
>      if (err < 0)
>          return err;

Test run successfully!
I'll remove all other debug "fixes" and run again later

Thanks Daniel!

^ permalink raw reply

* Re: Soft lockup in tc_classify
From: Daniel Borkmann @ 2016-12-21 13:18 UTC (permalink / raw)
  To: Shahar Klein, Cong Wang
  Cc: Or Gerlitz, Linux Netdev List, Roi Dayan, David Miller,
	Jiri Pirko, John Fastabend, Hadar Hen Zion
In-Reply-To: <cb55ede9-3e6e-8e68-3005-8bdb2105122e@mellanox.com>

On 12/21/2016 01:58 PM, Shahar Klein wrote:
> On 12/21/2016 12:15 PM, Daniel Borkmann wrote:
>> On 12/21/2016 08:03 AM, Cong Wang wrote:
>>> On Tue, Dec 20, 2016 at 10:44 PM, Shahar Klein <shahark@mellanox.com>
>>> wrote:
>> [...]
>>> Looks like you added a debug printk inside tcf_destroy() too,
>>> which seems racy with filter creation, it should not happen since
>>> in both cases we take RTNL lock.
>>>
>>> Don't know if changing all RCU_INIT_POINTER in that file to
>>> rcu_assign_pointer could help anything or not. Mind to try?
>>
>> I don't think at this point that it's RCU related at all.
>>
>> I have a theory on what is happening. Quoting the piece in question from
>> Shahar's log:
>>
>>  1: thread-2845[cpu-1] setting tp_created to 1 tp=ffff94b5b0280780
>> back=ffff94b9ea932060
>>  2: thread-2856[cpu-1] setting tp_created to 1 tp=ffff94b9ea9322a0
>> back=ffff94b9ea932060
>>  3: thread-2843[cpu-1] setting tp_created to 1 tp=ffff94b5b402c960
>> back=ffff94b9ea932060
>>  4: destroy ffff94b5b669fea0 tcf_destroy:1905
>>  5: thread-2853[cpu-1] setting tp_created to 1 tp=ffff94b5b02805a0
>> back=ffff94b9ea932060
>>  6: thread-2853[cpu-1] add/change filter by: fl_get [cls_flower]
>> tp=ffff94b5b02805a0 tp->next=ffff94b9ea932060
>>  7: destroy ffff94b5b0280780 tcf_destroy:1905
>>  8: thread-2845[cpu-1] add/change filter by: fl_get [cls_flower]
>> tp=ffff94b5b02805a0 tp->next=ffff94b5b02805a0
>>
>> The interesting thing is that all this happens on CPU1, so as you say
>> we're under rtnl.
>> In 1), thread-2845 creates tp=ffff94b5b0280780, which is destroyed in
>> 7), presumably also
>> by thread-2845, and the weird part is why suddenly in 8) thread-2845
>> adds a created filter
>> without actually creating it. Plus, thread-2845 got interrupted, which
>> means it must have
>> dropped rntl in the middle. We drop it in tc_ctl_tfilter() when we do
>> tcf_proto_lookup_ops()
>> and need to pull in a module, but here this doesn't make sense at all
>> since i) at this
>> point we haven't created the tp yet and 2) flower was already there.
>> Thus the only explanation
>> where this must have happened is where we called tp->ops->change(). So
>> here the return
>> code must have been -EAGAIN, which makes sense because in 7) we
>> destroyed that specific
>> tp instance. Which means we goto replay but *do not* clear tp_created. I
>> think that is
>> the bug in question. So, while we dropped rtnl in the meantime, some
>> other tp instance
>> was added (tp=ffff94b5b02805a0) that we had a match on in round 2, but
>> we still think it
>> was newly created which wasn't the actual case. So we'd need to deal
>> with the fact that
>> ->change() callback could return -EAGAIN as well. Now looking at flower,
>> I think the call
>> chain must have been fl_change() -> fl_set_parms() ->
>> tcf_exts_validate() -> tcf_action_init()
>> -> tcf_action_init_1(). And here one possibility I see is that
>> tc_lookup_action_n()
>> failed, therefore we shortly dropped rtnl for the request_module() where
>> the module
>> got loaded successfully and thus error code from there is -EAGAIN that
>> got propagated
>> all the way through ->change() from tc_ctl_tfilter(). So it looks like a
>> generic issue
>> not specifically tied to flower.
>>
>> Shahar, can you test the following? Thanks!
>>
>>  net/sched/cls_api.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> index 3fbba79..1ecdf80 100644
>> --- a/net/sched/cls_api.c
>> +++ b/net/sched/cls_api.c
>> @@ -148,13 +148,15 @@ static int tc_ctl_tfilter(struct sk_buff *skb,
>> struct nlmsghdr *n)
>>      unsigned long cl;
>>      unsigned long fh;
>>      int err;
>> -    int tp_created = 0;
>> +    int tp_created;
>>
>>      if ((n->nlmsg_type != RTM_GETTFILTER) &&
>>          !netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN))
>>          return -EPERM;
>>
>>  replay:
>> +    tp_created = 0;
>> +
>>      err = nlmsg_parse(n, sizeof(*t), tca, TCA_MAX, NULL);
>>      if (err < 0)
>>          return err;
>
> Test run successfully!
> I'll remove all other debug "fixes" and run again later
>
> Thanks Daniel!

Great, thanks for confirming so far Shahar!
I'll cook an official patch in the meantime.

^ permalink raw reply

* Re: [PATCH] at86rf230: Allow slow GPIO pins for "rstn"
From: Stefan Schmidt @ 2016-12-21 13:11 UTC (permalink / raw)
  To: Andrey Smirnov, linux-wpan
  Cc: Alexander Aring, netdev, linux-kernel, Chris Healy
In-Reply-To: <1482103533-13187-1-git-send-email-andrew.smirnov@gmail.com>

Hello.

On 19/12/16 00:25, Andrey Smirnov wrote:
> Driver code never touches "rstn" signal in atomic context, so there's
> no need to implicitly put such restriction on it by using gpio_set_value
> to manipulate it. Replace gpio_set_value to gpio_set_value_cansleep to
> fix that.

We need to make sure we are not assuming it can be called  in such a 
context in the future now. But that is something we can worry about if 
it comes up.

> As a an example of where such restriction might be inconvenient,
> consider a hardware design where "rstn" is connected to a pin of I2C/SPI
> GPIO expander chip.

Is this a real life issue you run into?

> Cc: Chris Healy <cphealy@gmail.com>
> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
> ---
>  drivers/net/ieee802154/at86rf230.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
> index 9f10da6..7700690 100644
> --- a/drivers/net/ieee802154/at86rf230.c
> +++ b/drivers/net/ieee802154/at86rf230.c
> @@ -1710,9 +1710,9 @@ static int at86rf230_probe(struct spi_device *spi)
>  	/* Reset */
>  	if (gpio_is_valid(rstn)) {
>  		udelay(1);
> -		gpio_set_value(rstn, 0);
> +		gpio_set_value_cansleep(rstn, 0);
>  		udelay(1);
> -		gpio_set_value(rstn, 1);
> +		gpio_set_value_cansleep(rstn, 1);
>  		usleep_range(120, 240);
>  	}
>
>


Acked-by: Stefan Schmidt <stefan@osg.samsung.com>

regards
Stefan Schmidt

^ permalink raw reply

* Re: [RFC PATCH 1/7] PF driver modified to enable HW filter support, changes works in backward compatibility mode Enable required things in Makefile Enable LZ4 dependecy inside config file
From: Sunil Kovvuri @ 2016-12-21 13:05 UTC (permalink / raw)
  To: Satha Koteswara Rao
  Cc: LKML, Sunil Goutham, Robert Richter, David S. Miller, David Daney,
	rvatsavayi, derek.chickles, philip.romanov, Linux Netdev List,
	LAKML
In-Reply-To: <1482310011-1862-2-git-send-email-satha.rao@caviumnetworks.com>

>
>  #define NIC_MAX_RSS_HASH_BITS          8
>  #define NIC_MAX_RSS_IDR_TBL_SIZE       (1 << NIC_MAX_RSS_HASH_BITS)
> +#define NIC_TNS_RSS_IDR_TBL_SIZE       5

So you want to use only 5 queues per VF when TNS is enabled, is it ??
There are 4096 RSS indices in total, for each VF you can use max 32.
I guess you wanted to set no of hash bits to 5 instead of table size.

>  #define RSS_HASH_KEY_SIZE              5 /* 320 bit key */
>
>  struct nicvf_rss_info {
> @@ -255,74 +258,6 @@ struct nicvf_drv_stats {
>         struct u64_stats_sync   syncp;
>  };
>
> -struct nicvf {
> -       struct nicvf            *pnicvf;
> -       struct net_device       *netdev;
> -       struct pci_dev          *pdev;
> -       void __iomem            *reg_base;

Didn't get why you moved this structure to the end of file.
Looks like an unnecessary modification.


> +static unsigned int num_vfs;
> +module_param(num_vfs, uint, 0644);
> +MODULE_PARM_DESC(num_vfs, "Non zero positive value, specifies number of VF's per physical port");

So what if driver is built-in instead of module, I can't use TNS is it ?

>
> +/* Set RBDR Backpressure (RBDR_BP) and CQ backpressure (CQ_BP) of vnic queues
> + * to 129 each

Why 129 ??
RBDR minimum size is 8K buffers, why you want to assert BP when still
~4K buffers
are available. Isn't 4K a huge number to start asserting backpressure ?

^ permalink raw reply

* [PATCH v2] net: macb: Applied changes according to review made by Andy Shevchenko.
From: Bartosz Folta @ 2016-12-21 13:04 UTC (permalink / raw)
  To: Nicolas Ferre, David S. Miller, Niklas Cassel, Alexandre Torgue,
	Satanand Burla, Raghu Vatsavayi, Simon Horman,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
  Cc: Bartosz Folta, Rafal Ozieblo
In-Reply-To: <1482325169-22771-1-git-send-email-bfolta@cadence.com>

This patch is sent in regard to recently applied patch
Commit 83a77e9ec4150ee4acc635638f7dedd9da523a26
net: macb: Added PCI wrapper for Platform Driver.

Signed-off-by: Bartosz Folta <bfolta@cadence.com>
---
Changed in v2:
Size of DMA mask is set based on value held by PCI device.
---
 drivers/net/ethernet/cadence/macb_pci.c | 27 ++++++++++-----------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb_pci.c b/drivers/net/ethernet/cadence/macb_pci.c
index 92be2cd..9906fda 100644
--- a/drivers/net/ethernet/cadence/macb_pci.c
+++ b/drivers/net/ethernet/cadence/macb_pci.c
@@ -1,5 +1,5 @@
 /**
- * macb_pci.c - Cadence GEM PCI wrapper.
+ * Cadence GEM PCI wrapper.
  *
  * Copyright (C) 2016 Cadence Design Systems - http://www.cadence.com
  *
@@ -45,32 +45,27 @@ static int macb_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	struct macb_platform_data plat_data;
 	struct resource res[2];
 
-	/* sanity check */
-	if (!id)
-		return -EINVAL;
-
 	/* enable pci device */
-	err = pci_enable_device(pdev);
+	err = pcim_enable_device(pdev);
 	if (err < 0) {
-		dev_err(&pdev->dev, "Enabling PCI device has failed: 0x%04X",
-			err);
-		return -EACCES;
+		dev_err(&pdev->dev, "Enabling PCI device has failed: %d", err);
+		return err;
 	}
 
 	pci_set_master(pdev);
 
 	/* set up resources */
 	memset(res, 0x00, sizeof(struct resource) * ARRAY_SIZE(res));
-	res[0].start = pdev->resource[0].start;
-	res[0].end = pdev->resource[0].end;
+	res[0].start = pci_resource_start(pdev, 0);
+	res[0].end = pci_resource_end(pdev, 0);
 	res[0].name = PCI_DRIVER_NAME;
 	res[0].flags = IORESOURCE_MEM;
-	res[1].start = pdev->irq;
+	res[1].start = pci_irq_vector(pdev, 0);
 	res[1].name = PCI_DRIVER_NAME;
 	res[1].flags = IORESOURCE_IRQ;
 
-	dev_info(&pdev->dev, "EMAC physical base addr = 0x%p\n",
-		 (void *)(uintptr_t)pci_resource_start(pdev, 0));
+	dev_info(&pdev->dev, "EMAC physical base addr: %pa\n",
+		 &res[0].start);
 
 	/* set up macb platform data */
 	memset(&plat_data, 0, sizeof(plat_data));
@@ -100,7 +95,7 @@ static int macb_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	plat_info.num_res = ARRAY_SIZE(res);
 	plat_info.data = &plat_data;
 	plat_info.size_data = sizeof(plat_data);
-	plat_info.dma_mask = DMA_BIT_MASK(32);
+	plat_info.dma_mask = pdev->dma_mask;
 
 	/* register platform device */
 	plat_dev = platform_device_register_full(&plat_info);
@@ -120,7 +115,6 @@ static int macb_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	clk_unregister(plat_data.pclk);
 
 err_pclk_register:
-	pci_disable_device(pdev);
 	return err;
 }
 
@@ -130,7 +124,6 @@ static void macb_remove(struct pci_dev *pdev)
 	struct macb_platform_data *plat_data = dev_get_platdata(&plat_dev->dev);
 
 	platform_device_unregister(plat_dev);
-	pci_disable_device(pdev);
 	clk_unregister(plat_data->pclk);
 	clk_unregister(plat_data->hclk);
 }
-- 
1.9.1

^ permalink raw reply related

* [PATCH net-next v4 2/2] net: dsa: mv88e6xxx: Add support for ethernet switch 88E6341/88E6141
From: Romain Perier @ 2016-12-21 12:57 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli, Jason Cooper,
	Sebastian Hesselbarth, Gregory Clement
  Cc: netdev, devicetree, Rob Herring, Ian Campbell, Pawel Moll,
	Mark Rutland, Kumar Gala, linux-arm-kernel, Thomas Petazzoni,
	Nadav Haklai, Romain Perier
In-Reply-To: <20161221125734.1034-1-romain.perier@free-electrons.com>

The Marvell 88E6341 device is single-chip, 6-port ethernet switch with
four integrated 10/100/1000Mbps ethernet transceivers and one high speed
SerDes interfaces. It is compatible with switches of family 88E6352.

This commit adds basic support for this switch by describing its
capabilities to the driver.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
---

Changes in v3:
 - Added tag "Reviewed-by" by Andrew
 
Changes in v2:
 - Add a dedicated data structure for the operations of the 88E6341
 - Re-ordered PORT_SWITCH_ID_PROD_NUM_6341 in alphabetic order with other
   macros

 drivers/net/dsa/mv88e6xxx/chip.c      | 42 +++++++++++++++++++++++++++++++++++
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  4 +++-
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 76d944e..5e97dc4 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3625,6 +3625,34 @@ static const struct mv88e6xxx_ops mv88e6321_ops = {
 	.reset = mv88e6352_g1_reset,
 };
 
+static const struct mv88e6xxx_ops mv88e6341_ops = {
+	/* MV88E6XXX_FAMILY_6352 */
+	.get_eeprom = mv88e6xxx_g2_get_eeprom16,
+	.set_eeprom = mv88e6xxx_g2_set_eeprom16,
+	.set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+	.phy_read = mv88e6xxx_g2_smi_phy_read,
+	.phy_write = mv88e6xxx_g2_smi_phy_write,
+	.port_set_link = mv88e6xxx_port_set_link,
+	.port_set_duplex = mv88e6xxx_port_set_duplex,
+	.port_set_rgmii_delay = mv88e6352_port_set_rgmii_delay,
+	.port_set_speed = mv88e6352_port_set_speed,
+	.port_tag_remap = mv88e6095_port_tag_remap,
+	.port_set_frame_mode = mv88e6351_port_set_frame_mode,
+	.port_set_egress_unknowns = mv88e6351_port_set_egress_unknowns,
+	.port_set_ether_type = mv88e6351_port_set_ether_type,
+	.port_jumbo_config = mv88e6165_port_jumbo_config,
+	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_pause_config = mv88e6097_port_pause_config,
+	.stats_snapshot = mv88e6320_g1_stats_snapshot,
+	.stats_get_sset_count = mv88e6095_stats_get_sset_count,
+	.stats_get_strings = mv88e6095_stats_get_strings,
+	.stats_get_stats = mv88e6095_stats_get_stats,
+	.g1_set_cpu_port = mv88e6095_g1_set_cpu_port,
+	.g1_set_egress_port = mv88e6095_g1_set_egress_port,
+	.mgmt_rsvd2cpu = mv88e6095_g2_mgmt_rsvd2cpu,
+	.reset = mv88e6352_g1_reset,
+};
+
 static const struct mv88e6xxx_ops mv88e6350_ops = {
 	/* MV88E6XXX_FAMILY_6351 */
 	.set_switch_mac = mv88e6xxx_g2_set_switch_mac,
@@ -4086,6 +4114,20 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.ops = &mv88e6321_ops,
 	},
 
+	[MV88E6341] = {
+		.prod_num = PORT_SWITCH_ID_PROD_NUM_6341,
+		.family = MV88E6XXX_FAMILY_6352,
+		.name = "Marvell 88E6341",
+		.num_databases = 4096,
+		.num_ports = 6,
+		.port_base_addr = 0x10,
+		.global1_addr = 0x1b,
+		.age_time_coeff = 15000,
+		.tag_protocol = DSA_TAG_PROTO_EDSA,
+		.flags = MV88E6XXX_FLAGS_FAMILY_6352,
+		.ops = &mv88e6341_ops,
+	},
+
 	[MV88E6350] = {
 		.prod_num = PORT_SWITCH_ID_PROD_NUM_6350,
 		.family = MV88E6XXX_FAMILY_6351,
diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
index af54bae..cb55fdb 100644
--- a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
@@ -100,6 +100,7 @@
 #define PORT_SWITCH_ID_PROD_NUM_6240	0x240
 #define PORT_SWITCH_ID_PROD_NUM_6290	0x290
 #define PORT_SWITCH_ID_PROD_NUM_6321	0x310
+#define PORT_SWITCH_ID_PROD_NUM_6341	0x340
 #define PORT_SWITCH_ID_PROD_NUM_6352	0x352
 #define PORT_SWITCH_ID_PROD_NUM_6350	0x371
 #define PORT_SWITCH_ID_PROD_NUM_6351	0x375
@@ -432,6 +433,7 @@ enum mv88e6xxx_model {
 	MV88E6290,
 	MV88E6320,
 	MV88E6321,
+	MV88E6341,
 	MV88E6350,
 	MV88E6351,
 	MV88E6352,
@@ -448,7 +450,7 @@ enum mv88e6xxx_family {
 	MV88E6XXX_FAMILY_6185,	/* 6108 6121 6122 6131 6152 6155 6182 6185 */
 	MV88E6XXX_FAMILY_6320,	/* 6320 6321 */
 	MV88E6XXX_FAMILY_6351,	/* 6171 6175 6350 6351 */
-	MV88E6XXX_FAMILY_6352,	/* 6172 6176 6240 6352 */
+	MV88E6XXX_FAMILY_6352,	/* 6172 6176 6240 6341 6352 */
 	MV88E6XXX_FAMILY_6390,  /* 6190 6190X 6191 6290 6390 6390X */
 };
 
-- 
2.9.3

^ permalink raw reply related

* [PATCH net-next v4 1/2] net: dsa: mv88e6xxx: Don't forbid MDIO I/Os for PHY addr >= num_of_ports
From: Romain Perier @ 2016-12-21 12:57 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli, Jason Cooper,
	Sebastian Hesselbarth, Gregory Clement
  Cc: netdev, devicetree, Rob Herring, Ian Campbell, Pawel Moll,
	Mark Rutland, Kumar Gala, linux-arm-kernel, Thomas Petazzoni,
	Nadav Haklai, Romain Perier
In-Reply-To: <20161221125734.1034-1-romain.perier@free-electrons.com>

Some Marvell ethernet switches have internal ethernet transceivers with
hardcoded phy addresses. These addresses can be greater than the number
of ports or its value might be different than the associated port number.
This is for example the case for MV88E6341 that has 6 ports and internal
Port 1 to Port4 PHYs mapped at SMI addresses from 0x11 to 0x14.

This commits fixes the issue by removing the condition in MDIO callbacks.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
---

Changes in v4:
 - Fixed typo in the commit log

Changes in v2:
 - Added tag "Reviewed-by" by Andrew
 - Fixed typo in the commit log

 drivers/net/dsa/mv88e6xxx/chip.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index b5f0e1e..76d944e 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2881,9 +2881,6 @@ static int mv88e6xxx_mdio_read(struct mii_bus *bus, int phy, int reg)
 	u16 val;
 	int err;
 
-	if (phy >= mv88e6xxx_num_ports(chip))
-		return 0xffff;
-
 	mutex_lock(&chip->reg_lock);
 	err = mv88e6xxx_phy_read(chip, phy, reg, &val);
 	mutex_unlock(&chip->reg_lock);
@@ -2896,9 +2893,6 @@ static int mv88e6xxx_mdio_write(struct mii_bus *bus, int phy, int reg, u16 val)
 	struct mv88e6xxx_chip *chip = bus->priv;
 	int err;
 
-	if (phy >= mv88e6xxx_num_ports(chip))
-		return 0xffff;
-
 	mutex_lock(&chip->reg_lock);
 	err = mv88e6xxx_phy_write(chip, phy, reg, val);
 	mutex_unlock(&chip->reg_lock);
-- 
2.9.3

^ permalink raw reply related

* [PATCH net-next v4 0/2] Add support for the ethernet switch on the ESPRESSObin
From: Romain Perier @ 2016-12-21 12:57 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli, Jason Cooper,
	Sebastian Hesselbarth, Gregory Clement
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA,
	Rob Herring, Ian Campbell, Pawel Moll, Mark Rutland, Kumar Gala,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Thomas Petazzoni, Nadav Haklai, Romain Perier

This set of patches adds support for the Marvell ethernet switch 88E6341.
It also add the devicetree definition of this switch to the DT board.

Romain Perier (2):
  net: dsa: mv88e6xxx: Don't forbid MDIO I/Os for PHY addr >=
    num_of_ports
  net: dsa: mv88e6xxx: Add support for ethernet switch 88E6341/88E6141

 drivers/net/dsa/mv88e6xxx/chip.c      | 48 ++++++++++++++++++++++++++++++-----
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  4 ++-
 2 files changed, 45 insertions(+), 7 deletions(-)

-- 

Note: As requested by Gregory, I have removed the patch for the DT (already merged).

2.9.3

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [RFC PATCH 5/7] Multiple VF's grouped together under single physical port called PF group PF Group maintainance API's
From: Sunil Kovvuri @ 2016-12-21 12:43 UTC (permalink / raw)
  To: Satha Koteswara Rao
  Cc: LKML, Sunil Goutham, Robert Richter, David S. Miller, David Daney,
	rvatsavayi, derek.chickles, philip.romanov, Linux Netdev List,
	LAKML
In-Reply-To: <1482310011-1862-6-git-send-email-satha.rao@caviumnetworks.com>

On Wed, Dec 21, 2016 at 2:16 PM, Satha Koteswara Rao
<satha.rao@caviumnetworks.com> wrote:
> +struct tns_global_st {
> +       u64 magic;
> +       char     version[16];
> +       u64 reg_cnt;
> +       struct table_static_s tbl_info[TNS_MAX_TABLE];
> +};
> +
> +#define PF_COUNT 3
> +#define PF_1   0
> +#define PF_2   64
> +#define PF_3   96
> +#define PF_END 128

Some comments please ... what is 0, 64, 96 ??
You can read PCI_SRIOV_TOTAL_VF from PCI config space instead of
defining PF_END with 128.

^ permalink raw reply

* Re: ipv6: handle -EFAULT from skb_copy_bits
From: Hannes Frederic Sowa @ 2016-12-21 12:41 UTC (permalink / raw)
  To: Cong Wang, Dave Jones; +Cc: David Miller, Linux Kernel Network Developers
In-Reply-To: <1482323232.2260.2.camel@stressinduktion.org>

On Wed, 2016-12-21 at 13:27 +0100, Hannes Frederic Sowa wrote: 
> @@ -555,8 +566,8 @@ static int rawv6_push_pending_frames(struct sock *sk, struct flowi6 *fl6,
>                 goto out;
>  
>         offset = rp->offset;
> -       total_len = inet_sk(sk)->cork.base.length;
> -       if (offset >= total_len - 1) {
> +       transport_len = raw6_corked_transport_len(sk);
> +       if (offset >= transport_len - 1) {
>                 err = -EINVAL;
>                 ip6_flush_pending_frames(sk);
>                 goto out;
> @@ -598,7 +609,7 @@ static int rawv6_push_pending_frames(struct sock *sk, struct flowi6 *fl6,
>                 tmp_csum = csum_sub(tmp_csum, csum_unfold(csum));
>  
>         csum = csum_ipv6_magic(&fl6->saddr, &fl6->daddr,
> -                              total_len, fl6->flowi6_proto, tmp_csum);
> +                              transport_len, fl6->flowi6_proto, tmp_csum);
>  
> 

Ops, here we need actually the total_len plus the opt->opt_nflen to
always calculate the correct checksum.

^ permalink raw reply

* Re: [RFC PATCH 4/7] HW Filter Initialization code and register access APIs
From: Sunil Kovvuri @ 2016-12-21 12:36 UTC (permalink / raw)
  To: Satha Koteswara Rao
  Cc: LKML, Sunil Goutham, Robert Richter, David S. Miller, David Daney,
	rvatsavayi, derek.chickles, philip.romanov, Linux Netdev List,
	LAKML
In-Reply-To: <1482310011-1862-5-git-send-email-satha.rao@caviumnetworks.com>

On Wed, Dec 21, 2016 at 2:16 PM, Satha Koteswara Rao
<satha.rao@caviumnetworks.com> wrote:
> ---
>  drivers/net/ethernet/cavium/thunder/pf_reg.c | 660 +++++++++++++++++++++++++++
>  1 file changed, 660 insertions(+)
>  create mode 100644 drivers/net/ethernet/cavium/thunder/pf_reg.c
>
> diff --git a/drivers/net/ethernet/cavium/thunder/pf_reg.c b/drivers/net/ethernet/cavium/thunder/pf_reg.c

Sunil>>
>From the file name 'pf_reg.c', what is PF here ?
TNS is not a SRIOV device right ?

> new file mode 100644
> index 0000000..1f95c7f
> --- /dev/null
> +++ b/drivers/net/ethernet/cavium/thunder/pf_reg.c
> @@ -0,0 +1,660 @@
> +/*
> + * Copyright (C) 2015 Cavium, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of version 2 of the GNU General Public License
> + * as published by the Free Software Foundation.
> + */
> +
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/fs.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/version.h>
> +#include <linux/proc_fs.h>
> +#include <linux/device.h>
> +#include <linux/mman.h>
> +#include <linux/uaccess.h>
> +#include <linux/delay.h>
> +#include <linux/cdev.h>
> +#include <linux/err.h>
> +#include <linux/device.h>
> +#include <linux/io.h>
> +#include <linux/firmware.h>
> +#include "pf_globals.h"
> +#include "pf_locals.h"
> +#include "tbl_access.h"
> +#include "linux/lz4.h"
> +
> +struct tns_table_s tbl_info[TNS_MAX_TABLE];
> +
> +#define TNS_TDMA_SST_ACC_CMD_ADDR      0x0000842000000270ull
> +
> +#define BAR0_START 0x842000000000
> +#define BAR0_END   0x84200000FFFF
> +#define BAR0_SIZE  (64 * 1024)
> +#define BAR2_START 0x842040000000
> +#define BAR2_END   0x84207FFFFFFF
> +#define BAR2_SIZE  (1024 * 1024 * 1024)
> +
> +#define NODE1_BAR0_START 0x942000000000
> +#define NODE1_BAR0_END   0x94200000FFFF
> +#define NODE1_BAR0_SIZE  (64 * 1024)
> +#define NODE1_BAR2_START 0x942040000000
> +#define NODE1_BAR2_END   0x94207FFFFFFF
> +#define NODE1_BAR2_SIZE  (1024 * 1024 * 1024)

Sunil>> This is absurd, why are you using hardcoded HW addresses,
why not use TNS device's PCI BARs.

> +/* Allow a max of 4 chunks for the Indirect Read/Write */
> +#define MAX_SIZE (64 * 4)
> +#define CHUNK_SIZE (64)
> +/* To protect register access */
> +spinlock_t pf_reg_lock;
> +
> +u64 iomem0;
> +u64 iomem2;
> +u8 tns_enabled;
> +u64 node1_iomem0;
> +u64 node1_iomem2;
> +u8 node1_tns;
> +int n1_tns;

Sunil>> A simple structure would have nice instead of so many global variables.

> +
> +int tns_write_register_indirect(int node_id, u64 address, u8 size,
> +                               u8 *kern_buffer)
> +{
> +       union tns_tdma_sst_acc_cmd acccmd;
> +       union tns_tdma_sst_acc_stat_t accstat;
> +       union tns_acc_data data;
> +       int i, j, w = 0;
> +       int cnt = 0;
> +       u32 *dataw = NULL;
> +       int temp = 0;
> +       int k = 0;
> +       int chunks = 0;
> +       u64 acccmd_address;
> +       u64 lmem2 = 0, lmem0 = 0;
> +
> +       if (size == 0 || !kern_buffer) {
> +               filter_dbg(FERR, "%s data size cannot be zero\n", __func__);
> +               return TNS_ERROR_INVALID_ARG;
> +       }
> +       if (size > MAX_SIZE) {
> +               filter_dbg(FERR, "%s Max allowed size exceeded\n", __func__);
> +               return TNS_ERROR_DATA_TOO_LARGE;
> +       }
> +       if (node_id) {
> +               lmem0 = node1_iomem0;
> +               lmem2 = node1_iomem2;
> +       } else {
> +               lmem0 = iomem0;
> +               lmem2 = iomem2;
> +       }
> +
> +       chunks = ((size + (CHUNK_SIZE - 1)) / CHUNK_SIZE);
> +       acccmd_address = (address & 0x00000000ffffffff);
> +       spin_lock_bh(&pf_reg_lock);
> +
> +       for (k = 0; k < chunks; k++) {

Sunil>> Why not use some proper variable names, instead of i,j,k,w,
temp e.t.c e.t.c


> +               /* Should never happen */
> +               if (size < 0) {
> +                       filter_dbg(FERR, "%s size mismatch [CHUNK %d]\n",
> +                                  __func__, k);
> +                       break;
> +               }
> +               temp = (size > CHUNK_SIZE) ? CHUNK_SIZE : size;
> +               dataw = (u32 *)(kern_buffer + (k * CHUNK_SIZE));
> +               cnt = ((temp + 3) / 4);
> +               data.u = 0ULL;
> +               for (j = 0, i = 0; i < cnt; i++) {
> +                       /* Odd words go in the upper 32 bits of the data
> +                        * register
> +                        */
> +                       if (i & 1) {
> +                               data.s.upper32 = dataw[i];
> +                               writeq_relaxed(data.u, (void *)(lmem0 +
> +                                              TNS_TDMA_SST_ACC_WDATX(j)));
> +                               data.u = 0ULL;
> +                               j++; /* Advance to the next data word */
> +                               w = 0;
> +                       } else {
> +                               /* Lower 32 bits contain words 0, 2, 4, etc. */
> +                               data.s.lower32 = dataw[i];
> +                               w = 1;
> +                       }
> +               }
> +
> +               /* If the last word was a partial (< 64 bits) then
> +                * see if we need to write it.
> +                */
> +               if (w)
> +                       writeq_relaxed(data.u, (void *)(lmem0 +
> +                                      TNS_TDMA_SST_ACC_WDATX(j)));
> +
> +               acccmd.u = 0ULL;
> +               acccmd.s.go = 1; /* Cleared once the request is serviced */
> +               acccmd.s.size = cnt;
> +               acccmd.s.addr = (acccmd_address >> 2);
> +               writeq_relaxed(acccmd.u, (void *)(lmem0 +
> +                              TDMA_SST_ACC_CMD));
> +               accstat.u = 0ULL;
> +
> +               while (!accstat.s.cmd_done && !accstat.s.error)
> +                       accstat.u = readq_relaxed((void *)(lmem0 +
> +                                         TDMA_SST_ACC_STAT));
> +
> +               if (accstat.s.error) {
> +                       data.u = readq_relaxed((void *)(lmem2 +
> +                                              TDMA_NB_INT_STAT));
> +                       filter_dbg(FERR, "%s Reading data from ", __func__);
> +                       filter_dbg(FERR, "0x%0lx chunk %d failed 0x%0lx",
> +                                  (unsigned long)address, k,
> +                                  (unsigned long)data.u);
> +                       spin_unlock_bh(&pf_reg_lock);
> +                       kfree(kern_buffer);
> +                       return TNS_ERROR_INDIRECT_WRITE;
> +               }
> +               /* Calculate the next offset to write */
> +               acccmd_address = acccmd_address + CHUNK_SIZE;
> +               size -= CHUNK_SIZE;
> +       }
> +       spin_unlock_bh(&pf_reg_lock);
> +
> +       return 0;
> +}
> +
> +int tns_read_register_indirect(int node_id, u64 address, u8 size,
> +                              u8 *kern_buffer)
> +{
> +       union tns_tdma_sst_acc_cmd acccmd;
> +       union tns_tdma_sst_acc_stat_t accstat;
> +       union tns_acc_data data;
> +       int i, j, dcnt;
> +       int cnt = 0;
> +       u32 *dataw = NULL;
> +       int temp = 0;
> +       int k = 0;
> +       int chunks = 0;
> +       u64 acccmd_address;
> +       u64 lmem2 = 0, lmem0 = 0;
> +
> +       if (size == 0 || !kern_buffer) {
> +               filter_dbg(FERR, "%s data size cannot be zero\n", __func__);
> +               return TNS_ERROR_INVALID_ARG;
> +       }
> +       if (size > MAX_SIZE) {
> +               filter_dbg(FERR, "%s Max allowed size exceeded\n", __func__);
> +               return TNS_ERROR_DATA_TOO_LARGE;
> +       }
> +       if (node_id) {
> +               lmem0 = node1_iomem0;
> +               lmem2 = node1_iomem2;
> +       } else {
> +               lmem0 = iomem0;
> +               lmem2 = iomem2;
> +       }
> +
> +       chunks = ((size + (CHUNK_SIZE - 1)) / CHUNK_SIZE);
> +       acccmd_address = (address & 0x00000000ffffffff);
> +       spin_lock_bh(&pf_reg_lock);
> +       for (k = 0; k < chunks; k++) {
> +               /* This should never happen */
> +               if (size < 0) {
> +                       filter_dbg(FERR, "%s size mismatch [CHUNK:%d]\n",
> +                                  __func__, k);
> +                       break;
> +               }
> +               temp = (size > CHUNK_SIZE) ? CHUNK_SIZE : size;
> +               dataw = (u32 *)(kern_buffer + (k * CHUNK_SIZE));
> +               cnt = ((temp + 3) / 4);
> +               acccmd.u = 0ULL;
> +               acccmd.s.op = 1; /* Read operation */
> +               acccmd.s.size = cnt;
> +               acccmd.s.addr = (acccmd_address >> 2);
> +               acccmd.s.go = 1; /* Execute */
> +               writeq_relaxed(acccmd.u, (void *)(lmem0 +
> +                              TDMA_SST_ACC_CMD));
> +               accstat.u = 0ULL;
> +
> +               while (!accstat.s.cmd_done && !accstat.s.error)
> +                       accstat.u = readq_relaxed((void *)(lmem0 +
> +                                                 TDMA_SST_ACC_STAT));
> +
> +               if (accstat.s.error) {
> +                       data.u = readq_relaxed((void *)(lmem2 +
> +                                              TDMA_NB_INT_STAT));
> +                       filter_dbg(FERR, "%s Reading data from", __func__);
> +                       filter_dbg(FERR, "0x%0lx chunk %d failed 0x%0lx",
> +                                  (unsigned long)address, k,
> +                                  (unsigned long)data.u);
> +                       spin_unlock_bh(&pf_reg_lock);
> +                       kfree(kern_buffer);
> +                       return TNS_ERROR_INDIRECT_READ;
> +               }
> +
> +               dcnt = cnt / 2;
> +               if (cnt & 1)
> +                       dcnt++;
> +               for (i = 0, j = 0; (j < dcnt) && (i < cnt); j++) {
> +                       data.u = readq_relaxed((void *)(lmem0 +
> +                                              TNS_TDMA_SST_ACC_RDATX(j)));
> +                       dataw[i++] = data.s.lower32;
> +                       if (i < cnt)
> +                               dataw[i++] = data.s.upper32;
> +               }
> +               /* Calculate the next offset to read */
> +               acccmd_address = acccmd_address + CHUNK_SIZE;
> +               size -= CHUNK_SIZE;
> +       }
> +       spin_unlock_bh(&pf_reg_lock);
> +       return 0;
> +}
> +
> +u64 tns_read_register(u64 start, u64 offset)
> +{
> +       return readq_relaxed((void *)(start + offset));
> +}
> +
> +void tns_write_register(u64 start, u64 offset, u64 data)
> +{
> +       writeq_relaxed(data, (void *)(start + offset));
> +}
> +
> +/* Check if TNS is available. If yes return 0 else 1 */
> +int is_tns_available(void)
> +{
> +       union tns_tdma_cap tdma_cap;
> +
> +       tdma_cap.u = tns_read_register(iomem0, TNS_TDMA_CAP_OFFSET);
> +       tns_enabled = tdma_cap.s.switch_capable;
> +       /* In multi-node systems, make sure TNS should be there in both nodes */

Can't node-0 TNS work with node-0 interfaces if node-1 TNS is not detected ?

> +       if (nr_node_ids > 1) {
> +               tdma_cap.u = tns_read_register(node1_iomem0,
> +                                              TNS_TDMA_CAP_OFFSET);
> +               if (tdma_cap.s.switch_capable)
> +                       n1_tns = 1;
> +       }
> +       tns_enabled &= tdma_cap.s.switch_capable;
> +       return (!tns_enabled);
> +}
> +
> +int bist_error_check(void)
> +{
> +       int fail = 0, i;
> +       u64 bist_stat = 0;
> +
> +       for (i = 0; i < 12; i++) {
> +               bist_stat = tns_read_register(iomem0, (i * 16));
> +               if (bist_stat) {
> +                       filter_dbg(FERR, "TNS BIST%d fail 0x%llx\n",
> +                                  i, bist_stat);
> +                       fail = 1;
> +               }
> +               if (!n1_tns)
> +                       continue;
> +               bist_stat = tns_read_register(node1_iomem0, (i * 16));
> +               if (bist_stat) {
> +                       filter_dbg(FERR, "TNS(N1) BIST%d fail 0x%llx\n",
> +                                  i, bist_stat);
> +                       fail = 1;
> +               }
> +       }
> +
> +       return fail;
> +}
> +
> +int replay_indirect_trace(int node, u64 *buf_ptr, int idx)
> +{
> +       union _tns_sst_config cmd = (union _tns_sst_config)(buf_ptr[idx]);
> +       int remaining = cmd.cmd.run;
> +       u64 io_addr;
> +       int word_cnt = cmd.cmd.word_cnt;
> +       int size = (word_cnt + 1) / 2;
> +       u64 stride = word_cnt;
> +       u64 acc_cmd = cmd.copy.do_copy;
> +       u64 lmem2 = 0, lmem0 = 0;
> +       union tns_tdma_sst_acc_stat_t accstat;
> +       union tns_acc_data data;
> +
> +       if (node) {
> +               lmem0 = node1_iomem0;
> +               lmem2 = node1_iomem2;
> +       } else {
> +               lmem0 = iomem0;
> +               lmem2 = iomem2;
> +       }
> +
> +       if (word_cnt == 0) {
> +               word_cnt = 16;
> +               stride = 16;
> +               size = 8;
> +       } else {
> +               // make stride next power of 2

Please use proper commenting, have you ran checkpatch ?

> +               if (cmd.cmd.powerof2stride)
> +                       while ((stride & (stride - 1)) != 0)
> +                               stride++;
> +       }
> +       stride *= 4; //convert stride from 32-bit words to bytes
> +
> +       do {
> +               int addr_p = 1;
> +               /* extract (big endian) data from the config
> +                * into the data array
> +                */
> +               while (size > 0) {
> +                       io_addr = lmem0 + TDMA_SST_ACC_CMD + addr_p * 16;
> +                       tns_write_register(io_addr, 0, buf_ptr[idx + size]);
> +                       addr_p += 1;
> +                       size--;
> +               }
> +               tns_write_register((lmem0 + TDMA_SST_ACC_CMD), 0, acc_cmd);
> +               /* TNS Block access registers indirectly, ran memory barrier
> +                * between two writes
> +                */
> +               wmb();
> +               /* Check for completion */
> +               accstat.u = 0ULL;
> +               while (!accstat.s.cmd_done && !accstat.s.error)
> +                       accstat.u = readq_relaxed((void *)(lmem0 +
> +                                                          TDMA_SST_ACC_STAT));
> +
> +               /* Check for error, and report it */
> +               if (accstat.s.error) {
> +                       filter_dbg(FERR, "%s data from 0x%0llx failed 0x%llx\n",
> +                                  __func__, acc_cmd, accstat.u);
> +                       data.u = readq_relaxed((void *)(lmem2 +
> +                                                       TDMA_NB_INT_STAT));
> +                       filter_dbg(FERR, "Status 0x%llx\n", data.u);
> +               }
> +               /* update the address */
> +               acc_cmd += stride;
> +               size = (word_cnt + 1) / 2;
> +               usleep_range(20, 30);
> +       } while (remaining-- > 0);
> +
> +       return size;
> +}
> +
> +void replay_tns_node(int node, u64 *buf_ptr, int reg_cnt)
> +{
> +       int counter = 0;
> +       u64 offset = 0;
> +       u64 io_address;
> +       int datapathmode = 1;
> +       u64 lmem2 = 0, lmem0 = 0;
> +
> +       if (node) {
> +               lmem0 = node1_iomem0;
> +               lmem2 = node1_iomem2;
> +       } else {
> +               lmem0 = iomem0;
> +               lmem2 = iomem2;
> +       }
> +       for (counter = 0; counter < reg_cnt; counter++) {
> +               if (buf_ptr[counter] == 0xDADADADADADADADAull) {
> +                       datapathmode = 1;
> +                       continue;
> +               } else if (buf_ptr[counter] == 0xDEDEDEDEDEDEDEDEull) {
> +                       datapathmode = 0;
> +                       continue;
> +               }
> +               if (datapathmode == 1) {
> +                       if (buf_ptr[counter] >= BAR0_START &&
> +                           buf_ptr[counter] <= BAR0_END) {
> +                               offset = buf_ptr[counter] - BAR0_START;
> +                               io_address = lmem0 + offset;
> +                       } else if (buf_ptr[counter] >= BAR2_START &&
> +                                  buf_ptr[counter] <= BAR2_END) {
> +                               offset = buf_ptr[counter] - BAR2_START;
> +                               io_address = lmem2 + offset;
> +                       } else {
> +                               filter_dbg(FERR, "%s Address 0x%llx invalid\n",
> +                                          __func__, buf_ptr[counter]);
> +                               return;
> +                       }
> +
> +                       tns_write_register(io_address, 0, buf_ptr[counter + 1]);
> +                       /* TNS Block access registers indirectly, ran memory
> +                        * barrier between two writes
> +                        */
> +                       wmb();
> +                       counter += 1;
> +                       usleep_range(20, 30);
> +               } else if (datapathmode == 0) {
> +                       int sz = replay_indirect_trace(node, buf_ptr, counter);
> +
> +                       counter += sz;
> +               }
> +       }
> +}
> +
> +int alloc_table_info(int i, struct table_static_s tbl_sdata[])
> +{
> +       tbl_info[i].ddata[0].bitmap = kcalloc(BITS_TO_LONGS(tbl_sdata[i].depth),
> +                                             sizeof(uintptr_t), GFP_KERNEL);
> +       if (!tbl_info[i].ddata[0].bitmap)
> +               return 1;
> +
> +       if (!n1_tns)
> +               return 0;
> +
> +       tbl_info[i].ddata[1].bitmap = kcalloc(BITS_TO_LONGS(tbl_sdata[i].depth),
> +                                             sizeof(uintptr_t), GFP_KERNEL);
> +       if (!tbl_info[i].ddata[1].bitmap) {
> +               kfree(tbl_info[i].ddata[0].bitmap);
> +               return 1;
> +       }
> +
> +       return 0;
> +}
> +
> +void tns_replay_register_trace(const struct firmware *fw, struct device *dev)
> +{
> +       int i;
> +       int node = 0;
> +       u8 *buffer = NULL;
> +       u64 *buf_ptr = NULL;
> +       struct tns_global_st *fw_header = NULL;
> +       struct table_static_s tbl_sdata[TNS_MAX_TABLE];
> +       size_t src_len;
> +       size_t dest_len = TNS_FW_MAX_SIZE;
> +       int rc;
> +       u8 *fw2_buf = NULL;
> +       unsigned char *decomp_dest = NULL;
> +
> +       fw2_buf = (u8 *)fw->data;
> +       src_len = fw->size - 8;
> +
> +       decomp_dest = kcalloc((dest_len * 2), sizeof(char), GFP_KERNEL);
> +       if (!decomp_dest)
> +               return;
> +
> +       memset(decomp_dest, 0, (dest_len * 2));
> +       rc = lz4_decompress_unknownoutputsize(&fw2_buf[8], src_len, decomp_dest,
> +                                             &dest_len);
> +       if (rc) {
> +               filter_dbg(FERR, "Decompress Error %d\n", rc);
> +               pr_info("Uncompressed destination length %ld\n", dest_len);
> +               kfree(decomp_dest);
> +               return;
> +       }
> +       fw_header = (struct tns_global_st *)decomp_dest;
> +       buffer = (u8 *)decomp_dest;
> +
> +       filter_dbg(FINFO, "TNS Firmware version: %s Loading...\n",
> +                  fw_header->version);
> +
> +       memset(tbl_info, 0x0, sizeof(tbl_info));
> +       buf_ptr = (u64 *)(buffer + sizeof(struct tns_global_st));
> +       memcpy(tbl_sdata, fw_header->tbl_info, sizeof(fw_header->tbl_info));
> +
> +       for (i = 0; i < TNS_MAX_TABLE; i++) {
> +               if (!tbl_sdata[i].valid)
> +                       continue;
> +               memcpy(&tbl_info[i].sdata, &tbl_sdata[i],
> +                      sizeof(struct table_static_s));
> +               if (alloc_table_info(i, tbl_sdata)) {
> +                       kfree(decomp_dest);
> +                       return;
> +               }
> +       }
> +
> +       for (node = 0; node < nr_node_ids; node++)
> +               replay_tns_node(node, buf_ptr, fw_header->reg_cnt);
> +
> +       kfree(decomp_dest);
> +       release_firmware(fw);
> +}
> +
> +int tns_init(const struct firmware *fw, struct device *dev)
> +{
> +       int result = 0;
> +       int i = 0;
> +       int temp;
> +       union tns_tdma_config tdma_config;
> +       union tns_tdma_lmacx_config tdma_lmac_cfg;
> +       u64 reg_init_val;
> +
> +       spin_lock_init(&pf_reg_lock);
> +
> +       /* use two regions insted of a single big mapping to save
> +        * the kernel virtual space
> +        */
> +       iomem0 = (u64)ioremap(BAR0_START, BAR0_SIZE);
> +       if (iomem0 == 0ULL) {
> +               filter_dbg(FERR, "Node0 ioremap failed for BAR0\n");
> +               result = -EAGAIN;
> +               goto error;
> +       } else {
> +               filter_dbg(FINFO, "ioremap success for BAR0\n");
> +       }
> +
> +       if (nr_node_ids > 1) {
> +               node1_iomem0 = (u64)ioremap(NODE1_BAR0_START, NODE1_BAR0_SIZE);
> +               if (node1_iomem0 == 0ULL) {
> +                       filter_dbg(FERR, "Node1 ioremap failed for BAR0\n");
> +                       result = -EAGAIN;
> +                       goto error;
> +               } else {
> +                       filter_dbg(FINFO, "ioremap success for BAR0\n");
> +               }
> +       }
> +
> +       if (is_tns_available()) {
> +               filter_dbg(FERR, "TNS NOT AVAILABLE\n");
> +               goto error;
> +       }
> +
> +       if (bist_error_check()) {
> +               filter_dbg(FERR, "BIST ERROR CHECK FAILED");
> +               goto error;
> +       }
> +
> +       /* NIC0-BGX0 is TNS, NIC1-BGX1 is TNS, DISABLE BACKPRESSURE */

Sunil>> Why disable backpressure, if it's in TNS mode ?

> +       reg_init_val = 0ULL;
> +       pr_info("NIC Block configured in TNS/TNS mode");
> +       tns_write_register(iomem0, TNS_RDMA_CONFIG_OFFSET, reg_init_val);
> +       usleep_range(10, 20);

Sunil>> Why sleep after every register write ?

> +       if (n1_tns) {
> +               tns_write_register(node1_iomem0, TNS_RDMA_CONFIG_OFFSET,
> +                                  reg_init_val);
> +               usleep_range(10, 20);
> +       }
> +
> +       // Configure each LMAC with 512 credits in BYPASS mode
> +       for (i = TNS_MIN_LMAC; i < (TNS_MIN_LMAC + TNS_MAX_LMAC); i++) {
> +               tdma_lmac_cfg.u = 0ULL;
> +               tdma_lmac_cfg.s.fifo_cdts = 0x200;
> +               tns_write_register(iomem0, TNS_TDMA_LMACX_CONFIG_OFFSET(i),
> +                                  tdma_lmac_cfg.u);
> +               usleep_range(10, 20);
> +               if (n1_tns) {
> +                       tns_write_register(node1_iomem0,
> +                                          TNS_TDMA_LMACX_CONFIG_OFFSET(i),
> +                                          tdma_lmac_cfg.u);
> +                       usleep_range(10, 20);
> +               }
> +       }
> +
> +       //ENABLE TNS CLOCK AND CSR READS
> +       temp = tns_read_register(iomem0, TNS_TDMA_CONFIG_OFFSET);
> +       tdma_config.u = temp;
> +       tdma_config.s.clk_2x_ena = 1;
> +       tdma_config.s.clk_ena = 1;
> +       tns_write_register(iomem0, TNS_TDMA_CONFIG_OFFSET, tdma_config.u);
> +       if (n1_tns)
> +               tns_write_register(node1_iomem0, TNS_TDMA_CONFIG_OFFSET,
> +                                  tdma_config.u);
> +
> +       temp = tns_read_register(iomem0, TNS_TDMA_CONFIG_OFFSET);
> +       tdma_config.u = temp;
> +       tdma_config.s.csr_access_ena = 1;
> +       tns_write_register(iomem0, TNS_TDMA_CONFIG_OFFSET, tdma_config.u);
> +       if (n1_tns)
> +               tns_write_register(node1_iomem0, TNS_TDMA_CONFIG_OFFSET,
> +                                  tdma_config.u);
> +
> +       reg_init_val = 0ULL;
> +       tns_write_register(iomem0, TNS_TDMA_RESET_CTL_OFFSET, reg_init_val);
> +       if (n1_tns)
> +               tns_write_register(node1_iomem0, TNS_TDMA_RESET_CTL_OFFSET,
> +                                  reg_init_val);
> +
> +       iomem2 = (u64)ioremap(BAR2_START, BAR2_SIZE);
> +       if (iomem2 == 0ULL) {
> +               filter_dbg(FERR, "ioremap failed for BAR2\n");
> +               result = -EAGAIN;
> +               goto error;
> +       } else {
> +               filter_dbg(FINFO, "ioremap success for BAR2\n");
> +       }
> +
> +       if (n1_tns) {
> +               node1_iomem2 = (u64)ioremap(NODE1_BAR2_START, NODE1_BAR2_SIZE);
> +               if (node1_iomem2 == 0ULL) {
> +                       filter_dbg(FERR, "Node1 ioremap failed for BAR2\n");
> +                       result = -EAGAIN;
> +                       goto error;
> +               } else {
> +                       filter_dbg(FINFO, "Node1 ioremap success for BAR2\n");
> +               }
> +       }
> +       msleep(1000);
> +       //We will replay register trace to initialize TNS block
> +       tns_replay_register_trace(fw, dev);
> +
> +       return 0;
> +error:
> +       if (iomem0 != 0)
> +               iounmap((void *)iomem0);
> +       if (iomem2 != 0)
> +               iounmap((void *)iomem2);
> +
> +       if (node1_iomem0 != 0)
> +               iounmap((void *)node1_iomem0);
> +       if (node1_iomem2 != 0)
> +               iounmap((void *)node1_iomem2);
> +
> +       return result;
> +}
> +
> +void tns_exit(void)
> +{
> +       int i;
> +
> +       if (iomem0 != 0)
> +               iounmap((void *)iomem0);
> +       if (iomem2 != 0)
> +               iounmap((void *)iomem2);
> +
> +       if (node1_iomem0 != 0)
> +               iounmap((void *)node1_iomem0);
> +       if (node1_iomem2 != 0)
> +               iounmap((void *)node1_iomem2);
> +
> +       for (i = 0; i < TNS_MAX_TABLE; i++) {
> +               if (!tbl_info[i].sdata.valid)
> +                       continue;
> +               kfree(tbl_info[i].ddata[0].bitmap);
> +               kfree(tbl_info[i].ddata[n1_tns].bitmap);
> +       }
> +}
> --
> 1.8.3.1
>

^ permalink raw reply

* Re: at86rf230: Allow slow GPIO pins for "rstn"
From: Nikita Yushchenko @ 2016-12-21 12:30 UTC (permalink / raw)
  To: linux-wpan
  Cc: Andrey Smirnov, Alexander Aring, netdev, linux-kernel,
	Chris Healy, Nikita Yushchenko
In-Reply-To: <1482103533-13187-1-git-send-email-andrew.smirnov@gmail.com>

> Driver code never touches "rstn" signal in atomic context, so there's
> no need to implicitly put such restriction on it by using gpio_set_value
> to manipulate it. Replace gpio_set_value to gpio_set_value_cansleep to
> fix that.
> 
> As a an example of where such restriction might be inconvenient,
> consider a hardware design where "rstn" is connected to a pin of I2C/SPI
> GPIO expander chip.
> 
> Cc: Chris Healy <cphealy@gmail.com>
> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>

Reviewed-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>

^ permalink raw reply

* Re: ipv6: handle -EFAULT from skb_copy_bits
From: Hannes Frederic Sowa @ 2016-12-21 12:27 UTC (permalink / raw)
  To: Cong Wang, Dave Jones; +Cc: David Miller, Linux Kernel Network Developers
In-Reply-To: <CAM_iQpXWCWMKDjLVTLuHNSKKGq5MWf5ZRLpyE=PDD3EgQ7PS4Q@mail.gmail.com>

On Tue, 2016-12-20 at 22:09 -0800, Cong Wang wrote:
> On Tue, Dec 20, 2016 at 2:12 PM, Dave Jones <davej@codemonkey.org.uk> wrote:
> >         fd = socket(AF_INET6, SOCK_RAW, 7);
> > 
> >         setsockopt(fd, SOL_IPV6, IPV6_CHECKSUM, &zero, 4);
> >         setsockopt(fd, SOL_IPV6, IPV6_DSTOPTS, &buf, LEN);
> > 
> 
> Interesting, you set the checksum offset to be 0, but the packet size
> is actually 49, transport header is located at offset 48, so apparently
> the packet doesn't have room for a 16bit checksum after network header.
> 
> Your original patch seems reasonable to me, unless there is some
> check in __ip6_append_data() which is supposed to catch this, but
> CHECKSUM is specific to raw socket only.

The calculation of total_len is wrong here:

	total_len = inet_sk(sk)->cork.base.length;
	if (offset >= total_len - 1) {
		err = -EINVAL;
		ip6_flush_pending_frames(sk);
		goto out;
	}


At least for this bug to fix we need to subtract the extension header
length after the fragmentation header, so:

--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -536,6 +536,17 @@ static int rawv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
        goto out;
 }
 
+static unsigned int raw6_corked_transport_len(const struct sock *sk)
+{
+       unsigned int len = inet_sk(sk)->cork.base.length;
+       struct ipv6_txoptions *opt = inet6_sk(sk)->cork.opt;
+
+       if (likely(!opt))
+               return len;
+
+       return len - opt->opt_flen;
+}
+
 static int rawv6_push_pending_frames(struct sock *sk, struct flowi6 *fl6,
                                     struct raw6_sock *rp)
 {
@@ -543,7 +554,7 @@ static int rawv6_push_pending_frames(struct sock *sk, struct flowi6 *fl6,
        int err = 0;
        int offset;
        int len;
-       int total_len;
+       int transport_len;
        __wsum tmp_csum;
        __sum16 csum;
 
@@ -555,8 +566,8 @@ static int rawv6_push_pending_frames(struct sock *sk, struct flowi6 *fl6,
                goto out;
 
        offset = rp->offset;
-       total_len = inet_sk(sk)->cork.base.length;
-       if (offset >= total_len - 1) {
+       transport_len = raw6_corked_transport_len(sk);
+       if (offset >= transport_len - 1) {
                err = -EINVAL;
                ip6_flush_pending_frames(sk);
                goto out;
@@ -598,7 +609,7 @@ static int rawv6_push_pending_frames(struct sock *sk, struct flowi6 *fl6,
                tmp_csum = csum_sub(tmp_csum, csum_unfold(csum));
 
        csum = csum_ipv6_magic(&fl6->saddr, &fl6->daddr,
-                              total_len, fl6->flowi6_proto, tmp_csum);
+                              transport_len, fl6->flowi6_proto, tmp_csum);
 
        if (csum == 0 && fl6->flowi6_proto == IPPROTO_UDP)
                csum = CSUM_MANGLED_0;

^ permalink raw reply

* Re: [PATCH v3] net: macb: Added PCI wrapper for Platform Driver.
From: Andy Shevchenko @ 2016-12-21 12:22 UTC (permalink / raw)
  To: Bartosz Folta
  Cc: Nicolas Ferre, David S. Miller, Niklas Cassel, Alexandre Torgue,
	Satanand Burla, Raghu Vatsavayi, Simon Horman,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	Rafal Ozieblo
In-Reply-To: <CAHp75VcD8uCoiXiMV2f7OrPUDysGQX_VFZM+9+1pzRSNkT883Q@mail.gmail.com>

On Wed, Dec 21, 2016 at 2:16 PM, Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> On Wed, Dec 21, 2016 at 1:53 PM, Bartosz Folta <bfolta@cadence.com> wrote:

>> I think that there is a delay on some mailing lists (marc.info and
>> mail-archive.com). Latest messages I found there are from December
>> 16th 2016.
>
> You may refer to
> https://www.spinics.net/lists/netdev/msg410864.html

And one more

+       plat_info.dma_mask = DMA_BIT_MASK(32);

Perhaps
 plat_info.dma_mask = pdev->dma_mask;
?

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox