Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH RFC net-next 3/7] sctp: add dst_pending_confirm flag
From: Neil Horman @ 2016-12-21 14:11 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: netdev, linux-sctp, YueHaibing
In-Reply-To: <1482094596-26046-4-git-send-email-ja@ssi.bg>

On Sun, Dec 18, 2016 at 10:56:32PM +0200, Julian Anastasov wrote:
> Add new transport flag to allow sockets to confirm neighbour.
> When same struct dst_entry can be used for many different
> neighbours we can not use it for pending confirmations.
> The flag is propagated from transport to every packet.
> It is reset when cached dst is reset.
> 
> Reported-by: YueHaibing <yuehaibing@huawei.com>
> Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.")
> Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.")
> Signed-off-by: Julian Anastasov <ja@ssi.bg>
> ---
>  include/net/sctp/sctp.h    |  6 ++----
>  include/net/sctp/structs.h |  4 ++++
>  net/sctp/associola.c       |  3 +--
>  net/sctp/output.c          | 10 +++++++++-
>  net/sctp/outqueue.c        |  2 +-
>  net/sctp/sm_make_chunk.c   |  6 ++----
>  net/sctp/sm_sideeffect.c   |  2 +-
>  net/sctp/socket.c          |  4 ++--
>  net/sctp/transport.c       | 18 +++++++++++++++++-
>  9 files changed, 39 insertions(+), 16 deletions(-)
> 
> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
> index f0dcaeb..85d9083 100644
> --- a/include/net/sctp/sctp.h
> +++ b/include/net/sctp/sctp.h
> @@ -586,10 +586,8 @@ static inline void sctp_v4_map_v6(union sctp_addr *addr)
>   */
>  static inline struct dst_entry *sctp_transport_dst_check(struct sctp_transport *t)
>  {
> -	if (t->dst && !dst_check(t->dst, t->dst_cookie)) {
> -		dst_release(t->dst);
> -		t->dst = NULL;
> -	}
> +	if (t->dst && !dst_check(t->dst, t->dst_cookie))
> +		sctp_transport_dst_release(t);
>  
>  	return t->dst;
>  }
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index 92daabd..e842e84 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -838,6 +838,8 @@ struct sctp_transport {
>  
>  	__u32 burst_limited;	/* Holds old cwnd when max.burst is applied */
>  
> +	__u32 dst_pending_confirm;	/* need to confirm neighbour */
> +
>  	/* Destination */
>  	struct dst_entry *dst;
>  	/* Source address. */
> @@ -980,6 +982,8 @@ void sctp_transport_route(struct sctp_transport *, union sctp_addr *,
>  void sctp_transport_reset(struct sctp_transport *);
>  void sctp_transport_update_pmtu(struct sock *, struct sctp_transport *, u32);
>  void sctp_transport_immediate_rtx(struct sctp_transport *);
> +void sctp_transport_dst_release(struct sctp_transport *t);
> +void sctp_transport_dst_confirm(struct sctp_transport *t);
>  
>  
>  /* This is the structure we use to queue packets as they come into
> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> index 68428e1..7bd26e0 100644
> --- a/net/sctp/associola.c
> +++ b/net/sctp/associola.c
> @@ -820,8 +820,7 @@ void sctp_assoc_control_transport(struct sctp_association *asoc,
>  		if (transport->state != SCTP_UNCONFIRMED)
>  			transport->state = SCTP_INACTIVE;
>  		else {
> -			dst_release(transport->dst);
> -			transport->dst = NULL;
> +			sctp_transport_dst_release(transport);
>  			ulp_notify = false;
>  		}
>  
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index f5320a8..4684a00 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -550,6 +550,7 @@ int sctp_packet_transmit(struct sctp_packet *packet, gfp_t gfp)
>  	struct sctp_association *asoc = tp->asoc;
>  	struct sctp_chunk *chunk, *tmp;
>  	int pkt_count, gso = 0;
> +	int confirm;
>  	struct dst_entry *dst;
>  	struct sk_buff *head;
>  	struct sctphdr *sh;
> @@ -628,7 +629,14 @@ int sctp_packet_transmit(struct sctp_packet *packet, gfp_t gfp)
>  			asoc->peer.last_sent_to = tp;
>  	}
>  	head->ignore_df = packet->ipfragok;
> -	tp->af_specific->sctp_xmit(head, tp);
> +	confirm = tp->dst_pending_confirm;
> +	if (confirm)
> +		head->dst_pending_confirm = 1;
Why not use your confirm helper function here?

> +	/* neighbour should be confirmed on successful transmission or
> +	 * positive error
> +	 */
> +	if (tp->af_specific->sctp_xmit(head, tp) >= 0 && confirm)
> +		tp->dst_pending_confirm = 0;
>  
>  out:
>  	list_for_each_entry_safe(chunk, tmp, &packet->chunk_list, list) {
> diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
> index e540826..8234799 100644
> --- a/net/sctp/outqueue.c
> +++ b/net/sctp/outqueue.c
> @@ -1641,7 +1641,7 @@ static void sctp_check_transmitted(struct sctp_outq *q,
>  
>  		if (forward_progress) {
>  			if (transport->dst)
> -				dst_confirm(transport->dst);
> +				sctp_transport_dst_confirm(transport);
>  		}
>  	}
>  
> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
> index 9e9690b..6fb15bf 100644
> --- a/net/sctp/sm_make_chunk.c
> +++ b/net/sctp/sm_make_chunk.c
> @@ -3317,8 +3317,7 @@ static void sctp_asconf_param_success(struct sctp_association *asoc,
>  		local_bh_enable();
>  		list_for_each_entry(transport, &asoc->peer.transport_addr_list,
>  				transports) {
> -			dst_release(transport->dst);
> -			transport->dst = NULL;
> +			sctp_transport_dst_release(transport);
>  		}
>  		break;
>  	case SCTP_PARAM_DEL_IP:
> @@ -3332,8 +3331,7 @@ static void sctp_asconf_param_success(struct sctp_association *asoc,
>  		local_bh_enable();
>  		list_for_each_entry(transport, &asoc->peer.transport_addr_list,
>  				transports) {
> -			dst_release(transport->dst);
> -			transport->dst = NULL;
> +			sctp_transport_dst_release(transport);
>  		}
>  		break;
>  	default:
> diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
> index c345bf1..9255b29 100644
> --- a/net/sctp/sm_sideeffect.c
> +++ b/net/sctp/sm_sideeffect.c
> @@ -723,7 +723,7 @@ static void sctp_cmd_transport_on(sctp_cmd_seq_t *cmds,
>  	 * forward progress.
>  	 */
>  	if (t->dst)
> -		dst_confirm(t->dst);
> +		sctp_transport_dst_confirm(t);
>  
>  	/* The receiver of the HEARTBEAT ACK should also perform an
>  	 * RTT measurement for that destination transport address
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 318c678..9ee676a 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -588,7 +588,7 @@ static int sctp_send_asconf_add_ip(struct sock		*sk,
>  			list_for_each_entry(trans,
>  			    &asoc->peer.transport_addr_list, transports) {
>  				/* Clear the source and route cache */
> -				dst_release(trans->dst);
> +				sctp_transport_dst_release(trans);
>  				trans->cwnd = min(4*asoc->pathmtu, max_t(__u32,
>  				    2*asoc->pathmtu, 4380));
>  				trans->ssthresh = asoc->peer.i.a_rwnd;
> @@ -839,7 +839,7 @@ static int sctp_send_asconf_del_ip(struct sock		*sk,
>  		 */
>  		list_for_each_entry(transport, &asoc->peer.transport_addr_list,
>  					transports) {
> -			dst_release(transport->dst);
> +			sctp_transport_dst_release(transport);
>  			sctp_transport_route(transport, NULL,
>  					     sctp_sk(asoc->base.sk));
>  		}
> diff --git a/net/sctp/transport.c b/net/sctp/transport.c
> index ce54dce..db66514 100644
> --- a/net/sctp/transport.c
> +++ b/net/sctp/transport.c
> @@ -227,7 +227,7 @@ void sctp_transport_pmtu(struct sctp_transport *transport, struct sock *sk)
>  {
>  	/* If we don't have a fresh route, look one up */
>  	if (!transport->dst || transport->dst->obsolete) {
> -		dst_release(transport->dst);
> +		sctp_transport_dst_release(transport);
>  		transport->af_specific->get_dst(transport, &transport->saddr,
>  						&transport->fl, sk);
>  	}
> @@ -659,3 +659,19 @@ void sctp_transport_immediate_rtx(struct sctp_transport *t)
>  			sctp_transport_hold(t);
>  	}
>  }
> +
> +/* Drop dst */
> +void sctp_transport_dst_release(struct sctp_transport *t)
> +{
> +	dst_release(t->dst);
> +	t->dst = NULL;
> +	t->dst_pending_confirm = 0;
> +}
> +
> +/* Schedule neighbour confirm */
> +void sctp_transport_dst_confirm(struct sctp_transport *t)
> +{
> +	if (!t->dst_pending_confirm)
> +		t->dst_pending_confirm = 1;
I'm not sure why you check it for being zero before setting it here, just set it
unilaterally instead, or are you trying to avoid dirtying a cacheline?

Neil

> +}
> +
> -- 
> 1.9.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

* Re: [PATCHv2 net 1/2] sctp: reduce indent level in sctp_copy_local_addr_list
From: Neil Horman @ 2016-12-21 14:16 UTC (permalink / raw)
  To: Xin Long; +Cc: network dev, linux-sctp, davem, Marcelo Ricardo Leitner
In-Reply-To: <b4d6428c565955e9e81e3b4623a6458e7567e593.1482212764.git.lucien.xin@gmail.com>

On Tue, Dec 20, 2016 at 01:49:49PM +0800, Xin Long wrote:
> This patch is to reduce indent level by using continue when the addr
> is not allowed, and also drop end_copy by using break.
> 
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---
>  net/sctp/protocol.c | 37 +++++++++++++++++++------------------
>  1 file changed, 19 insertions(+), 18 deletions(-)
> 
> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> index 7b523e3..da5d82b 100644
> --- a/net/sctp/protocol.c
> +++ b/net/sctp/protocol.c
> @@ -205,26 +205,27 @@ int sctp_copy_local_addr_list(struct net *net, struct sctp_bind_addr *bp,
>  	list_for_each_entry_rcu(addr, &net->sctp.local_addr_list, list) {
>  		if (!addr->valid)
>  			continue;
> -		if (sctp_in_scope(net, &addr->a, scope)) {
> -			/* Now that the address is in scope, check to see if
> -			 * the address type is really supported by the local
> -			 * sock as well as the remote peer.
> -			 */
> -			if ((((AF_INET == addr->a.sa.sa_family) &&
> -			      (copy_flags & SCTP_ADDR4_PEERSUPP))) ||
> -			    (((AF_INET6 == addr->a.sa.sa_family) &&
> -			      (copy_flags & SCTP_ADDR6_ALLOWED) &&
> -			      (copy_flags & SCTP_ADDR6_PEERSUPP)))) {
> -				error = sctp_add_bind_addr(bp, &addr->a,
> -						    sizeof(addr->a),
> -						    SCTP_ADDR_SRC, GFP_ATOMIC);
> -				if (error)
> -					goto end_copy;
> -			}
> -		}
> +		if (!sctp_in_scope(net, &addr->a, scope))
> +			continue;
> +
> +		/* Now that the address is in scope, check to see if
> +		 * the address type is really supported by the local
> +		 * sock as well as the remote peer.
> +		 */
> +		if (addr->a.sa.sa_family == AF_INET &&
> +		    !(copy_flags & SCTP_ADDR4_PEERSUPP))
> +			continue;
> +		if (addr->a.sa.sa_family == AF_INET6 &&
> +		    (!(copy_flags & SCTP_ADDR6_ALLOWED) ||
> +		     !(copy_flags & SCTP_ADDR6_PEERSUPP)))
> +			continue;
> +
> +		error = sctp_add_bind_addr(bp, &addr->a, sizeof(addr->a),
> +					   SCTP_ADDR_SRC, GFP_ATOMIC);
> +		if (error)
> +			break;
>  	}
>  
> -end_copy:
>  	rcu_read_unlock();
>  	return error;
>  }
> -- 
> 2.1.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
Series
Acked-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply

* Re: HalfSipHash Acceptable Usage
From: Jason A. Donenfeld @ 2016-12-21 14:24 UTC (permalink / raw)
  To: George Spelvin
  Cc: Eric Dumazet, Andi Kleen, David Miller, David Laight,
	Daniel J . Bernstein, Eric Biggers, Hannes Frederic Sowa,
	Jean-Philippe Aumasson, kernel-hardening,
	Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
	Tom Herbert, Linus Torvalds, Theodore Ts'o, Vegard Nossum
In-Reply-To: <20161221063412.6425.qmail@ns.sciencehorizons.net>

Hi George,

On Wed, Dec 21, 2016 at 7:34 AM, George Spelvin
<linux@sciencehorizons.net> wrote:
> In fact, I have an idea.  Allow me to make the following concrete
> suggestion for using HalfSipHash with 128 bits of key material:
>
> - 64 bits are used as the key.
> - The other 64 bits are used as an IV which is prepended to
>   the message to be hashed.
>
> As a matter of practical implementation, we precompute the effect
> of hashing the IV and store the 128-bit HalfSipHash state, which
> is used just like a 128-bit key.
>
> Because of the way it is constructed, it is obviously no weaker than
> standard HalfSipHash's 64-bit security claim.
>
> I don't know the security of this, and it's almost certainly weaker than
> 128 bits, but I *hope* it's at least a few bits stronger than 64 bits.
> 80 would be enough to dissuade any attacker without a six-figure budget
> (that's per attack, not a one-time capital investment).  96 would be
> ample for our purposes.
>
> What I do know is that it makes a brute-force attack without
> significant cryptanalytic effort impossible.

Depends who's doing the cryptanalytic effort I guess.

Please don't roll your own crypto. It's a dangerous road. Putting
homebrew crypto into the kernel would be an error. Let's stick with
the constructions and security margins that the cryptographers give
us. JP made that fairly clear, I thought.

There are already people working on this problem who undergo peer
review and a career devoted to solving these problems. One result for
small systems that need 128-bit security is Chaskey, which you can go
read about if you're curious.

Jason

^ permalink raw reply

* Re: [PATCH][V2] qed: fix memory leak of a qed_spq_entry on error failure paths
From: Colin Ian King @ 2016-12-21 14:28 UTC (permalink / raw)
  To: Mintz, Yuval, netdev@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, Elior, Ariel, Tayar, Tomer
In-Reply-To: <BL2PR07MB230615AAE3CF67DEBC6083778D930@BL2PR07MB2306.namprd07.prod.outlook.com>

On 21/12/16 13:29, Mintz, Yuval wrote:
>> From: Colin Ian King <colin.king@canonical.com>
>>
>> A qed_spq_entry entry is allocated by qed_sp_init_request but is not kfree'd
>> if an error occurs, causing a memory leak. Fix this by returning the previously
>> allocated spq entry and also setting *pp_ent to NULL to be safe.
>>
>> Thanks to Yuval Mintz for suggestions on how to improve my original fix.
>>
>> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> 
> We've given it a more thorough look, and apparently this isn't the correct fix.
> So I'll start by saying sorry for making you send this V2 needlessly.
> 
> It boils down to the fact there are two kinds of SPQ entries -
> Those originating from the 'free_pool' and those from the 'unlimited_pending'.
> Only those originating from the free_pool should be returned
> using the qed_spq_return_entry(), as only those actually point to a valid
> dma-mapped memory where FW expects to find the entries;
> Returning the other kind would lead to assertions later,
> as driver would post a ramrod to FW which actually points to address 0.
> 
> Looking at the error flows, it seems possible this isn't the only faulty
> error flow in the SPQ. I suggest you'd drop this and we'll take it from
> here [although if you really have the urge to continue - please do].
> 
> Thanks,
> Yuval
> 
> 
Sure, lets drop my fixes, I'm out of time on this for 2016 anyhow.

Colin

^ permalink raw reply

* Re: [PATCH 2/2] net: sfc: falcon: use new api ethtool_{get|set}_link_ksettings
From: Bert Kenward @ 2016-12-21 14:38 UTC (permalink / raw)
  To: Philippe Reynes, linux-net-drivers, ecree, davem, andrew
  Cc: netdev, linux-kernel
In-Reply-To: <1482272667-1206-2-git-send-email-tremyfr@gmail.com>

On 20/12/16 22:24, Philippe Reynes wrote:
> The ethtool api {get|set}_settings is deprecated.
> We move this driver to new api {get|set}_link_ksettings.
> 
> Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
> ---
>  drivers/net/ethernet/sfc/falcon/efx.c          |    2 +-
>  drivers/net/ethernet/sfc/falcon/ethtool.c      |   35 ++++++++++++-------
>  drivers/net/ethernet/sfc/falcon/mdio_10g.c     |   44 +++++++++++++++---------
>  drivers/net/ethernet/sfc/falcon/mdio_10g.h     |    3 +-
>  drivers/net/ethernet/sfc/falcon/net_driver.h   |   12 +++---
>  drivers/net/ethernet/sfc/falcon/qt202x_phy.c   |    9 +++--
>  drivers/net/ethernet/sfc/falcon/tenxpress.c    |   22 ++++++------
>  drivers/net/ethernet/sfc/falcon/txc43128_phy.c |    9 +++--
>  8 files changed, 80 insertions(+), 56 deletions(-)
> 
> diff --git a/drivers/net/ethernet/sfc/falcon/efx.c b/drivers/net/ethernet/sfc/falcon/efx.c
> index 5c5cb3c..438ef9e 100644
> --- a/drivers/net/ethernet/sfc/falcon/efx.c
> +++ b/drivers/net/ethernet/sfc/falcon/efx.c
> @@ -986,7 +986,7 @@ void ef4_mac_reconfigure(struct ef4_nic *efx)
>  
>  /* Push loopback/power/transmit disable settings to the PHY, and reconfigure
>   * the MAC appropriately. All other PHY configuration changes are pushed
> - * through phy_op->set_settings(), and pushed asynchronously to the MAC
> + * through phy_op->set_link_ksettings(), and pushed asynchronously to the MAC
>   * through ef4_monitor().
>   *
>   * Callers must hold the mac_lock
> diff --git a/drivers/net/ethernet/sfc/falcon/ethtool.c b/drivers/net/ethernet/sfc/falcon/ethtool.c
> index 8e1929b..659ece7 100644
> --- a/drivers/net/ethernet/sfc/falcon/ethtool.c
> +++ b/drivers/net/ethernet/sfc/falcon/ethtool.c
> @@ -115,44 +115,53 @@ static int ef4_ethtool_phys_id(struct net_device *net_dev,
>  }
>  
>  /* This must be called with rtnl_lock held. */
> -static int ef4_ethtool_get_settings(struct net_device *net_dev,
> -				    struct ethtool_cmd *ecmd)
> +static int
> +ef4_ethtool_get_link_ksettings(struct net_device *net_dev,
> +			       struct ethtool_link_ksettings *cmd)
>  {
>  	struct ef4_nic *efx = netdev_priv(net_dev);
>  	struct ef4_link_state *link_state = &efx->link_state;
> +	u32 supported;
> +
> +	ethtool_convert_link_mode_to_legacy_u32(&supported,
> +						cmd->link_modes.supported);
>  
>  	mutex_lock(&efx->mac_lock);
> -	efx->phy_op->get_settings(efx, ecmd);
> +	efx->phy_op->get_link_ksettings(efx, cmd);
>  	mutex_unlock(&efx->mac_lock);
>  
>  	/* Both MACs support pause frames (bidirectional and respond-only) */
> -	ecmd->supported |= SUPPORTED_Pause | SUPPORTED_Asym_Pause;
> +	supported |= SUPPORTED_Pause | SUPPORTED_Asym_Pause;
>  
>  	if (LOOPBACK_INTERNAL(efx)) {
> -		ethtool_cmd_speed_set(ecmd, link_state->speed);
> -		ecmd->duplex = link_state->fd ? DUPLEX_FULL : DUPLEX_HALF;
> +		cmd->base.speed = link_state->speed;
> +		cmd->base.duplex = link_state->fd ? DUPLEX_FULL : DUPLEX_HALF;
>  	}
>  
> +	ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.supported,
> +						supported);
> +
>  	return 0;
>  }

This translates cmd->link_modes.supported in to the legacy supported
bitmap before we've actually read the values from the phy. Given we're
only using the legacy bitmap so we can add the pause frame settings, I
suggest something like:

ef4_ethtool_get_link_ksettings(struct net_device *net_dev,
			       struct ethtool_link_ksettings *cmd)
{
	struct ef4_nic *efx = netdev_priv(net_dev);
	struct ef4_link_state *link_state = &efx->link_state;

	mutex_lock(&efx->mac_lock);
	efx->phy_op->get_link_ksettings(efx, cmd);
	mutex_unlock(&efx->mac_lock);

	/* Both MACs support pause frames (bidirectional and respond-only) */
	ethtool_link_ksettings_add_link_mode(cmd, supported, Pause);
	ethtool_link_ksettings_add_link_mode(cmd, supported, Asym_Pause);

	if (LOOPBACK_INTERNAL(efx)) {
		cmd->base.speed = link_state->speed;
		cmd->base.duplex = link_state->fd ? DUPLEX_FULL : DUPLEX_HALF;
	}

	return 0;
}


Thanks,

Bert.

^ permalink raw reply

* Re: HalfSipHash Acceptable Usage
From: Jason A. Donenfeld @ 2016-12-21 14:42 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: George Spelvin, Theodore Ts'o, Andi Kleen, David Miller,
	David Laight, Daniel J . Bernstein, Eric Biggers,
	Hannes Frederic Sowa, Jean-Philippe Aumasson, kernel-hardening,
	Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
	Tom Herbert, Linus Torvalds, Vegard Nossum
In-Reply-To: <1482298164.8944.8.camel@edumazet-glaptop3.roam.corp.google.com>

Hi Eric,

I computed performance numbers for both 32-bit and 64-bit using the
actual functions in which talking about replacing MD5 with SipHash.
The basic harness is here [1] if you're curious. SipHash was a pretty
clear winner for both cases.

x86_64:
[    1.714302] secure_tcpv6_sequence_number_md5# cycles: 102373398
[    1.747685] secure_tcp_sequence_number_md5# cycles: 92042258
[    1.773522] secure_tcpv6_sequence_number_siphash# cycles: 70786533
[    1.798701] secure_tcp_sequence_number_siphash# cycles: 68941043

x86:
[    1.635749] secure_tcpv6_sequence_number_md5# cycles: 106016335
[    1.670259] secure_tcp_sequence_number_md5# cycles: 95670512
[    1.708387] secure_tcpv6_sequence_number_siphash# cycles: 105988635
[    1.740264] secure_tcp_sequence_number_siphash# cycles: 88225395

>>> 102373398 > 70786533
True
>>> 92042258 > 68941043
True
>>> 106016335 > 105988635
True
>>> 95670512 > 88225395
True

While MD5 is probably faster for some kind of large-data
cycles-per-byte, due to its 64-byte internal state, SipHash -- the
"Sip" part standing "Short Input PRF" -- is fast for shorter inputs.
In practice with the functions we're talking about replacing, there's
no need to hash 64-bytes. So, SipHash comes out faster and more
secure.

I also haven't begun to look focusedly at the assembly my SipHash
implemention is generating, which means there's still window for even
more performance improvements.

Jason

[1] https://git.zx2c4.com/linux-dev/tree/net/core/secure_seq.c?h=siphash-bench#n194

^ permalink raw reply

* RE: [PATCH][V2] qed: fix memory leak of a qed_spq_entry on error failure paths
From: Mintz, Yuval @ 2016-12-21 13:29 UTC (permalink / raw)
  To: Colin King, netdev@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, Elior, Ariel, Tayar, Tomer
In-Reply-To: <20161220114424.11244-1-colin.king@canonical.com>

> From: Colin Ian King <colin.king@canonical.com>
> 
> A qed_spq_entry entry is allocated by qed_sp_init_request but is not kfree'd
> if an error occurs, causing a memory leak. Fix this by returning the previously
> allocated spq entry and also setting *pp_ent to NULL to be safe.
> 
> Thanks to Yuval Mintz for suggestions on how to improve my original fix.
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

We've given it a more thorough look, and apparently this isn't the correct fix.
So I'll start by saying sorry for making you send this V2 needlessly.

It boils down to the fact there are two kinds of SPQ entries -
Those originating from the 'free_pool' and those from the 'unlimited_pending'.
Only those originating from the free_pool should be returned
using the qed_spq_return_entry(), as only those actually point to a valid
dma-mapped memory where FW expects to find the entries;
Returning the other kind would lead to assertions later,
as driver would post a ramrod to FW which actually points to address 0.

Looking at the error flows, it seems possible this isn't the only faulty
error flow in the SPQ. I suggest you'd drop this and we'll take it from
here [although if you really have the urge to continue - please do].

Thanks,
Yuval

^ permalink raw reply

* Re: [PATCH 1/5 net-next] inet: replace ->bind_conflict with ->rcv_saddr_equal
From: Hannes Frederic Sowa @ 2016-12-21 15:06 UTC (permalink / raw)
  To: Josef Bacik, davem, kraigatgoog, eric.dumazet, tom, netdev,
	kernel-team
In-Reply-To: <1482264424-15439-2-git-send-email-jbacik@fb.com>

On Tue, 2016-12-20 at 15:07 -0500, Josef Bacik wrote:
> The only difference between inet6_csk_bind_conflict and inet_csk_bind_conflict
> is how they check the rcv_saddr.  Since we want to be able to check the saddr in
> other places just drop the protocol specific ->bind_conflict and replace it with
> ->rcv_saddr_equal, then make inet_csk_bind_conflict the one true bind conflict
> function.
> 
> Signed-off-by: Josef Bacik <jbacik@fb.com>
> 



> ---
>  include/net/inet6_connection_sock.h |  5 -----
>  include/net/inet_connection_sock.h  |  9 +++------
>  net/dccp/ipv4.c                     |  3 ++-
>  net/dccp/ipv6.c                     |  2 +-
>  net/ipv4/inet_connection_sock.c     | 22 +++++++-------------
>  net/ipv4/tcp_ipv4.c                 |  3 ++-
>  net/ipv4/udp.c                      |  1 +
>  net/ipv6/inet6_connection_sock.c    | 40 -------------------------------------
>  net/ipv6/tcp_ipv6.c                 |  4 ++--
>  9 files changed, 18 insertions(+), 71 deletions(-)
> 
> diff --git a/include/net/inet6_connection_sock.h b/include/net/inet6_connection_sock.h
> index 3212b39..8ec87b6 100644
> --- a/include/net/inet6_connection_sock.h
> +++ b/include/net/inet6_connection_sock.h
> @@ -15,16 +15,11 @@
>  
>  #include <linux/types.h>
>  
> -struct inet_bind_bucket;
>  struct request_sock;
>  struct sk_buff;
>  struct sock;
>  struct sockaddr;
>  
> -int inet6_csk_bind_conflict(const struct sock *sk,
> -			    const struct inet_bind_bucket *tb, bool relax,
> -			    bool soreuseport_ok);
> -
>  struct dst_entry *inet6_csk_route_req(const struct sock *sk, struct flowi6 *fl6,
>  				      const struct request_sock *req, u8 proto);
>  
> diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
> index ec0479a..9cd43c5 100644
> --- a/include/net/inet_connection_sock.h
> +++ b/include/net/inet_connection_sock.h
> @@ -62,9 +62,9 @@ struct inet_connection_sock_af_ops {
>  				char __user *optval, int __user *optlen);
>  #endif
>  	void	    (*addr2sockaddr)(struct sock *sk, struct sockaddr *);
> -	int	    (*bind_conflict)(const struct sock *sk,
> -				     const struct inet_bind_bucket *tb,
> -				     bool relax, bool soreuseport_ok);
> +	int         (*rcv_saddr_equal)(const struct sock *sk1,
> +				       const struct sock *sk2,
> +				       bool match_wildcard);
>  	void	    (*mtu_reduced)(struct sock *sk);
>  };
>  
> 

The patch looks as a nice code cleanup already!

Have you looked if we can simply have one rcv_saddr_equal for both ipv4
and ipv6 that e.g. uses sk->sk_family instead of function pointers?
This could give us even more possibilities to remove some indirect
functions calls and thus might relieve some cycles?

Thanks,
Hannes

^ permalink raw reply

* Re: [PATCH 3/5 net-next] inet: don't check for bind conflicts twice when searching for a port
From: Hannes Frederic Sowa @ 2016-12-21 15:08 UTC (permalink / raw)
  To: Josef Bacik, davem, kraigatgoog, eric.dumazet, tom, netdev,
	kernel-team
In-Reply-To: <1482264424-15439-4-git-send-email-jbacik@fb.com>

On Tue, 2016-12-20 at 15:07 -0500, Josef Bacik wrote:
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -92,7 +92,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
>  {
>  	bool reuse = sk->sk_reuse && sk->sk_state != TCP_LISTEN;
>  	struct inet_hashinfo *hinfo = sk->sk_prot->h.hashinfo;
> -	int ret = 1, attempts = 5, port = snum;
> +	int ret = 1, port = snum;
>  	struct inet_bind_hashbucket *head;
>  	struct net *net = sock_net(sk);
>  	int i, low, high, attempt_half;
> @@ -100,6 +100,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
>  	kuid_t uid = sock_i_uid(sk);
>  	u32 remaining, offset;
>  	bool reuseport_ok = !!snum;
> +	bool empty_tb = true;
>  
>  	if (port) {
>  		head = &hinfo->bhash[inet_bhashfn(net, port,
> @@ -111,7 +112,6 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
>  
>  		goto tb_not_found;
>  	}
> -again:
>  	attempt_half = (sk->sk_reuse == SK_CAN_REUSE) ? 1 : 0;
>  other_half_scan:
>  	inet_get_local_port_range(net, &low, &high);
> @@ -148,8 +148,12 @@ other_parity_scan:
>  		spin_lock_bh(&head->lock);
>  		inet_bind_bucket_for_each(tb, &head->chain)
>  			if (net_eq(ib_net(tb), net) && tb->port == port) {
> -				if (!inet_csk_bind_conflict(sk, tb, false, reuseport_ok))
> -					goto tb_found;
> +				if (hlist_empty(&tb->owners))
> +					goto success;
> +				if (!inet_csk_bind_conflict(sk, tb, false, reuseport_ok)) {
> +					empty_tb = false;
> +					goto success;
> +				}
>  				goto next_port;
>  			}
>  		goto tb_not_found;
> @@ -184,23 +188,12 @@ tb_found:
>  		      !rcu_access_pointer(sk->sk_reuseport_cb) &&
>  		      sk->sk_reuseport && uid_eq(tb->fastuid, uid)))
>  			goto success;
> -		if (inet_csk_bind_conflict(sk, tb, true, reuseport_ok)) {
> -			if ((reuse ||
> -			     (tb->fastreuseport > 0 &&
> -			      sk->sk_reuseport &&
> -			      !rcu_access_pointer(sk->sk_reuseport_cb) &&
> -			      uid_eq(tb->fastuid, uid))) && !snum &&
> -			    --attempts >= 0) {
> -				spin_unlock_bh(&head->lock);
> -				goto again;
> -			}
> +		if (inet_csk_bind_conflict(sk, tb, true, reuseport_ok))
>  			goto fail_unlock;
> -		}
> -		if (!reuse)
> -			tb->fastreuse = 0;
> -		if (!sk->sk_reuseport || !uid_eq(tb->fastuid, uid))
> -			tb->fastreuseport = 0;
> -	} else {
> +		empty_tb = false;
> +	}
> +success:
> +	if (empty_tb) {


I would fine it even more simple to read, if you redo the hlist_empty
check here instead of someone has to review all the paths where this
might get set. hlist_empty is a very quick test.


Thanks,
Hannes

^ permalink raw reply

* Re: [PATCH 3/5 net-next] inet: don't check for bind conflicts twice when searching for a port
From: Josef Bacik @ 2016-12-21 15:12 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: davem, kraigatgoog, eric.dumazet, tom, netdev, kernel-team
In-Reply-To: <1482332912.2260.8.camel@stressinduktion.org>

On Wed, Dec 21, 2016 at 10:08 AM, Hannes Frederic Sowa 
<hannes@stressinduktion.org> wrote:
> On Tue, 2016-12-20 at 15:07 -0500, Josef Bacik wrote:
>>  --- a/net/ipv4/inet_connection_sock.c
>>  +++ b/net/ipv4/inet_connection_sock.c
>>  @@ -92,7 +92,7 @@ int inet_csk_get_port(struct sock *sk, unsigned 
>> short snum)
>>   {
>>   	bool reuse = sk->sk_reuse && sk->sk_state != TCP_LISTEN;
>>   	struct inet_hashinfo *hinfo = sk->sk_prot->h.hashinfo;
>>  -	int ret = 1, attempts = 5, port = snum;
>>  +	int ret = 1, port = snum;
>>   	struct inet_bind_hashbucket *head;
>>   	struct net *net = sock_net(sk);
>>   	int i, low, high, attempt_half;
>>  @@ -100,6 +100,7 @@ int inet_csk_get_port(struct sock *sk, unsigned 
>> short snum)
>>   	kuid_t uid = sock_i_uid(sk);
>>   	u32 remaining, offset;
>>   	bool reuseport_ok = !!snum;
>>  +	bool empty_tb = true;
>> 
>>   	if (port) {
>>   		head = &hinfo->bhash[inet_bhashfn(net, port,
>>  @@ -111,7 +112,6 @@ int inet_csk_get_port(struct sock *sk, unsigned 
>> short snum)
>> 
>>   		goto tb_not_found;
>>   	}
>>  -again:
>>   	attempt_half = (sk->sk_reuse == SK_CAN_REUSE) ? 1 : 0;
>>   other_half_scan:
>>   	inet_get_local_port_range(net, &low, &high);
>>  @@ -148,8 +148,12 @@ other_parity_scan:
>>   		spin_lock_bh(&head->lock);
>>   		inet_bind_bucket_for_each(tb, &head->chain)
>>   			if (net_eq(ib_net(tb), net) && tb->port == port) {
>>  -				if (!inet_csk_bind_conflict(sk, tb, false, reuseport_ok))
>>  -					goto tb_found;
>>  +				if (hlist_empty(&tb->owners))
>>  +					goto success;
>>  +				if (!inet_csk_bind_conflict(sk, tb, false, reuseport_ok)) {
>>  +					empty_tb = false;
>>  +					goto success;
>>  +				}
>>   				goto next_port;
>>   			}
>>   		goto tb_not_found;
>>  @@ -184,23 +188,12 @@ tb_found:
>>   		      !rcu_access_pointer(sk->sk_reuseport_cb) &&
>>   		      sk->sk_reuseport && uid_eq(tb->fastuid, uid)))
>>   			goto success;
>>  -		if (inet_csk_bind_conflict(sk, tb, true, reuseport_ok)) {
>>  -			if ((reuse ||
>>  -			     (tb->fastreuseport > 0 &&
>>  -			      sk->sk_reuseport &&
>>  -			      !rcu_access_pointer(sk->sk_reuseport_cb) &&
>>  -			      uid_eq(tb->fastuid, uid))) && !snum &&
>>  -			    --attempts >= 0) {
>>  -				spin_unlock_bh(&head->lock);
>>  -				goto again;
>>  -			}
>>  +		if (inet_csk_bind_conflict(sk, tb, true, reuseport_ok))
>>   			goto fail_unlock;
>>  -		}
>>  -		if (!reuse)
>>  -			tb->fastreuse = 0;
>>  -		if (!sk->sk_reuseport || !uid_eq(tb->fastuid, uid))
>>  -			tb->fastreuseport = 0;
>>  -	} else {
>>  +		empty_tb = false;
>>  +	}
>>  +success:
>>  +	if (empty_tb) {
> 
> 
> I would fine it even more simple to read, if you redo the hlist_empty
> check here instead of someone has to review all the paths where this
> might get set. hlist_empty is a very quick test.

Yup that's fair, I'll fix that up.  Thanks,

Josef

^ permalink raw reply

* Re: [PATCH 1/5 net-next] inet: replace ->bind_conflict with ->rcv_saddr_equal
From: Josef Bacik @ 2016-12-21 15:16 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: davem, kraigatgoog, eric.dumazet, tom, netdev, kernel-team
In-Reply-To: <1482332814.2260.6.camel@stressinduktion.org>

On Wed, Dec 21, 2016 at 10:06 AM, Hannes Frederic Sowa 
<hannes@stressinduktion.org> wrote:
> On Tue, 2016-12-20 at 15:07 -0500, Josef Bacik wrote:
>>  The only difference between inet6_csk_bind_conflict and 
>> inet_csk_bind_conflict
>>  is how they check the rcv_saddr.  Since we want to be able to check 
>> the saddr in
>>  other places just drop the protocol specific ->bind_conflict and 
>> replace it with
>>  ->rcv_saddr_equal, then make inet_csk_bind_conflict the one true 
>> bind conflict
>>  function.
>> 
>>  Signed-off-by: Josef Bacik <jbacik@fb.com>
>> 
> 
> 
> 
>>  ---
>>   include/net/inet6_connection_sock.h |  5 -----
>>   include/net/inet_connection_sock.h  |  9 +++------
>>   net/dccp/ipv4.c                     |  3 ++-
>>   net/dccp/ipv6.c                     |  2 +-
>>   net/ipv4/inet_connection_sock.c     | 22 +++++++-------------
>>   net/ipv4/tcp_ipv4.c                 |  3 ++-
>>   net/ipv4/udp.c                      |  1 +
>>   net/ipv6/inet6_connection_sock.c    | 40 
>> -------------------------------------
>>   net/ipv6/tcp_ipv6.c                 |  4 ++--
>>   9 files changed, 18 insertions(+), 71 deletions(-)
>> 
>>  diff --git a/include/net/inet6_connection_sock.h 
>> b/include/net/inet6_connection_sock.h
>>  index 3212b39..8ec87b6 100644
>>  --- a/include/net/inet6_connection_sock.h
>>  +++ b/include/net/inet6_connection_sock.h
>>  @@ -15,16 +15,11 @@
>> 
>>   #include <linux/types.h>
>> 
>>  -struct inet_bind_bucket;
>>   struct request_sock;
>>   struct sk_buff;
>>   struct sock;
>>   struct sockaddr;
>> 
>>  -int inet6_csk_bind_conflict(const struct sock *sk,
>>  -			    const struct inet_bind_bucket *tb, bool relax,
>>  -			    bool soreuseport_ok);
>>  -
>>   struct dst_entry *inet6_csk_route_req(const struct sock *sk, 
>> struct flowi6 *fl6,
>>   				      const struct request_sock *req, u8 proto);
>> 
>>  diff --git a/include/net/inet_connection_sock.h 
>> b/include/net/inet_connection_sock.h
>>  index ec0479a..9cd43c5 100644
>>  --- a/include/net/inet_connection_sock.h
>>  +++ b/include/net/inet_connection_sock.h
>>  @@ -62,9 +62,9 @@ struct inet_connection_sock_af_ops {
>>   				char __user *optval, int __user *optlen);
>>   #endif
>>   	void	    (*addr2sockaddr)(struct sock *sk, struct sockaddr *);
>>  -	int	    (*bind_conflict)(const struct sock *sk,
>>  -				     const struct inet_bind_bucket *tb,
>>  -				     bool relax, bool soreuseport_ok);
>>  +	int         (*rcv_saddr_equal)(const struct sock *sk1,
>>  +				       const struct sock *sk2,
>>  +				       bool match_wildcard);
>>   	void	    (*mtu_reduced)(struct sock *sk);
>>   };
>> 
>> 
> 
> The patch looks as a nice code cleanup already!
> 
> Have you looked if we can simply have one rcv_saddr_equal for both 
> ipv4
> and ipv6 that e.g. uses sk->sk_family instead of function pointers?
> This could give us even more possibilities to remove some indirect
> functions calls and thus might relieve some cycles?

I was going to do that but I'm not familiar enough with how sockets 
work to be comfortable.  My main concern is we have the ipv6_only() 
check on a socket, which seems to indicate to me that you can have a 
socket that can do both ipv4/ipv6, so what if we're specifically going 
through the ipv6 code, but we aren't ipv6_only() and we end up doing 
the ipv4 address compare when we really need to do the ipv6 address 
compare?  If this can't happen (and honestly as I type it out it sounds 
crazier than it did in my head) then yeah I'll totally do that as well 
and we can just have a global function without the protocol specific 
callbacks, but I need you or somebody to tell me I'm crazy and that it 
would be ok to have it all in one function.  Thanks,

Josef

^ permalink raw reply

* Re: [PATCH 1/5 net-next] inet: replace ->bind_conflict with ->rcv_saddr_equal
From: Hannes Frederic Sowa @ 2016-12-21 15:23 UTC (permalink / raw)
  To: Josef Bacik, davem, kraigatgoog, eric.dumazet, tom, netdev,
	kernel-team
In-Reply-To: <1482264424-15439-2-git-send-email-jbacik@fb.com>

On Tue, 2016-12-20 at 15:07 -0500, Josef Bacik wrote:
> --- a/net/dccp/ipv6.c
> +++ b/net/dccp/ipv6.c
> @@ -926,7 +926,7 @@ static const struct inet_connection_sock_af_ops dccp_ipv6_af_ops = {
>  	.getsockopt	   = ipv6_getsockopt,
>  	.addr2sockaddr	   = inet6_csk_addr2sockaddr,
>  	.sockaddr_len	   = sizeof(struct sockaddr_in6),
> -	.bind_conflict	   = inet6_csk_bind_conflict,
> +	.rcv_saddr_equal   = ipv6_rcv_saddr_equal,
>  #ifdef CONFIG_COMPAT
>  	.compat_setsockopt = compat_ipv6_setsockopt,
>  	.compat_getsockopt = compat_ipv6_getsockopt,
> 

Btw, small nit, you forgot the corresponding changes in
dccp_ipv6_mapped, thus causing this compiler error:

net/dccp/ipv6.c:961:2: error: unknown field ‘bind_conflict’ specified in initializer
  .bind_conflict    = inet6_csk_bind_conflict,
  ^
net/dccp/ipv6.c:961:22: error: ‘inet6_csk_bind_conflict’ undeclared here (not in a function)
  .bind_conflict    = inet6_csk_bind_conflict,
                      ^~~~~~~~~~~~~~~~~~~~~~~
scripts/Makefile.build:293: recipe for target 'net/dccp/ipv6.o' failed

Bye,
Hannes

^ permalink raw reply

* Re: HalfSipHash Acceptable Usage
From: George Spelvin @ 2016-12-21 15:55 UTC (permalink / raw)
  To: Jason, linux
  Cc: ak, davem, David.Laight, djb, ebiggers3, eric.dumazet, hannes,
	jeanphilippe.aumasson, kernel-hardening, linux-crypto,
	linux-kernel, luto, netdev, tom, torvalds, tytso, vegard.nossum
In-Reply-To: <CAHmME9oJDOLpPKRpX=N+DY9BuzTueWjTaWzeVtdVMBG7mcrqKA@mail.gmail.com>

Actually, DJB just made a very relevant suggestion.

As I've mentioned, the 32-bit performance problems are an x86-specific
problem.  ARM does very well, and other processors aren't bad at all.

SipHash fits very nicely (and runs very fast) in the MMX registers.

They're 64 bits, and there are 8 of them, so the integer registers can
be reserved for pointers and loop counters and all that.  And there's
reference code available.

How much does kernel_fpu_begin()/kernel_fpu_end() cost?

Although there are a lot of pre-MMX x86es in embedded control applications,
I don't think anyone is worried about their networking performance.
(Specifically, all of this affects only connection setup, not throughput 
on established connections.)

^ permalink raw reply

* RE: [PATCH net] tcp: add a missing barrier in tcp_tasklet_func()
From: Madalin-Cristian Bucur @ 2016-12-21 14:23 UTC (permalink / raw)
  To: Eric Dumazet, David Miller; +Cc: netdev, Eric Dumazet, Miffy Lei
In-Reply-To: <1482327763.8944.26.camel@edumazet-glaptop3.roam.corp.google.com>

> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: Wednesday, December 21, 2016 3:43 PM
> 
> Madalin reported crashes happening in tcp_tasklet_func() on powerpc64
> 
> Before TSQ_QUEUED bit is cleared, we must ensure the changes done
> by list_del(&tp->tsq_node); are committed to memory, otherwise
> corruption might happen, as an other cpu could catch TSQ_QUEUED
> clearance too soon.
> 
> We can notice that old kernels were immune to this bug, because
> TSQ_QUEUED was cleared after a bh_lock_sock(sk)/bh_unlock_sock(sk)
> section, but they could have missed a kick to write additional bytes,
> when NIC interrupts for a given flow are spread to multiple cpus.
> 
> Affected TCP flows would need an incoming ACK or RTO timer to add more
> packets to the pipe. So overall situation should be better now.
> 
> Fixes: b223feb9de2a ("tcp: tsq: add shortcut in tcp_tasklet_func()")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Madalin Bucur <madalin.bucur@nxp.com>
> Tested-by: Madalin Bucur <madalin.bucur@nxp.com>

It's actually tested by Xing Lei:

Tested-by: Xing Lei <xing.lei@nxp.com>

Thank you for the fast resolution.

> ---
>  net/ipv4/tcp_output.c |    1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index
> b45101f3d2bd2e0f0077305a061add4f7ea0de27..31a255b555ad86a3537c077862e3ea38
> f9b44284 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -769,6 +769,7 @@ static void tcp_tasklet_func(unsigned long data)
>  		list_del(&tp->tsq_node);
> 
>  		sk = (struct sock *)tp;
> +		smp_mb__before_atomic();
>  		clear_bit(TSQ_QUEUED, &sk->sk_tsq_flags);
> 
>  		if (!sk->sk_lock.owned &&
> 


^ permalink raw reply

* Re: HalfSipHash Acceptable Usage
From: Eric Dumazet @ 2016-12-21 15:56 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: George Spelvin, Theodore Ts'o, Andi Kleen, David Miller,
	David Laight, Daniel J . Bernstein, Eric Biggers,
	Hannes Frederic Sowa, Jean-Philippe Aumasson, kernel-hardening,
	Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
	Tom Herbert, Linus Torvalds, Vegard Nossum
In-Reply-To: <CAHmME9oCfCwAq1qP09uAN6vvakh4wXDMHunsL9D7W_LDeN_OQQ@mail.gmail.com>

On Wed, 2016-12-21 at 15:42 +0100, Jason A. Donenfeld wrote:
> Hi Eric,
> 
> I computed performance numbers for both 32-bit and 64-bit using the
> actual functions in which talking about replacing MD5 with SipHash.
> The basic harness is here [1] if you're curious. SipHash was a pretty
> clear winner for both cases.
> 
> x86_64:
> [    1.714302] secure_tcpv6_sequence_number_md5# cycles: 102373398
> [    1.747685] secure_tcp_sequence_number_md5# cycles: 92042258
> [    1.773522] secure_tcpv6_sequence_number_siphash# cycles: 70786533
> [    1.798701] secure_tcp_sequence_number_siphash# cycles: 68941043
> 
> x86:
> [    1.635749] secure_tcpv6_sequence_number_md5# cycles: 106016335
> [    1.670259] secure_tcp_sequence_number_md5# cycles: 95670512
> [    1.708387] secure_tcpv6_sequence_number_siphash# cycles: 105988635
> [    1.740264] secure_tcp_sequence_number_siphash# cycles: 88225395
> 
> >>> 102373398 > 70786533
> True
> >>> 92042258 > 68941043
> True
> >>> 106016335 > 105988635
> True
> >>> 95670512 > 88225395
> True
> 
> While MD5 is probably faster for some kind of large-data
> cycles-per-byte, due to its 64-byte internal state, SipHash -- the
> "Sip" part standing "Short Input PRF" -- is fast for shorter inputs.
> In practice with the functions we're talking about replacing, there's
> no need to hash 64-bytes. So, SipHash comes out faster and more
> secure.
> 
> I also haven't begun to look focusedly at the assembly my SipHash
> implemention is generating, which means there's still window for even
> more performance improvements.
> 
> Jason
> 
> 
> [1] https://git.zx2c4.com/linux-dev/tree/net/core/secure_seq.c?h=siphash-bench#n194

Now I am quite confused.

George said :

> Cycles per byte on 1024 bytes of data:
>                       Pentium Core 2  Ivy
>                       4       Duo     Bridge
> SipHash-2-4           38.9     8.3     5.8
> HalfSipHash-2-4               12.7     4.5     3.2
> MD5                    8.3     5.7     4.7


That really was for 1024 bytes blocks, so pretty much useless for our
discussion ?

Reading your numbers last week, I thought SipHash was faster, but George
numbers are giving the opposite impression.

I do not have a P4 to make tests, so I only can trust you or George.

Thanks.

^ permalink raw reply

* Re: [PATCH 1/5 net-next] inet: replace ->bind_conflict with ->rcv_saddr_equal
From: Josef Bacik @ 2016-12-21 15:59 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: davem, kraigatgoog, eric.dumazet, tom, netdev, kernel-team
In-Reply-To: <1482333786.2260.10.camel@stressinduktion.org>

On Wed, Dec 21, 2016 at 10:23 AM, Hannes Frederic Sowa 
<hannes@stressinduktion.org> wrote:
> On Tue, 2016-12-20 at 15:07 -0500, Josef Bacik wrote:
>>  --- a/net/dccp/ipv6.c
>>  +++ b/net/dccp/ipv6.c
>>  @@ -926,7 +926,7 @@ static const struct inet_connection_sock_af_ops 
>> dccp_ipv6_af_ops = {
>>   	.getsockopt	   = ipv6_getsockopt,
>>   	.addr2sockaddr	   = inet6_csk_addr2sockaddr,
>>   	.sockaddr_len	   = sizeof(struct sockaddr_in6),
>>  -	.bind_conflict	   = inet6_csk_bind_conflict,
>>  +	.rcv_saddr_equal   = ipv6_rcv_saddr_equal,
>>   #ifdef CONFIG_COMPAT
>>   	.compat_setsockopt = compat_ipv6_setsockopt,
>>   	.compat_getsockopt = compat_ipv6_getsockopt,
>> 
> 
> Btw, small nit, you forgot the corresponding changes in
> dccp_ipv6_mapped, thus causing this compiler error:
> 
> net/dccp/ipv6.c:961:2: error: unknown field ‘bind_conflict’ 
> specified in initializer
>   .bind_conflict    = inet6_csk_bind_conflict,
>   ^
> net/dccp/ipv6.c:961:22: error: ‘inet6_csk_bind_conflict’ 
> undeclared here (not in a function)
>   .bind_conflict    = inet6_csk_bind_conflict,
>                       ^~~~~~~~~~~~~~~~~~~~~~~
> scripts/Makefile.build:293: recipe for target 'net/dccp/ipv6.o' failed

Yeah kbuild caught that yesterday and I have it fixed up in my git tree 
already, thanks,

Josef

^ permalink raw reply

* [PATCH] net: fddi: skfp: use %p format specifier for addresses rather than %x
From: Colin King @ 2016-12-21 16:03 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel

From: Colin Ian King <colin.king@canonical.com>

Trivial fix: Addresses should be printed using the %p format specifier
rather than using %x.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/net/fddi/skfp/hwmtm.c | 12 ++++++------
 drivers/net/fddi/skfp/pmf.c   |  2 +-
 drivers/net/fddi/skfp/smt.c   |  2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/fddi/skfp/hwmtm.c b/drivers/net/fddi/skfp/hwmtm.c
index e26398b..d0a68bd 100644
--- a/drivers/net/fddi/skfp/hwmtm.c
+++ b/drivers/net/fddi/skfp/hwmtm.c
@@ -1483,7 +1483,7 @@ void mac_drv_clear_rx_queue(struct s_smc *smc)
 	r = queue->rx_curr_get ;
 	while (queue->rx_used) {
 		DRV_BUF_FLUSH(r,DDI_DMA_SYNC_FORCPU) ;
-		DB_RX("switch OWN bit of RxD 0x%x ",r,0,5) ;
+		DB_RX("switch OWN bit of RxD 0x%p ",r,0,5) ;
 		r->rxd_rbctrl &= ~cpu_to_le32(BMU_OWN) ;
 		frag_count = 1 ;
 		DRV_BUF_FLUSH(r,DDI_DMA_SYNC_FORDEV) ;
@@ -1645,7 +1645,7 @@ void hwm_tx_frag(struct s_smc *smc, char far *virt, u_long phys, int len,
 	DB_TX("hwm_tx_frag: len = %d, frame_status = %x ",len,frame_status,2) ;
 	if (frame_status & LAN_TX) {
 		/* '*t' is already defined */
-		DB_TX("LAN_TX: TxD = %x, virt = %x ",t,virt,3) ;
+		DB_TX("LAN_TX: TxD = %p, virt = %p ",t,virt,3) ;
 		t->txd_virt = virt ;
 		t->txd_txdscr = cpu_to_le32(smc->os.hwm.tx_descr) ;
 		t->txd_tbadr = cpu_to_le32(phys) ;
@@ -1819,7 +1819,7 @@ void smt_send_mbuf(struct s_smc *smc, SMbuf *mb, int fc)
 	__le32	tbctrl;
 
 	NDD_TRACE("THSB",mb,fc,0) ;
-	DB_TX("smt_send_mbuf: mb = 0x%x, fc = 0x%x",mb,fc,4) ;
+	DB_TX("smt_send_mbuf: mb = 0x%p, fc = 0x%x",mb,fc,4) ;
 
 	mb->sm_off-- ;	/* set to fc */
 	mb->sm_len++ ;	/* + fc */
@@ -1960,7 +1960,7 @@ static void mac_drv_clear_txd(struct s_smc *smc)
 
 			do {
 				DRV_BUF_FLUSH(t1,DDI_DMA_SYNC_FORCPU) ;
-				DB_TX("check OWN/EOF bit of TxD 0x%x",t1,0,5) ;
+				DB_TX("check OWN/EOF bit of TxD 0x%p",t1,0,5) ;
 				tbctrl = le32_to_cpu(CR_READ(t1->txd_tbctrl));
 
 				if (tbctrl & BMU_OWN || !queue->tx_used){
@@ -1988,7 +1988,7 @@ static void mac_drv_clear_txd(struct s_smc *smc)
 			}
 			else {
 #ifndef PASS_1ST_TXD_2_TX_COMP
-				DB_TX("mac_drv_tx_comp for TxD 0x%x",t2,0,4) ;
+				DB_TX("mac_drv_tx_comp for TxD 0x%p",t2,0,4) ;
 				mac_drv_tx_complete(smc,t2) ;
 #else
 				DB_TX("mac_drv_tx_comp for TxD 0x%x",
@@ -2052,7 +2052,7 @@ void mac_drv_clear_tx_queue(struct s_smc *smc)
 		tx_used = queue->tx_used ;
 		while (tx_used) {
 			DRV_BUF_FLUSH(t,DDI_DMA_SYNC_FORCPU) ;
-			DB_TX("switch OWN bit of TxD 0x%x ",t,0,5) ;
+			DB_TX("switch OWN bit of TxD 0x%p ",t,0,5) ;
 			t->txd_tbctrl &= ~cpu_to_le32(BMU_OWN) ;
 			DRV_BUF_FLUSH(t,DDI_DMA_SYNC_FORDEV) ;
 			t = t->txd_next ;
diff --git a/drivers/net/fddi/skfp/pmf.c b/drivers/net/fddi/skfp/pmf.c
index 441b4dc..52fa162 100644
--- a/drivers/net/fddi/skfp/pmf.c
+++ b/drivers/net/fddi/skfp/pmf.c
@@ -284,7 +284,7 @@ void smt_pmf_received_pack(struct s_smc *smc, SMbuf *mb, int local)
 	SMbuf		*reply ;
 
 	sm = smtod(mb,struct smt_header *) ;
-	DB_SMT("SMT: processing PMF frame at %x len %d\n",sm,mb->sm_len) ;
+	DB_SMT("SMT: processing PMF frame at %p len %d\n",sm,mb->sm_len) ;
 #ifdef	DEBUG
 	dump_smt(smc,sm,"PMF Received") ;
 #endif
diff --git a/drivers/net/fddi/skfp/smt.c b/drivers/net/fddi/skfp/smt.c
index cd78b7c..e80a089 100644
--- a/drivers/net/fddi/skfp/smt.c
+++ b/drivers/net/fddi/skfp/smt.c
@@ -504,7 +504,7 @@ void smt_received_pack(struct s_smc *smc, SMbuf *mb, int fs)
 #endif
 
 	smt_swap_para(sm,(int) mb->sm_len,1) ;
-	DB_SMT("SMT : received packet [%s] at 0x%x\n",
+	DB_SMT("SMT : received packet [%s] at 0x%p\n",
 		smt_type_name[m_fc(mb) & 0xf],sm) ;
 	DB_SMT("SMT : version %d, class %s\n",sm->smt_version,
 		smt_class_name[(sm->smt_class>LAST_CLASS)?0 : sm->smt_class]) ;
-- 
2.10.2

^ permalink raw reply related

* Re: [PATCH net-next 00/27] net: mvpp2: add basic support for PPv2.2
From: David Miller @ 2016-12-21 16:03 UTC (permalink / raw)
  To: thomas.petazzoni
  Cc: netdev, nadavh, hannah, yehuday, jason, andrew,
	sebastian.hesselbarth, gregory.clement, linux-arm-kernel, stefanc,
	mw
In-Reply-To: <1482318994-23488-1-git-send-email-thomas.petazzoni@free-electrons.com>

Two things:

1) net-next is closed, please do not submit net-next material during
   this time.

2) 27 patches is way too many to submit at one time, please keep your
   patch series submissions to a small, reasonable size.

Thanks.

^ permalink raw reply

* Re: [PATCH perf/core REBASE 3/5] tools lib bpf: Add bpf_prog_{attach,detach}
From: Arnaldo Carvalho de Melo @ 2016-12-21 16:06 UTC (permalink / raw)
  To: Joe Stringer; +Cc: LKML, netdev, Wang Nan, ast, Daniel Borkmann
In-Reply-To: <CAPWQB7HyzaAck1LEX_ec7gRpoKjyaZPZ+dco3Ca=sR4qQ403BQ@mail.gmail.com>

Em Tue, Dec 20, 2016 at 10:50:22AM -0800, Joe Stringer escreveu:
> On 20 December 2016 at 06:32, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > Em Tue, Dec 20, 2016 at 11:18:51AM -0300, Arnaldo Carvalho de Melo escreveu:
> >> This one makes it fail for CentOS 5 and 6, others may fail as well,
> >> still building, investigating...
> >
> > Ok, fixed it by making it follow the model of the other sys_bpf wrappers
> > setting up that bpf_attr union wrt initializing unamed struct members:
> > -       union bpf_attr attr = {
> > -               .target_fd = target_fd,
> > -       };
> > +       union bpf_attr attr;
> > +
> > +       bzero(&attr, sizeof(attr));
> > +       attr.target_fd     = target_fd;

> Ah, I just shifted these across originally so the delta would be
> minimal but now I know why this code is like this. Thanks.

np, making sure this code works in all those environments requires
automation, I'd say its impossible otherwise, too many details :-\

Fixed, pushed, merged, should hit 4.10 soon :-)

- Arnaldo

^ permalink raw reply

* Re: [PATCH] net: macb: Applied changes according to review made by Andy Shevchenko.
From: David Miller @ 2016-12-21 16:08 UTC (permalink / raw)
  To: bfolta
  Cc: nicolas.ferre, niklas.cassel, alexandre.torgue, satananda.burla,
	rvatsavayi, simon.horman, linux-kernel, netdev, rafalo
In-Reply-To: <SN1PR0701MB1951B994F86160A24C436F30CC930@SN1PR0701MB1951.namprd07.prod.outlook.com>

From: Bartosz Folta <bfolta@cadence.com>
Date: Wed, 21 Dec 2016 08:53:05 +0000

> This patch is sent in regard to recently applied patch
> Commit 83a77e9ec4150ee4acc635638f7dedd9da523a26
> net: macb: Added PCI wrapper for Platform Driver.
> Andy's review does not appear on mailing lists. I can't reference it.
> 
> Signed-off-by: Bartosz Folta <bfolta@cadence.com>

This commit log message and Subject line need a lot of improvement.

You should be saying what exactly you are doing, rather than "someone
said I should do this".  The latter gives the reader no information
whatsoever about what the patch is doing.

^ permalink raw reply

* Re: [PATCH v3 net-next 0/3] Add support for the ethernet switch on the ESPRESSObin
From: David Miller @ 2016-12-21 16:10 UTC (permalink / raw)
  To: romain.perier
  Cc: andrew, vivien.didelot, f.fainelli, jason, sebastian.hesselbarth,
	gregory.clement, netdev, devicetree, robh+dt, ijc+devicetree,
	pawel.moll, mark.rutland, galak, linux-arm-kernel,
	thomas.petazzoni, nadavh
In-Reply-To: <20161221090045.474-1-romain.perier@free-electrons.com>


net-next is not open, please do not submit net-next changes during this
time.

^ permalink raw reply

* Re: [PATCH net-next 00/27] net: mvpp2: add basic support for PPv2.2
From: Thomas Petazzoni @ 2016-12-21 16:12 UTC (permalink / raw)
  To: David Miller
  Cc: andrew, yehuday, jason, netdev, hannah, nadavh, gregory.clement,
	stefanc, mw, linux-arm-kernel, sebastian.hesselbarth
In-Reply-To: <20161221.110348.346806544020316686.davem@davemloft.net>

Hello,

On Wed, 21 Dec 2016 11:03:48 -0500 (EST), David Miller wrote:

> 1) net-next is closed, please do not submit net-next material during
>    this time.

I did not expect you to review these patches right now, as I know
net-next is closed. However I expect other people in the net community
and people interested in Marvell platforms to look at those patches,
potentially test them and give feedback.

> 2) 27 patches is way too many to submit at one time, please keep your
>    patch series submissions to a small, reasonable size.

OK, I will do some arbitrary cut in the series and send several
smaller series instead.

Thanks for the feedback,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply

* Re: [PATCH 2/3] NFC: trf7970a: Add device tree option of 1.8 Volt IO voltage
From: Mark Greer @ 2016-12-21 16:13 UTC (permalink / raw)
  To: Geoff Lansberry
  Cc: linux-wireless, lauro.venancio, aloisio.almeida, sameo, robh+dt,
	mark.rutland, netdev, devicetree, linux-kernel, justin
In-Reply-To: <32FAB08E-BE8E-4127-80A6-013300B43BD0@kuvee.com>

On Wed, Dec 21, 2016 at 06:47:36AM -0500, Geoff Lansberry wrote:
> Thanks Mark.   Should I resubmit patches with the requested edits today, or wait a bit for more comments?  What is the desired etiquette?

Its up to you.  I don't think there are any hard & fast rules on this.

If it were me, I would likely spin a new version today because there are
several responses already and it lets people review them at their leisure
over the holidays.

Just a thought - you may want to consider separating the third patch from
the other two.  The problems the first two solve are well understood and
have reasonable solutions (I believe).  The third one - for me, at least -
tries to fix a problem that is not well understood yet.

Mark
--

^ permalink raw reply

* Re: [PATCH net-next 00/27] net: mvpp2: add basic support for PPv2.2
From: David Miller @ 2016-12-21 16:24 UTC (permalink / raw)
  To: thomas.petazzoni
  Cc: netdev, nadavh, hannah, yehuday, jason, andrew,
	sebastian.hesselbarth, gregory.clement, linux-arm-kernel, stefanc,
	mw
In-Reply-To: <20161221171246.77e91d2c@free-electrons.com>

From: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Date: Wed, 21 Dec 2016 17:12:46 +0100

> Hello,
> 
> On Wed, 21 Dec 2016 11:03:48 -0500 (EST), David Miller wrote:
> 
>> 1) net-next is closed, please do not submit net-next material during
>>    this time.
> 
> I did not expect you to review these patches right now, as I know
> net-next is closed. However I expect other people in the net community
> and people interested in Marvell platforms to look at those patches,
> potentially test them and give feedback.

Then mark the patches as "RFC".

^ permalink raw reply

* Re: HalfSipHash Acceptable Usage
From: Jason A. Donenfeld @ 2016-12-21 16:33 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: George Spelvin, Theodore Ts'o, Andi Kleen, David Miller,
	David Laight, Daniel J . Bernstein, Eric Biggers,
	Hannes Frederic Sowa, Jean-Philippe Aumasson, kernel-hardening,
	Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
	Tom Herbert, Linus Torvalds, Vegard Nossum
In-Reply-To: <1482335804.8944.44.camel@edumazet-glaptop3.roam.corp.google.com>

Hi Eric,

On Wed, Dec 21, 2016 at 4:56 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> That really was for 1024 bytes blocks, so pretty much useless for our
> discussion ?
>
> Reading your numbers last week, I thought SipHash was faster, but George
> numbers are giving the opposite impression.
>
> I do not have a P4 to make tests, so I only can trust you or George.

I'm not sure how George came up with those numbers, but the ones I
sent are output from that benchmark function in the last email. I'd be
interested in learning this too.

As mentioned in the last email, it looks like potential 32-bit issues
are really just specific to old Intel chips. Other 32-bit
architectures do fine. So, for new kernels, even if somehow there is a
tiny performance regression (though I couldn't see one) on old
architectures, I really doubt it will affect anybody in practice.

Jason

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox