From: Paolo Abeni <pabeni@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>,
	Willem de Bruijn <willemb@google.com>
Subject: Re: [PATCH net-next] net: sock_rps_record_flow() is for connected sockets
Date: Wed, 07 Dec 2016 08:59:11 +0100
Message-ID: <1481097551.5535.14.camel@redhat.com>
In-Reply-To: <1481081570.18162.626.camel@edumazet-glaptop3.roam.corp.google.com>

On Tue, 2016-12-06 at 19:32 -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Paolo noticed a cache line miss in UDP recvmsg() to access
> sk_rxhash, sharing a cache line with sk_drops.
> 
> sk_drops might be heavily incremented by cpus handling a flood targeting
> this socket.
> 
> We might place sk_drops on a separate cache line, but let's try
> to avoid wasting 64 bytes per socket just for this, since we have
> other bottlenecks to take care of.
> 
> sock_rps_record_flow() should only access sk_rxhash for connected
> flows.
> 
> Testing sk_state for TCP_ESTABLISHED covers most of the cases for
> connected sockets, for a zero cost, since system calls using
> sock_rps_record_flow() also access sk->sk_prot which is on the
> same cache line.
> 
> A follow up patch will provide a static_key (Jump Label) since most
> hosts do not even use RFS.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  include/net/sock.h |   12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 6dfe3aa22b970eecfab4d4a0753804b1cc82a200..a7ddab993b496f1f4060f0b41831a161c284df9e 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -913,7 +913,17 @@ static inline void sock_rps_record_flow_hash(__u32 hash)
>  static inline void sock_rps_record_flow(const struct sock *sk)
>  {
>  #ifdef CONFIG_RPS
> -	sock_rps_record_flow_hash(sk->sk_rxhash);
> +	/* Reading sk->sk_rxhash might incur an expensive cache line miss.
> +	 *
> +	 * TCP_ESTABLISHED does cover almost all states where RFS
> +	 * might be useful, and is cheaper [1] than testing :
> +	 *	IPv4: inet_sk(sk)->inet_daddr
> +	 * 	IPv6: ipv6_addr_any(&sk->sk_v6_daddr)
> +	 * OR	an additional socket flag
> +	 * [1] : sk_state and sk_prot are in the same cache line.
> +	 */
> +	if (sk->sk_state == TCP_ESTABLISHED)
> +		sock_rps_record_flow_hash(sk->sk_rxhash);
>  #endif
>  }

Thank you for the very prompt patch!

You made me curious about your other idea on this topic; this is what
you initially talked about, right?

LGTM.

Acked-by: Paolo Abeni <pabeni@redhat.com>
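
The cache line argument in the quoted patch can be sketched in userspace.
This is only an illustration under stated assumptions: struct mock_sock,
its field layout, the 64-byte CACHE_LINE, the MOCK_* state values and the
mock_* helpers are all hypothetical stand-ins, not the real struct sock or
the kernel's RFS code. The point it shows is that the state check touches
only the first cache line, and the line shared by rxhash and drops is read
only for established sockets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE       64 /* assumed cache line size in bytes */
#define MOCK_ESTABLISHED 1  /* stand-in for TCP_ESTABLISHED */
#define MOCK_CLOSE       7  /* stand-in for TCP_CLOSE */

/* Hypothetical, heavily simplified layout; NOT the real struct sock. */
struct mock_sock {
	/* Line 0: fields the system call path reads anyway. */
	uint8_t state;
	const void *prot;
	/* Line 1: rxhash shares a line with the drop counter, which
	 * other CPUs may be incrementing heavily during a flood.
	 */
	_Alignas(CACHE_LINE) uint32_t rxhash;
	uint32_t drops;
};

/* state and prot share a line; rxhash and drops share a different one. */
_Static_assert(offsetof(struct mock_sock, state) / CACHE_LINE ==
	       offsetof(struct mock_sock, prot) / CACHE_LINE,
	       "state and prot must share a cache line");
_Static_assert(offsetof(struct mock_sock, rxhash) / CACHE_LINE ==
	       offsetof(struct mock_sock, drops) / CACHE_LINE,
	       "rxhash and drops must share a cache line");
_Static_assert(offsetof(struct mock_sock, state) / CACHE_LINE !=
	       offsetof(struct mock_sock, rxhash) / CACHE_LINE,
	       "state and rxhash must be on different cache lines");

static uint32_t recorded_hash; /* stand-in for the flow table entry */
static unsigned int rxhash_reads; /* counts accesses to the contended line */

static void mock_record_flow_hash(uint32_t hash)
{
	recorded_hash = hash;
}

/* Mirrors the shape of the patched helper: check the cheap field first
 * and read rxhash only for established sockets.
 */
static void mock_record_flow(const struct mock_sock *sk)
{
	if (sk->state == MOCK_ESTABLISHED) {
		rxhash_reads++;
		mock_record_flow_hash(sk->rxhash);
	}
}
```

Calling mock_record_flow() on a socket whose state is MOCK_CLOSE leaves
rxhash_reads unchanged, so an unconnected receiver under flood would never
read the contended line in this model.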
