From: Paolo Abeni <pabeni@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
netdev <netdev@vger.kernel.org>,
Willem de Bruijn <willemb@google.com>
Subject: Re: [PATCH net-next] net: sock_rps_record_flow() is for connected sockets
Date: Wed, 07 Dec 2016 08:59:11 +0100
Message-ID: <1481097551.5535.14.camel@redhat.com>
In-Reply-To: <1481081570.18162.626.camel@edumazet-glaptop3.roam.corp.google.com>
On Tue, 2016-12-06 at 19:32 -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Paolo noticed a cache line miss in UDP recvmsg() to access
> sk_rxhash, sharing a cache line with sk_drops.
>
> sk_drops might be heavily incremented by cpus handling a flood targeting
> this socket.
>
> We might place sk_drops on a separate cache line, but let's try
> to avoid wasting 64 bytes per socket just for this, since we have
> other bottlenecks to take care of.
>
> sock_rps_record_flow() should only access sk_rxhash for connected
> flows.
>
> Testing sk_state for TCP_ESTABLISHED covers most of the cases for
> connected sockets, for a zero cost, since system calls using
> sock_rps_record_flow() also access sk->sk_prot which is on the
> same cache line.
>
> A follow up patch will provide a static_key (Jump Label) since most
> hosts do not even use RFS.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Paolo Abeni <pabeni@redhat.com>
> ---
> include/net/sock.h | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 6dfe3aa22b970eecfab4d4a0753804b1cc82a200..a7ddab993b496f1f4060f0b41831a161c284df9e 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -913,7 +913,17 @@ static inline void sock_rps_record_flow_hash(__u32 hash)
>  static inline void sock_rps_record_flow(const struct sock *sk)
>  {
>  #ifdef CONFIG_RPS
> -	sock_rps_record_flow_hash(sk->sk_rxhash);
> +	/* Reading sk->sk_rxhash might incur an expensive cache line miss.
> +	 *
> +	 * TCP_ESTABLISHED does cover almost all states where RFS
> +	 * might be useful, and is cheaper [1] than testing :
> +	 *	IPv4: inet_sk(sk)->inet_daddr
> +	 *	IPv6: ipv6_addr_any(&sk->sk_v6_daddr)
> +	 * OR	an additional socket flag
> +	 * [1] : sk_state and sk_prot are in the same cache line.
> +	 */
> +	if (sk->sk_state == TCP_ESTABLISHED)
> +		sock_rps_record_flow_hash(sk->sk_rxhash);
>  #endif
>  }
Thank you for the very prompt patch!
You made me curious about your other idea on this topic. This is what
you initially talked about, right?
LGTM.
Acked-by: Paolo Abeni <pabeni@redhat.com>
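
For illustration only (not part of the patch): the cache line argument in
the changelog can be sketched in user space. Everything below is a toy
model under stated assumptions: struct toy_sock, the TOY_* constants and
the toy_* helpers are hypothetical names, the layout is not the real
struct sock, and a 64-byte cache line is assumed. It only shows that the
state check reads the line the system call already touches, and that the
contended line holding the hash and the drop counter is read for
connected sockets only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_LINE   64  /* assumed line size, matching the "64 bytes" above */
#define TOY_ESTABLISHED  1   /* stands in for TCP_ESTABLISHED */
#define TOY_CLOSE        7   /* stands in for an unconnected socket state */

/* Hypothetical layout, NOT the real struct sock. */
struct toy_sock {
	/* Line 0: read by every system call on the socket. */
	const void    *prot;
	unsigned char  state;
	/* Line 1: drops is bumped by other CPUs during a flood. */
	_Alignas(TOY_CACHE_LINE) uint32_t rxhash;
	unsigned int   drops;
};

/* state shares a line with prot; rxhash shares a line with drops. */
static_assert(offsetof(struct toy_sock, state) / TOY_CACHE_LINE ==
	      offsetof(struct toy_sock, prot) / TOY_CACHE_LINE, "hot line");
static_assert(offsetof(struct toy_sock, rxhash) / TOY_CACHE_LINE ==
	      offsetof(struct toy_sock, drops) / TOY_CACHE_LINE, "contended line");
static_assert(offsetof(struct toy_sock, rxhash) / TOY_CACHE_LINE !=
	      offsetof(struct toy_sock, state) / TOY_CACHE_LINE, "separate lines");

static uint32_t     toy_last_hash;  /* last hash handed to the recorder */
static unsigned int toy_records;    /* how many times the recorder ran */

static void toy_record_flow_hash(uint32_t hash)
{
	toy_last_hash = hash;
	toy_records++;
}

/* Only connected sockets read the contended line. */
static void toy_record_flow(const struct toy_sock *sk)
{
	if (sk->state == TOY_ESTABLISHED)
		toy_record_flow_hash(sk->rxhash);
}
```

Calling toy_record_flow() on a socket in TOY_CLOSE state returns after
reading only the first line; the hash is recorded only for a socket in
TOY_ESTABLISHED state.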