All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Abeni <pabeni@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>,
	Willem de Bruijn <willemb@google.com>
Subject: Re: [PATCH] net/udp: do not touch skb->peeked unless really needed
Date: Tue, 06 Dec 2016 19:31:01 +0100	[thread overview]
Message-ID: <1481049061.7129.18.camel@redhat.com> (raw)
In-Reply-To: <1481046434.18162.599.camel@edumazet-glaptop3.roam.corp.google.com>

On Tue, 2016-12-06 at 09:47 -0800, Eric Dumazet wrote:
> On Tue, 2016-12-06 at 18:08 +0100, Paolo Abeni wrote:
> > On Tue, 2016-12-06 at 11:34 +0100, Paolo Abeni wrote:
> > > On Mon, 2016-12-05 at 09:57 -0800, Eric Dumazet wrote:
> > > > From: Eric Dumazet <edumazet@google.com>
> > > > 
> > > > In UDP recvmsg() path we currently access 3 cache lines from an skb
> > > > while holding receive queue lock, plus another one if packet is
> > > > dequeued, since we need to change skb->next->prev
> > > > 
> > > > 1st cache line (contains ->next/prev pointers, offsets 0x00 and 0x08)
> > > > 2nd cache line (skb->len & skb->peeked, offsets 0x80 and 0x8e)
> > > > 3rd cache line (skb->truesize/users, offsets 0xe0 and 0xe4)
> > > > 
> > > > skb->peeked is only needed to make sure 0-length packets are properly
> > > > handled while MSG_PEEK is operated.
> > > > 
> > > > I had first the intent to remove skb->peeked but the "MSG_PEEK at
> > > > non-zero offset" support added by Sam Kumar makes this not possible.
> > > > 
> > > > This patch avoids one cache line miss during the locked section, when
> > > > skb->len and skb->peeked do not have to be read.
> > > > 
> > > > It also avoids the skb_set_peeked() cost for non empty UDP datagrams.
> > > > 
> > > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > 
> > > Thank you for all the good work.
> > > 
> > > After all your improvement, I see the cacheline miss in inet_recvmsg()
> > > as a major perf offender for the user space process in the udp flood
> > > scenario due to skc_rxhash sharing the same sk_drops cacheline.
> > > 
> > > Using an udp-specific drop counter (and an sk_drops accessor to wrap
> > > sk_drops access where needed), we could avoid such cache miss. With that
> > > - patch for udp.h only below - I get 3% improvement on top of all the
> > > pending udp patches, and the gain should be more relevant after the 2
> > > queues rework. What do you think ?
> > 
> > Here follow what I'm experimenting. 
> 
> Well, new socket layout makes this kind of patches not really needed ?
> 
> 	/* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
> 	socket_lock_t              sk_lock;              /*  0x88  0x20 */
> 	atomic_t                   sk_drops;             /*  0xa8   0x4 */
> 	int                        sk_rcvlowat;          /*  0xac   0x4 */
> 	struct sk_buff_head        sk_error_queue;       /*  0xb0  0x18 */
> 	/* --- cacheline 3 boundary (192 bytes) was 8 bytes ago --- */

cacheline 2 boundary (128 bytes) is 8 bytes before sk_lock: cacheline 2
includes also skc_refcnt and skc_rxhash from __sk_common (I use 'pahole
-E ...' to get the full blown output). skc_rxhash is read for each
packet in inet_recvmsg()/sock_rps_record_flow() if CONFIG_RPS is set. I
get a cache miss per packet there and inet_recvmsg() in my test takes
about 8% of the whole u/s processing time.

> I mentioned at some point that we can very easily instruct 
> sock_skb_set_dropcount() to not read sk_drops if application
> does not care about getting sk_drops ;)
> 
> https://patchwork.kernel.org/patch/9405677/
> 
> Now sk_drops was moved, the plan is to submit this patch in an official way.

Looking forward to that!

Thank you,

Paolo

  reply	other threads:[~2016-12-06 18:31 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-05  2:43 [RFC] udp: some improvements on RX path Eric Dumazet
2016-12-05 13:22 ` Paolo Abeni
2016-12-05 14:28   ` Eric Dumazet
2016-12-05 15:37     ` Jesper Dangaard Brouer
2016-12-05 15:54       ` Eric Dumazet
2016-12-05 17:57     ` [PATCH] net/udp: do not touch skb->peeked unless really needed Eric Dumazet
2016-12-06  9:53       ` Paolo Abeni
2016-12-06 12:10         ` Paolo Abeni
2016-12-06 14:35           ` Eric Dumazet
2016-12-06 14:34         ` Eric Dumazet
2016-12-06 10:34       ` Paolo Abeni
2016-12-06 17:08         ` Paolo Abeni
2016-12-06 17:47           ` Eric Dumazet
2016-12-06 18:31             ` Paolo Abeni [this message]
2016-12-06 18:58               ` Eric Dumazet
2016-12-06 19:16                 ` Paolo Abeni
2016-12-06 19:35                   ` Eric Dumazet
2016-12-07  3:32                     ` [PATCH net-next] net: sock_rps_record_flow() is for connected sockets Eric Dumazet
2016-12-07  6:47                       ` Eric Dumazet
2016-12-07  7:57                         ` Paolo Abeni
2016-12-07 14:26                           ` Eric Dumazet
2016-12-08 17:49                             ` Paolo Abeni
2016-12-07 14:29                           ` Eric Dumazet
2016-12-07 15:59                             ` Eric Dumazet
2016-12-08 18:50                             ` Paolo Abeni
2016-12-08 19:32                               ` Eric Dumazet
2016-12-08 19:20                           ` Edward Cree
2016-12-08 17:49                         ` Tom Herbert
2016-12-08 18:02                           ` Eric Dumazet
2016-12-08 19:15                             ` Tom Herbert
2016-12-08 20:05                               ` Hannes Frederic Sowa
2016-12-08 20:30                                 ` Tom Herbert
2016-12-08 20:44                                 ` Tom Herbert
2016-12-08 18:07                           ` Eric Dumazet
2016-12-07  7:59                       ` Paolo Abeni
2016-12-07 13:58                         ` Eric Dumazet
2016-12-07 15:47                       ` David Miller
2016-12-07 17:09           ` [PATCH] net/udp: do not touch skb->peeked unless really needed David Laight
2016-12-07 17:32             ` Eric Dumazet
2016-12-07 17:37               ` Hannes Frederic Sowa
2016-12-07 17:52                 ` Eric Dumazet
2016-12-07 17:55                 ` Eric Dumazet
2016-12-06 15:42       ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1481049061.7129.18.camel@redhat.com \
    --to=pabeni@redhat.com \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.