Re: [PATCH] net/udp: do not touch skb->peeked unless really needed

netdev.vger.kernel.org archive mirror

From: Paolo Abeni <pabeni@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>,
	Willem de Bruijn <willemb@google.com>
Subject: Re: [PATCH] net/udp: do not touch skb->peeked unless really needed
Date: Tue, 06 Dec 2016 10:53:09 +0100
Message-ID: <1481017989.6225.21.camel@redhat.com>
In-Reply-To: <1480960639.18162.556.camel@edumazet-glaptop3.roam.corp.google.com>

Hi Eric,

On Mon, 2016-12-05 at 09:57 -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> In UDP recvmsg() path we currently access 3 cache lines from an skb
> while holding receive queue lock, plus another one if packet is
> dequeued, since we need to change skb->next->prev
> 
> 1st cache line (contains ->next/prev pointers, offsets 0x00 and 0x08)
> 2nd cache line (skb->len & skb->peeked, offsets 0x80 and 0x8e)
> 3rd cache line (skb->truesize/users, offsets 0xe0 and 0xe4)
> 
> skb->peeked is only needed to make sure 0-length packets are properly
> handled while MSG_PEEK is operated.
> 
> I had first the intent to remove skb->peeked but the "MSG_PEEK at
> non-zero offset" support added by Sam Kumar makes this not possible.

I'm wondering if peeking with offset is going to complicate the 2 queues
patch, too.

> This patch avoids one cache line miss during the locked section, when
> skb->len and skb->peeked do not have to be read.
> 
> It also avoids the skb_set_peeked() cost for non empty UDP datagrams.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  net/core/datagram.c |   19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/net/core/datagram.c b/net/core/datagram.c
> index 49816af8586bb832e806972b486588041a99524c..9482037a5c8c64aec79e42c65bd2691bdd9450a3 100644
> --- a/net/core/datagram.c
> +++ b/net/core/datagram.c
> @@ -214,6 +214,7 @@ struct sk_buff *__skb_try_recv_datagram(struct sock *sk, unsigned int flags,
>  	if (error)
>  		goto no_packet;
>  
> +	*peeked = 0;
>  	do {
>  		/* Again only user level code calls this function, so nothing
>  		 * interrupt level will suddenly eat the receive_queue.
> @@ -227,22 +228,22 @@ struct sk_buff *__skb_try_recv_datagram(struct sock *sk, unsigned int flags,
>  		spin_lock_irqsave(&queue->lock, cpu_flags);
>  		skb_queue_walk(queue, skb) {
>  			*last = skb;
> -			*peeked = skb->peeked;
>  			if (flags & MSG_PEEK) {
>  				if (_off >= skb->len && (skb->len || _off ||
>  							 skb->peeked)) {
>  					_off -= skb->len;
>  					continue;
>  				}
> -
> -				skb = skb_set_peeked(skb);
> -				error = PTR_ERR(skb);
> -				if (IS_ERR(skb)) {
> -					spin_unlock_irqrestore(&queue->lock,
> -							       cpu_flags);
> -					goto no_packet;
> +				if (!skb->len) {
> +					skb = skb_set_peeked(skb);
> +					if (IS_ERR(skb)) {
> +						error = PTR_ERR(skb);
> +						spin_unlock_irqrestore(&queue->lock,
> +								       cpu_flags);
> +						goto no_packet;
> +					}
>  				}

I don't understand why we can avoid setting skb->peeked if len > 0. I
think that will change the kernel behavior if:
- peek with offset is set
- 3 skbs with len > 0 are enqueued
- user space peeks (with offset) the second one
- user space disables peeking with offset and peeks 2 more skbs.

With the current code, in the last step user space is going to peek
skbs #1 and #3; after this patch it will peek #1 and #2. Am I
missing something? Probably the new behavior is more correct, but it is
still a change.

I gave this a run in my test bed on top of your UDP-related patches, and
I see an additional ~3 improvement in the UDP flood scenario, and a bit
more in the uncontended scenario.

Thank you,

Paolo
