Netdev List
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Flavio Leitner <fbl@redhat.com>, netdev@vger.kernel.org
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	Paolo Abeni <pabeni@redhat.com>,
	David Miller <davem@davemloft.net>,
	Florian Westphal <fw@strlen.de>,
	netfilter-devel@vger.kernel.org
Subject: Re: [PATCH v2 net-next 2/2] skbuff: preserve sock reference when scrubbing the skb.
Date: Wed, 27 Jun 2018 06:56:52 -0700
Message-ID: <eb07f943-9033-8da6-7958-8ecdc94d25ac@gmail.com>
In-Reply-To: <20180627133426.3858-3-fbl@redhat.com>



On 06/27/2018 06:34 AM, Flavio Leitner wrote:
> The sock reference is lost when scrubbing the packet and that breaks
> TSQ (TCP Small Queues) and XPS (Transmit Packet Steering) causing
> performance impacts of about 50% in a single TCP stream when crossing
> network namespaces.
> 
> XPS breaks because the queue mapping stored in the socket is not
> available, so another random queue might be selected when the stack
> needs to transmit something like a TCP ACK, or TCP Retransmissions.
> That causes packet re-ordering and/or performance issues.

Note that we do not care if another random queue is selected when a TCP
retransmit after timeout happens.

The real problem is when sending a normal train of packets (retransmissions
or not). We want all of them to go through one queue to avoid reordering.

After an idle period (no packets in any qdisc/NIC queue), it is perfectly
okay to select another "random queue".

This choice is driven by skb->ooo_okay.

Most TCP ACK packets are sent while no prior packet is in the qdisc, so they
should have ooo_okay set to 1.
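
To make the rule above concrete, here is a minimal userspace sketch of the
decision, not the kernel code itself. The names model_sock, model_skb,
model_pick_tx_queue and the preferred_queue argument (standing in for
whatever queue XPS would pick for the current CPU) are hypothetical and
exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_QUEUE (-1)

/* Hypothetical stand-in for the per-socket cached TX queue. */
struct model_sock {
	int cached_queue;	/* last TX queue used by this flow, or NO_QUEUE */
};

/* Hypothetical stand-in for the fields of the skb that matter here. */
struct model_skb {
	struct model_sock *sk;	/* NULL once the packet has been orphaned */
	bool ooo_okay;		/* true when no earlier packet of the flow is queued */
};

/*
 * Pick a TX queue for one packet.
 * preferred_queue models the queue the steering logic would choose right now.
 */
static int model_pick_tx_queue(const struct model_skb *skb,
			       int preferred_queue, int num_queues)
{
	struct model_sock *sk = skb->sk;

	assert(num_queues > 0 && preferred_queue >= 0);

	/* No socket: nothing to remember, so the queue follows the current choice. */
	if (sk == NULL)
		return preferred_queue % num_queues;

	/* Switch queues only when there is no cached queue or reordering is harmless. */
	if (sk->cached_queue == NO_QUEUE || skb->ooo_okay)
		sk->cached_queue = preferred_queue % num_queues;

	return sk->cached_queue;
}
```

In this model a packet train with ooo_okay clear stays on the cached queue
even if the preferred queue changes, while a packet without a socket
reference lands wherever the current choice points.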

> 
> TSQ breaks because it orphans the packet while it is still in the
> host, so packets are queued contributing to the buffer bloat problem.
> 
> Preserving the sock reference fixes both issues. The socket is
> orphaned anyways in the receiving path before any relevant action
> and on the TX side netfilter checks if the reference is local before
> using it.
> 
> Signed-off-by: Flavio Leitner <fbl@redhat.com>
> ---
>  Documentation/networking/ip-sysctl.txt | 10 +++++-----
>  net/core/skbuff.c                      |  1 -
>  2 files changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
> index ce8fbf5aa63c..f4c042be0216 100644
> --- a/Documentation/networking/ip-sysctl.txt
> +++ b/Documentation/networking/ip-sysctl.txt
> @@ -733,11 +733,11 @@ tcp_limit_output_bytes - INTEGER
>  	Controls TCP Small Queue limit per tcp socket.
>  	TCP bulk sender tends to increase packets in flight until it
>  	gets losses notifications. With SNDBUF autotuning, this can
> -	result in a large amount of packets queued in qdisc/device
> -	on the local machine, hurting latency of other flows, for
> -	typical pfifo_fast qdiscs.
> -	tcp_limit_output_bytes limits the number of bytes on qdisc
> -	or device to reduce artificial RTT/cwnd and reduce bufferbloat.
> +	result in a large amount of packets queued on the local machine
> +	(e.g.: qdiscs, CPU backlog, or device) hurting latency of other
> +	flows, for typical pfifo_fast qdiscs.  tcp_limit_output_bytes
> +	limits the number of bytes on qdisc or device to reduce artificial
> +	RTT/cwnd and reduce bufferbloat.
>  	Default: 262144
>  
>  tcp_challenge_ack_limit - INTEGER
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index b1f274f22d85..f59e98ca72c5 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4911,7 +4911,6 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet)
>  		return;
>  
>  	ipvs_reset(skb);
> -	skb_orphan(skb);
>  	skb->mark = 0;
>  }
>  EXPORT_SYMBOL_GPL(skb_scrub_packet);
> 
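
The TSQ part of the changelog can be illustrated with a small userspace
model of the accounting. This is only a sketch under stated assumptions:
tsq_sock, tsq_skb and the helper functions are hypothetical names, the
limit field stands in for tcp_limit_output_bytes, and the comparison is
simplified relative to what the TCP stack really does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-socket accounting of bytes still queued on the host. */
struct tsq_sock {
	size_t wmem_alloc;	/* bytes charged to this socket */
	size_t limit;		/* stand-in for tcp_limit_output_bytes */
};

/* Hypothetical packet: owner socket plus the size charged to it. */
struct tsq_skb {
	struct tsq_sock *sk;
	size_t truesize;
};

/* The sender may queue more only while it is below its limit. */
static bool tsq_may_send(const struct tsq_sock *sk)
{
	return sk->wmem_alloc < sk->limit;
}

/* Charge the packet to the socket when it is queued for transmit. */
static void tsq_charge(struct tsq_skb *skb, struct tsq_sock *sk)
{
	skb->sk = sk;
	sk->wmem_alloc += skb->truesize;
}

/* Model of orphaning: drop the charge and forget the owner. */
static void tsq_orphan(struct tsq_skb *skb)
{
	if (skb->sk == NULL)
		return;
	assert(skb->sk->wmem_alloc >= skb->truesize);
	skb->sk->wmem_alloc -= skb->truesize;
	skb->sk = NULL;
}

/* Count packets queued before the limit throttles the sender. */
static int tsq_burst(struct tsq_sock *sk, struct tsq_skb *skbs, int n,
		     bool orphan_early)
{
	int sent = 0;

	for (int i = 0; i < n; i++) {
		if (!tsq_may_send(sk))
			break;
		tsq_charge(&skbs[i], sk);
		if (orphan_early)
			tsq_orphan(&skbs[i]);	/* charge vanishes while packet is still queued */
		sent++;
	}
	return sent;
}
```

With orphan_early set, the socket never sees its own backlog, so the limit
never triggers and every packet is queued; with it clear, the burst stops
once the charged bytes reach the limit.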

