From: David Miller <davem@davemloft.net>
To: eric.dumazet@gmail.com
Cc: netdev@vger.kernel.org
Subject: Re: [PATCH net-next] tcp: give prequeue mode some care
Date: Thu, 28 Apr 2016 17:15:14 -0400 (EDT)
Message-ID: <20160428.171514.1303373912379094235.davem@davemloft.net>
In-Reply-To: <1461777145.5535.77.camel@edumazet-glaptop3.roam.corp.google.com>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 27 Apr 2016 10:12:25 -0700
> From: Eric Dumazet <edumazet@google.com>
>
> TCP prequeue goal is to defer processing of incoming packets
> to user space thread currently blocked in a recvmsg() system call.
>
> Intent is to spend less time processing these packets on behalf
> of softirq handler, as softirq handler is unfair to normal process
> scheduler decisions, as it might interrupt threads that do not
> even use networking.
>
> Current prequeue implementation has following issues :
>
> 1) It only checks size of the prequeue against sk_rcvbuf
>
> It was fine 15 years ago when sk_rcvbuf was in the 64KB vicinity.
> But we now have ~8MB values to cope with modern networking needs.
> We have to add sk_rmem_alloc in the equation, since out of order
> packets can definitely use up to sk_rcvbuf memory themselves.
>
> 2) Even with a fixed memory truesize check, prequeue can be filled
> by thousands of packets. When prequeue needs to be flushed, either
> from softirq context (in tcp_prequeue() or timer code), or process
> context (in tcp_prequeue_process()), this adds a latency spike
> which is often not desirable.
> I added a fixed limit of 32 packets, as this translated to a max
> flush time of 60 us on my test hosts.
>
> Also note that all packets in prequeue are not accounted for tcp_mem,
> since they are not charged against sk_forward_alloc at this point.
> This is probably not a big deal.
>
> Note that this might increase LINUX_MIB_TCPPREQUEUEDROPPED counts,
> which is misnamed, as packets are not dropped at all, but rather pushed
> to the stack (where they can be either consumed or dropped)
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
There was a conflict due to the stats macro renaming, but that was trivial
to resolve so I did it.
Applied, thanks Eric.
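
For readers following along, the two limits described in the quoted changelog (prequeue memory counted together with sk_rmem_alloc against sk_rcvbuf, plus a fixed cap of 32 queued packets) can be modelled in userspace roughly as below. This is only a sketch under stated assumptions, not the patch itself: struct prequeue_model, prequeue_add(), prequeue_flush() and PREQUEUE_MAX_PACKETS are hypothetical names made up for illustration, and the exact comparison used in the kernel may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; not kernel code. Field names only echo
 * the quantities mentioned in the changelog. */
struct prequeue_model {
	size_t rcvbuf;             /* models sk_rcvbuf */
	size_t rmem_alloc;         /* models sk_rmem_alloc (e.g. out of order packets) */
	size_t prequeue_mem;       /* sum of truesize of packets held in the prequeue */
	unsigned int prequeue_len; /* number of packets held in the prequeue */
};

/* Fixed packet cap taken from the changelog ("a fixed limit of 32 packets"). */
enum { PREQUEUE_MAX_PACKETS = 32 };

/* Queue one packet of the given truesize.
 * Returns true when the caller should flush the prequeue, i.e. when either
 * the packet cap is reached or prequeue memory plus already charged
 * receive memory exceeds the receive buffer size. */
static bool prequeue_add(struct prequeue_model *m, size_t truesize)
{
	m->prequeue_mem += truesize;
	m->prequeue_len++;

	return m->prequeue_len >= PREQUEUE_MAX_PACKETS ||
	       m->prequeue_mem + m->rmem_alloc > m->rcvbuf;
}

/* Model of a flush: the packets are handed to the stack (where they can be
 * consumed or dropped), so the prequeue accounting goes back to zero. */
static void prequeue_flush(struct prequeue_model *m)
{
	m->prequeue_mem = 0;
	m->prequeue_len = 0;
}
```

With an 8MB rcvbuf and nothing charged to rmem_alloc, the memory test alone would let thousands of small packets pile up; in this model the packet cap is what bounds the amount of work done in a single flush.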
Thread overview: 3+ messages
2016-04-27 17:12 [PATCH net-next] tcp: give prequeue mode some care Eric Dumazet
2016-04-28 21:15 ` David Miller [this message]
2016-04-28 22:00 ` Eric Dumazet