public inbox for netdev@vger.kernel.org
From: Jason Baron <jbaron@akamai.com>
To: Jason Baron <jbaron@akamai.com>, Eric Dumazet <eric.dumazet@gmail.com>
Cc: davem@davemloft.net, netdev@vger.kernel.org
Subject: Re: [PATCH net-next v2] tcp: reduce cpu usage under tcp memory pressure when SO_SNDBUF is set
Date: Fri, 21 Aug 2015 16:55:30 -0400
Message-ID: <55D79042.1050706@akamai.com>
In-Reply-To: <55CA37F5.8090108@akamai.com>



On 08/11/2015 01:59 PM, Jason Baron wrote:
> 
> 
> On 08/11/2015 12:12 PM, Eric Dumazet wrote:
>> On Tue, 2015-08-11 at 11:03 -0400, Jason Baron wrote:
>>
>>>
>>> Yes, so the test case I'm using to test against is somewhat contrived.
>>> In that I am simply allocating around 40,000 sockets that are idle to
>>> create a 'permanent' memory pressure in the background. Then, I have
>>> just 1 flow that sets SO_SNDBUF, which results in the: poll(), write() loop.
>>>
>>> That said, we encountered this issue initially where we had 10,000+
>>> flows and whenever the system would get into memory pressure, we would
>>> see all the cpus spin at 100%.
>>>
>>> So the testcase I wrote, was just a simplistic version for testing. But
>>> I am going to try and test against the more realistic workload where
>>> this issue was initially observed.
>>>
>>
>> Note that I am still trying to understand why we need to increase socket
>> structure, for something which is inherently a problem of sharing memory
>> with an unknown (potentially big) number of sockets.
>>
> 
> I was trying to mirror the wakeups when SO_SNDBUF is not set, where we
> continue to trigger on 1/3 of the buffer being available, as the
> sk->sndbuf is shrunk. And I saw this value as dynamic depending on
> number of sockets and read/write buffer usage. So that's where I was
> coming from with it.
> 
> Also, at least with the .config I have the tcp_sock structure didn't
> increase in size (although struct sock did go up by 8 and not 4).
> 
>> I suggested to use a flag (one bit).
>>
>> If set, then we should fallback to tcp_wmem[0] (each socket has 4096
>> bytes, so that we can avoid starvation)
>>
>>
>>
> 
> Ok, I will test this approach.

Hi Eric,

So I created a test here with 20,000 streams, and if I set SO_SNDBUF
high enough on the server side, I can create tcp memory pressure above
tcp_mem[2]. In this case, with the 'one bit' approach using tcp_wmem[0]
as the wakeup threshold, I can still observe the 100% cpu spinning
issue, but with this v2 patch, cpu usage is minimal (1-2%). That is
because we don't guarantee tcp_wmem[0] above tcp_mem[2]. So the 'one
bit' approach definitely alleviates the spinning between tcp_mem[1] and
tcp_mem[2], but not above tcp_mem[2] in my testing.
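
To make the regions discussed above concrete, here is a minimal
user-space sketch that classifies a TCP page count against the three
tcp_mem thresholds. The enum and function names are made up for
illustration, and the model ignores the kernel's hysteresis (pressure is
only left again once usage falls below tcp_mem[0]), so treat it as an
approximation rather than the kernel's logic.

```c
#include <stdio.h>

/* Illustrative zones; these names are hypothetical, not kernel symbols. */
enum tcp_mem_zone {
	TCP_MEM_NORMAL,     /* at or below tcp_mem[1] */
	TCP_MEM_PRESSURE,   /* above tcp_mem[1], at or below tcp_mem[2] */
	TCP_MEM_OVER_LIMIT  /* above tcp_mem[2] */
};

/* Classify a page count against the tcp_mem triple (all values in pages). */
enum tcp_mem_zone classify_tcp_mem(long pages, const long mem[3])
{
	if (pages > mem[2])
		return TCP_MEM_OVER_LIMIT;
	if (pages > mem[1])
		return TCP_MEM_PRESSURE;
	return TCP_MEM_NORMAL;
}

/*
 * Read three thresholds from a file such as /proc/sys/net/ipv4/tcp_mem.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
int read_tcp_mem(const char *path, long mem[3])
{
	FILE *f = fopen(path, "r");
	int n;

	if (!f)
		return -1;
	n = fscanf(f, "%ld %ld %ld", &mem[0], &mem[1], &mem[2]);
	fclose(f);
	return n == 3 ? 0 : -1;
}
```

The page count to compare against would typically come from the "mem"
field on the TCP line of /proc/net/sockstat.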

Maybe nobody cares about this case (you are getting what you ask for by
using SO_SNDBUF), but it seems to me that it would be nice to avoid this
sort of behavior. I also like the fact that with sk_effective_sndbuf,
we keep doing wakeups on 1/3 of the write buffer emptying, which keeps
the wakeup behavior consistent. In theory this would matter for
high-latency, high-bandwidth links, but in the testing I did, I didn't
observe any throughput differences between this v2 patch and the 'one
bit' approach.
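
The "1/3 of the write buffer" rule can be sketched in user space as
follows. This is a toy model of the writeability check as I understand
it from the sk_stream_wspace() and sk_stream_min_wspace() helpers; the
function name and the idea of passing in a shrunken limit are
assumptions for illustration, not code from the v2 patch.

```c
#include <stdbool.h>

/*
 * Toy model: a stream socket is reported writeable once the free space
 * is at least half of what is queued, i.e. once roughly 1/3 of the
 * limit is free. 'limit' stands in for sk_sndbuf, or for a shrunken
 * effective value (assumption: this is what sk_effective_sndbuf is
 * meant to provide).
 */
bool stream_writeable(long limit, long wmem_queued)
{
	long wspace = limit - wmem_queued;   /* free space under the limit */
	long min_wspace = wmem_queued / 2;   /* required free space */

	return wspace >= min_wspace;
}
```

With a large limit locked by SO_SNDBUF, stream_writeable(1000000, 1000)
is true even if no memory can actually be allocated, so poll() keeps
returning and write() keeps failing. With a limit shrunk to what the
socket can really use, e.g. stream_writeable(1200, 1000), the check is
false until about a third of the queue drains.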

As I mentioned, with this v2 the 'struct sock' grows by 4 bytes, but
struct tcp_sock does not increase. Since this is tcp specific, we
could add sk_effective_sndbuf only to struct tcp_sock.

So the 'one bit' approach definitely seems to me to be an improvement,
but I wanted to get feedback on this testing, before deciding how to
proceed.

Thanks,

-Jason

Thread overview: 6+ messages
2015-08-11 14:38 [PATCH net-next v2] tcp: reduce cpu usage under tcp memory pressure when SO_SNDBUF is set Jason Baron
2015-08-11 14:49 ` Eric Dumazet
2015-08-11 15:03   ` Jason Baron
2015-08-11 16:12     ` Eric Dumazet
2015-08-11 17:59       ` Jason Baron
2015-08-21 20:55         ` Jason Baron [this message]
