From: Josef Bacik <jbacik@fb.com>
To: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
Tom Herbert <tom@herbertland.com>,
Linux Kernel Network Developers <netdev@vger.kernel.org>,
Josef Bacik <jbacik@fb.com>
Subject: Re: Soft lockup in inet_put_port on 4.6
Date: Mon, 12 Dec 2016 17:24:26 -0500 [thread overview]
Message-ID: <1481581466.24490.2@smtp.office365.com> (raw)
In-Reply-To: <3c022731-e703-34ac-55f1-60f5b94b6d62@stressinduktion.org>
On Mon, Dec 12, 2016 at 1:44 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On 12.12.2016 19:05, Josef Bacik wrote:
>> On Fri, Dec 9, 2016 at 11:14 PM, Eric Dumazet
>> <eric.dumazet@gmail.com>
>> wrote:
>>> On Fri, 2016-12-09 at 19:47 -0800, Eric Dumazet wrote:
>>>
>>>>
>>>> Hmm... Is your ephemeral port range includes the port your load
>>>> balancing app is using ?
>>>
>>> I suspect that you might have processes doing bind( port = 0) that
>>> are
>>> trapped into the bind_conflict() scan ?
>>>
>>> With 100,000 + timewaits there, this possibly hurts.
>>>
>>> Can you try the following loop breaker ?
>>
>> It doesn't appear that the app is doing bind(port = 0) during normal
>> operation. I tested this patch and it made no difference. I'm
>> going to
>> test simply restarting the app without changing to the SO_REUSEPORT
>> option. Thanks,
>
> Would it be possible to trace the time the function uses with trace?
> If
> we don't see the number growing considerably over time we probably can
> rule out that we loop somewhere in there (I would instrument
> inet_csk_bind_conflict, __inet_hash_connect and inet_csk_get_port).
>
> __inet_hash_connect -> __inet_check_established also takes a lock
> (inet_ehash_lockp) which can be locked from inet_diag code path during
> socket diag info dumping.
>
> Unfortunately we couldn't reproduce it so far. :/
So I had a bcc script running to time how long we spent in
inet_csk_bind_conflict, __inet_hash_connect and inet_csk_get_port, but
of course I'm an idiot and didn't actually separate out the stats so I
could tell _which_ one was taking forever. But anyway here's a normal
distribution on the box
Some shit : count distribution
0 -> 1 : 0 |
|
2 -> 3 : 0 |
|
4 -> 7 : 0 |
|
8 -> 15 : 0 |
|
16 -> 31 : 0 |
|
32 -> 63 : 0 |
|
64 -> 127 : 0 |
|
128 -> 255 : 0 |
|
256 -> 511 : 0 |
|
512 -> 1023 : 0 |
|
1024 -> 2047 : 74 |
|
2048 -> 4095 : 10537
|****************************************|
4096 -> 8191 : 8497 |********************************
|
8192 -> 16383 : 3745 |**************
|
16384 -> 32767 : 300 |*
|
32768 -> 65535 : 250 |
|
65536 -> 131071 : 180 |
|
131072 -> 262143 : 71 |
|
262144 -> 524287 : 18 |
|
524288 -> 1048575 : 5 |
|
With the times in nanoseconds, and here's the distribution during the
problem
Some shit : count distribution
0 -> 1 : 0 |
|
2 -> 3 : 0 |
|
4 -> 7 : 0 |
|
8 -> 15 : 0 |
|
16 -> 31 : 0 |
|
32 -> 63 : 0 |
|
64 -> 127 : 0 |
|
128 -> 255 : 0 |
|
256 -> 511 : 0 |
|
512 -> 1023 : 0 |
|
1024 -> 2047 : 21 |
|
2048 -> 4095 : 21820
|****************************************|
4096 -> 8191 : 11598 |*********************
|
8192 -> 16383 : 4337 |*******
|
16384 -> 32767 : 290 |
|
32768 -> 65535 : 59 |
|
65536 -> 131071 : 23 |
|
131072 -> 262143 : 12 |
|
262144 -> 524287 : 6 |
|
524288 -> 1048575 : 19 |
|
1048576 -> 2097151 : 1079 |*
|
2097152 -> 4194303 : 0 |
|
4194304 -> 8388607 : 1 |
|
8388608 -> 16777215 : 0 |
|
16777216 -> 33554431 : 0 |
|
33554432 -> 67108863 : 1192 |**
|
Some shit : count distribution
0 -> 1 : 0 |
|
2 -> 3 : 0 |
|
4 -> 7 : 0 |
|
8 -> 15 : 0 |
|
16 -> 31 : 0 |
|
32 -> 63 : 0 |
|
64 -> 127 : 0 |
|
128 -> 255 : 0 |
|
256 -> 511 : 0 |
|
512 -> 1023 : 0 |
|
1024 -> 2047 : 48 |
|
2048 -> 4095 : 14714
|********************|
4096 -> 8191 : 6769 |*********
|
8192 -> 16383 : 2234 |***
|
16384 -> 32767 : 422 |
|
32768 -> 65535 : 208 |
|
65536 -> 131071 : 61 |
|
131072 -> 262143 : 10 |
|
262144 -> 524287 : 416 |
|
524288 -> 1048575 : 826 |*
|
1048576 -> 2097151 : 598 |
|
2097152 -> 4194303 : 10 |
|
4194304 -> 8388607 : 0 |
|
8388608 -> 16777215 : 1 |
|
16777216 -> 33554431 : 289 |
|
33554432 -> 67108863 : 921 |*
|
67108864 -> 134217727 : 74 |
|
134217728 -> 268435455 : 75 |
|
268435456 -> 536870911 : 48 |
|
536870912 -> 1073741823 : 25 |
|
1073741824 -> 2147483647 : 3 |
|
2147483648 -> 4294967295 : 2 |
|
4294967296 -> 8589934591 : 1 |
|
As you can see we start getting tail latencies of up to 4-8 seconds.
Tomorrow I'll separate out the stats so we can know which function is
the problem child. Sorry about not doing that first. Thanks,
Josef
next prev parent reply other threads:[~2016-12-12 22:24 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-12-06 23:06 Soft lockup in inet_put_port on 4.6 Tom Herbert
2016-12-08 21:03 ` Hannes Frederic Sowa
2016-12-08 21:36 ` Josef Bacik
2016-12-09 0:30 ` Eric Dumazet
2016-12-09 1:01 ` Josef Bacik
2016-12-10 1:59 ` Josef Bacik
2016-12-10 3:47 ` Eric Dumazet
2016-12-10 4:14 ` Eric Dumazet
2016-12-12 18:05 ` Josef Bacik
2016-12-12 18:44 ` Hannes Frederic Sowa
2016-12-12 21:23 ` Josef Bacik
2016-12-12 22:24 ` Josef Bacik [this message]
2016-12-13 20:51 ` Tom Herbert
2016-12-13 23:03 ` Craig Gallek
2016-12-13 23:32 ` Tom Herbert
2016-12-15 18:53 ` Josef Bacik
2016-12-15 22:39 ` Tom Herbert
2016-12-15 23:25 ` Craig Gallek
2016-12-16 0:07 ` Hannes Frederic Sowa
2016-12-16 14:54 ` Josef Bacik
2016-12-16 15:21 ` Josef Bacik
2016-12-16 22:08 ` Josef Bacik
2016-12-16 22:18 ` Tom Herbert
2016-12-16 22:50 ` Josef Bacik
2016-12-17 11:08 ` Hannes Frederic Sowa
2016-12-17 13:26 ` Josef Bacik
2016-12-20 1:56 ` David Miller
2016-12-20 2:07 ` Tom Herbert
2016-12-20 2:41 ` Eric Dumazet
2016-12-20 3:40 ` Josef Bacik
2016-12-20 4:52 ` Eric Dumazet
2016-12-20 4:59 ` Josef Bacik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1481581466.24490.2@smtp.office365.com \
--to=jbacik@fb.com \
--cc=eric.dumazet@gmail.com \
--cc=hannes@stressinduktion.org \
--cc=netdev@vger.kernel.org \
--cc=tom@herbertland.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).