All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josef Bacik <jbacik@fb.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Linux Netdev List <netdev@vger.kernel.org>,
	Lawrence Brakmo <brakmo@fb.com>, Kernel Team <kernel-team@fb.com>
Subject: Re: TCP stops sending packets over loopback on 4.10-rc3?
Date: Wed, 25 Jan 2017 09:26:48 -0500	[thread overview]
Message-ID: <1485354408.5902.3@smtp.office365.com> (raw)
In-Reply-To: <1485353664.5902.2@smtp.office365.com>

On Wed, Jan 25, 2017 at 9:14 AM, Josef Bacik <jbacik@fb.com> wrote:
> On Tue, Jan 24, 2017 at 9:07 AM, Eric Dumazet 
> <eric.dumazet@gmail.com> wrote:
>> On Tue, 2017-01-24 at 06:20 -0500, Josef Bacik wrote:
>>>  Hello,
>>> 
>>>  I've been trying to test some NBD changes I had made recently and I
>>>  started having packet timeouts.  I traced this down to tcp just
>>>  stopping sending packets after a lot of writing.  All NBD does is 
>>> call
>>>  kernel_sendmsg() with a request struct and some pages when it does
>>>  writes.  I did a bunch of tracing and I've narrowed it down to 
>>> running
>>>  out of sk_wmem_queued space.  In tcp_sendmsg() here
>>> 
>>>  new_segment:
>>>                          /* Allocate new segment. If the interface 
>>> is SG,
>>>                           * allocate skb fitting to single page.
>>>                           */
>>>                          if (!sk_stream_memory_free(sk))
>>>                                  goto wait_for_sndbuf;
>>> 
>>>  we hit this pretty regularly, and eventually just get stuck in
>>>  sk_stream_wait_memory until the timeout ends and we error out
>>>  everything.  Now sk_stream_memory_free checks the sk_wmem_queued 
>>> and
>>>  calls into the sk_prot->stream_memory_free(), so I broke this out 
>>> like
>>>  the following
>>> 
>>> 
>>>      if (sk->sk_wmem_queued >= sk->sk_sndbuf) {
>>>          trace_printk("sk_wmem_queued %d, sk_sndbuf %d\n",
>>>  sk->sk_wmem_queued, sk->sk_sndbuf);
>>>          goto wait_for_sndbuf;
>>>       }
>>>       if (sk->sk_prot->stream_memory_free &&
>>>  !sk->sk_prot->stream_memory_free(sk)) {
>>>          trace_printk("sk_stream_memory_free\n");
>>>          goto wait_for_sndbuf;
>>>       }
>>> 
>>>  And I got this in my tracing
>>> 
>>>     kworker/u16:5-112   [001] ....  1375.637564: tcp_sendmsg:
>>>  sk_wmem_queued 4204872, sk_sndbuf 4194304
>>>     kworker/u16:5-112   [001] ....  1375.639657: tcp_sendmsg:
>>>  sk_wmem_queued 4204872, sk_sndbuf 4194304
>>>     kworker/u16:5-112   [003] ....  1375.641128: tcp_sendmsg:
>>>  sk_wmem_queued 4204872, sk_sndbuf 4194304
>>>     kworker/u16:5-112   [003] ....  1375.643441: tcp_sendmsg:
>>>  sk_wmem_queued 4204872, sk_sndbuf 4194304
>>>     kworker/u16:5-112   [001] ....  1375.807614: tcp_sendmsg:
>>>  sk_wmem_queued 4204872, sk_sndbuf 4194304
>>>     kworker/u16:5-112   [001] ....  1377.538744: tcp_sendmsg:
>>>  sk_wmem_queued 4204872, sk_sndbuf 4194304
>>>     kworker/u16:5-112   [001] ....  1377.543418: tcp_sendmsg:
>>>  sk_wmem_queued 4204872, sk_sndbuf 4194304
>>>      kworker/2:4H-1535  [002] ....  1377.544685: tcp_sendmsg:
>>>  sk_wmem_queued 4204872, sk_sndbuf 4194304
>>>     kworker/u16:5-112   [000] ....  1379.378352: tcp_sendmsg:
>>>  sk_wmem_queued 4205796, sk_sndbuf 4194304
>>>     kworker/u16:5-112   [003] ....  1380.985721: tcp_sendmsg:
>>>  sk_wmem_queued 4212416, sk_sndbuf 4194304
>>> 
>>>  This is as far as I've gotten and I'll keep digging into it, but I 
>>> was
>>>  wondering if this looks familiar to anybody?  Also one thing I've
>>>  noticed is sk_stream_wait_memory() will wait on sk_sleep(sk), but
>>>  basically nothing wakes this up.  For example it seems the main 
>>> way we
>>>  reduce sk_wmem_queued is through sk_wmem_free_skb(), which doesn't
>>>  appear to wake anything up in any of its callers, so anybody who 
>>> does
>>>  end up sleeping will basically never wake up.  That seems like it
>>>  should be more broken than it is, so I'm curious to know how 
>>> things are
>>>  actually woken up in this case.  Thanks,
>> 
>> 
>> git grep -n SOCK_QUEUE_SHRUNK
>> 
>> -> tcp_check_space()
> 
> But tcp_check_space() doesn't actually reduce sk_wmem_queued from 
> what I can see.  The only places that appear to reduce it are 
> tcp_trim_head, which is only called in the retransmit path, and 
> sk_wmem_free_skb, which seems to be right, but I added a 
> trace_printk() in it to see if it was firing during my test and it 
> never fires.  So we _appear_ to only ever be incrementing this 
> counter, but never decrementing it.  I'm doing a bunch of tracing 
> trying to figure out what is going on here but so far nothing is 
> popping which is starting to make me think ftrace is broken.  Thanks,

Nope ftrace isn't broken, I'm just dumb, the space is being reclaimed 
by sk_wmem_free_skb().  So I guess I need to figure out why I stop 
getting ACK's from the other side of the loopback.  Thanks,

Josef

  parent reply	other threads:[~2017-01-25 14:27 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-24 11:20 TCP stops sending packets over loopback on 4.10-rc3? Josef Bacik
2017-01-24 14:07 ` Eric Dumazet
2017-01-25 14:14   ` Josef Bacik
2017-01-25 14:25     ` Eric Dumazet
2017-01-25 14:26     ` Josef Bacik [this message]
2017-01-25 14:39       ` Eric Dumazet
2017-01-26 21:59         ` Josef Bacik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1485354408.5902.3@smtp.office365.com \
    --to=jbacik@fb.com \
    --cc=brakmo@fb.com \
    --cc=eric.dumazet@gmail.com \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.