netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* TCP stops sending packets over loopback on 4.10-rc3?
@ 2017-01-24 11:20 Josef Bacik
  2017-01-24 14:07 ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: Josef Bacik @ 2017-01-24 11:20 UTC (permalink / raw)
  To: Linux Netdev List; +Cc: Lawrence Brakmo, Kernel Team

Hello,

I've been trying to test some NBD changes I had made recently and I 
started having packet timeouts.  I traced this down to tcp just 
stopping sending packets after a lot of writing.  All NBD does is call 
kernel_sendmsg() with a request struct and some pages when it does 
writes.  I did a bunch of tracing and I've narrowed it down to running 
out of sk_wmem_queued space.  In tcp_sendmsg() here

new_segment:
                        /* Allocate new segment. If the interface is SG,
                         * allocate skb fitting to single page.
                         */
                        if (!sk_stream_memory_free(sk))
                                goto wait_for_sndbuf;

we hit this pretty regularly, and eventually just get stuck in 
sk_stream_wait_memory until the timeout ends and we error out 
everything.  Now sk_stream_memory_free checks the sk_wmem_queued and 
calls into the sk_prot->stream_memory_free(), so I broke this out like 
the following


    if (sk->sk_wmem_queued >= sk->sk_sndbuf) {
        trace_printk("sk_wmem_queued %d, sk_sndbuf %d\n", 
sk->sk_wmem_queued, sk->sk_sndbuf);
        goto wait_for_sndbuf;
     }
     if (sk->sk_prot->stream_memory_free && 
!sk->sk_prot->stream_memory_free(sk)) {
        trace_printk("sk_stream_memory_free\n");
        goto wait_for_sndbuf;
     }

And I got this in my tracing

   kworker/u16:5-112   [001] ....  1375.637564: tcp_sendmsg: 
sk_wmem_queued 4204872, sk_sndbuf 4194304
   kworker/u16:5-112   [001] ....  1375.639657: tcp_sendmsg: 
sk_wmem_queued 4204872, sk_sndbuf 4194304
   kworker/u16:5-112   [003] ....  1375.641128: tcp_sendmsg: 
sk_wmem_queued 4204872, sk_sndbuf 4194304
   kworker/u16:5-112   [003] ....  1375.643441: tcp_sendmsg: 
sk_wmem_queued 4204872, sk_sndbuf 4194304
   kworker/u16:5-112   [001] ....  1375.807614: tcp_sendmsg: 
sk_wmem_queued 4204872, sk_sndbuf 4194304
   kworker/u16:5-112   [001] ....  1377.538744: tcp_sendmsg: 
sk_wmem_queued 4204872, sk_sndbuf 4194304
   kworker/u16:5-112   [001] ....  1377.543418: tcp_sendmsg: 
sk_wmem_queued 4204872, sk_sndbuf 4194304
    kworker/2:4H-1535  [002] ....  1377.544685: tcp_sendmsg: 
sk_wmem_queued 4204872, sk_sndbuf 4194304
   kworker/u16:5-112   [000] ....  1379.378352: tcp_sendmsg: 
sk_wmem_queued 4205796, sk_sndbuf 4194304
   kworker/u16:5-112   [003] ....  1380.985721: tcp_sendmsg: 
sk_wmem_queued 4212416, sk_sndbuf 4194304

This is as far as I've gotten and I'll keep digging into it, but I was 
wondering if this looks familiar to anybody?  Also one thing I've 
noticed is sk_stream_wait_memory() will wait on sk_sleep(sk), but 
basically nothing wakes this up.  For example it seems the main way we 
reduce sk_wmem_queued is through sk_wmem_free_skb(), which doesn't 
appear to wake anything up in any of its callers, so anybody who does 
end up sleeping will basically never wake up.  That seems like it 
should be more broken than it is, so I'm curious to know how things are 
actually woken up in this case.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-01-26 22:00 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-01-24 11:20 TCP stops sending packets over loopback on 4.10-rc3? Josef Bacik
2017-01-24 14:07 ` Eric Dumazet
2017-01-25 14:14   ` Josef Bacik
2017-01-25 14:25     ` Eric Dumazet
2017-01-25 14:26     ` Josef Bacik
2017-01-25 14:39       ` Eric Dumazet
2017-01-26 21:59         ` Josef Bacik

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).