All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <dada1@cosmosbay.com>
To: Patrick McManus <mcmanus@ducksong.com>
Cc: "Ingo Molnar" <mingo@elte.hu>,
	"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>,
	"Peter Zijlstra" <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Netdev <netdev@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Evgeniy Polyakov" <johnpol@2ka.mipt.ru>
Subject: Re: [bug] stuck localhost TCP connections, v2.6.26-rc3+
Date: Sun, 01 Jun 2008 08:04:58 +0200	[thread overview]
Message-ID: <48423C0A.1090104@cosmosbay.com> (raw)
In-Reply-To: <1212273974.28319.107.camel@tng>

Patrick McManus a écrit :
> On Sat, 2008-05-31 at 18:35 +0200, Ingo Molnar wrote:
>   
>> * Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
>>
>>     
>
>   
>>> ...setsockopt(listenfd, SOL_TCP, TCP_DEFER_ACCEPT, &val, sizeof(val)) 
>>> seems to be the magic trick that is interestion here.
>>>       
>> seems to be used:
>>
>>  22003 write(3, "distccd[22003] (dcc_listen_by_ad"..., 62) = 62
>>  22003 listen(4, 10)                     = 0
>>  22003 setsockopt(4, SOL_TCP, TCP_DEFER_ACCEPT, [1], 4) = 0
>>
>> i'll queue up your reverts for testing in -tip.
>>     
>
>
> So the code you will revert came from my fingers. The circumstances here
> make me nervous; while I'm at a loss to explain what might be going on
> in particular, let me offer an apology in advance should the revert help
> resolve the issue.
>
> Here's what makes me nervous:
>
>  * not a lot of code uses DEFER_ACCEPT.. frankly it was pretty broken
> before 26 - but not broken this way .. the correlation of your bug using
> it is significant. 
>
>  * in 26, a server TCP socket (with DA) goes to ESTABLISHED when the 3rd
> part of the handshake is received (as normal without DA), but the socket
> isn't put on the accept queue until a real data packet arrives. (That's
> the point of DA). In <= 25 this socket would have syn-recv until the
> data packet arrived.
>
>   - I did run tests where the server died in between the handshake being
> completed and first data packet arriving - the client should see RST and
> the server socket should disappear. But maybe something was missed?
>
> Do I understand this correctly, the server process is gone but the
> socket is still in the table? And the client process is still there
> waiting for the server to do something - having sent a bunch of data?
>
> Do we know if any data bytes (not handshake bytes) have been consumed by
> the server side? If they were, that would seem to vindicate DA.
>
> Also pointing away from DA is that you started seeing this with rc3 -
> that code was included in rc1.Is that a firm observation, or maybe there
> weren't enough datapoints to conclude that rc1 and rc2 were clean?
>
> The most interesting patch is ec3c0982a2dd1e671bad8e9d26c28dcba0039d87
> if anyone wants to eyeball it.
>
>
>
>   

I believe Ingo problems come on long lived sockets (were many bytes were 
exchanged between the peers), so I dont think DEFER_ACCEPT is the cullprit.

I suggest to enable CONFIG_TIMER_STATS and to check timers, because 
/proc/net/tcp
can display apparently large timer values when the timer is elapsed 
(jiffies > icsk->icsk_timeout)
and jiffies_to_clock_t(timer_expires - jiffies) is then overflowing 
doing a multiply and a divide.

On a 64bits server running linux-2.6.24-rc2, I can see *strange* timers 
values too in /proc/net/tcp, but not
stuck TCP sessions. On 64 bits, these strange values have 1AD7F

grep 1AD7F /proc/net/tcp | obfuscate_IP_and_ports

2017: local_peer remote_peer 03 00000000:00000000 01:1AD7F29ABBA 
00000001     0        0 0 2 ffff81067e7520c0
2019: local_peer remote_peer 03 00000000:00000000 01:1AD7F29ABBA 
00000003     0        0 0 2 ffff8106c580bcc0
2029: local_peer remote_peer 03 00000000:00000000 01:1AD7F29ABBA 
00000002     0        0 0 2 ffff81067313fe40
2032: local_peer remote_peer 03 00000000:00000000 01:1AD7F29ABBA 
00000003     0        0 0 2 ffff8106c716c340
2039: local_peer remote_peer 03 00000000:00000000 01:1AD7F29ABBA 
00000002     0        0 0 2 ffff8107d45b3f40
2041: local_peer remote_peer 03 00000000:00000000 01:1AD7F29AB37 
00000000     0        0 0 2 ffff810718e221c0
6610: local_peer remote_peer 01 00000000:00000000 00:1AD7F29ABCA 
00000000     0        0 136594789 1 ffff8107183fb940 94 10 16 2 -1
9925: local_peer remote_peer 01 00000000:00000000 00:1AD7F29ABCA 
00000000     0        0 144451161 1 ffff8107051a9840 351 10 0 2 -1

On TCP_SYN_RECV (03) sockets, timer can apparently be elapsed by many 
ticks, while on TCP_ESTABLISHED (01) one, I get
jiffies_to_clock(-1) -> 1AD7F29ABCA value because the way 
get_tcp4_sock() is coded (jiffies can change while running this function).
Note the 00: that means that no timer in my case.

Running again the command one second later gives completely different 
results (other sockets are displayed)

Maybe on 2.6.26-rc3+ we miss some timer correctness or we expose a 
latent NET bug.

void sk_reset_timer(struct sock *sk, struct timer_list* timer,
                    unsigned long expires)
{
        if (!mod_timer(timer, expires))
                sock_hold(sk);
}


Note that arming a timer also increase socket refcount and could explain 
why Ingo have sockets
apparently not owned by a process but still referenced (by a timer or 
many ones (I see refcnt=5) on following snapshots)

> sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt
> uid  timeout inode
> 21: 111111AC:0016 480111AC:E4E9 01 00000B50:00000000 01:7D1F8746 00000000
> 0        0 398713 5 f71a8580 205 40 1 36 -1
> 23: 111111AC:0016 480111AC:D359 01 000010F8:00000000 01:7D19A035 00000000
> 0        0 396426 5 f71a8a80 202 42 1 144 -1
> 25: 111111AC:0016 480111AC:8565 01 00000B50:00000000 01:7CEBA7D1 00000000
> 0        0 349113 5 eeeaf580 204 40 1 26 -1


Just my initial thoughts, sorry I currently cannot spend much time to 
diagnose the problem.






  parent reply	other threads:[~2008-06-01  6:05 UTC|newest]

Thread overview: 105+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-05-26 11:56 [bug] stuck localhost TCP connections, v2.6.26-rc3+ Ingo Molnar
2008-05-26 13:28 ` Ilpo Järvinen
2008-05-26 13:59   ` Ingo Molnar
2008-05-26 14:12     ` Ingo Molnar
2008-05-26 14:17       ` Ingo Molnar
2008-05-26 14:29         ` Ingo Molnar
2008-05-26 14:43         ` Ilpo Järvinen
2008-05-26 14:58       ` Ilpo Järvinen
2008-05-26 16:23         ` Ingo Molnar
2008-05-26 16:32           ` Ilpo Järvinen
2008-05-26 16:54             ` Ingo Molnar
2008-05-26 17:08               ` Ilpo Järvinen
2008-05-26 18:12                 ` Ingo Molnar
2008-05-26 20:41                   ` Ingo Molnar
2008-05-26 21:20                     ` Ilpo Järvinen
2008-05-30 16:23   ` Ray Lee
2008-05-30 16:23     ` Ray Lee
2008-05-26 16:24 ` Arjan van de Ven
2008-05-28  9:27 ` Peter Zijlstra
2008-05-31 14:25   ` Håkon Løvdal
2008-05-31 14:25     ` Håkon Løvdal
2008-05-31 16:09     ` Ilpo Järvinen
2008-05-31 17:22       ` Ilpo Järvinen
2008-05-31 17:58       ` Håkon Løvdal
2008-05-31 18:37         ` Ilpo Järvinen
2008-05-31 20:25           ` Håkon Løvdal
2008-05-31 20:25             ` Håkon Løvdal
2008-05-31 21:39             ` Ilpo Järvinen
2008-05-31 21:45               ` Håkon Løvdal
2008-05-31 21:45                 ` Håkon Løvdal
2008-06-04  0:10               ` Håkon Løvdal
2008-06-04  0:10                 ` Håkon Løvdal
2008-06-04 11:14                 ` Ilpo Järvinen
2008-06-04 14:00                   ` Håkon Løvdal
2008-06-04 14:00                     ` Håkon Løvdal
2008-06-04 15:09                     ` Ilpo Järvinen
2008-06-06  9:32                       ` Håkon Løvdal
2008-06-06  9:32                         ` Håkon Løvdal
2008-06-09 19:24                         ` Ilpo Järvinen
2008-06-10 23:26                           ` Håkon Løvdal
2008-06-10 23:26                             ` Håkon Løvdal
2008-06-11 13:39                             ` Ilpo Järvinen
2008-06-19  0:30                               ` Håkon Løvdal
2008-06-19  0:30                                 ` Håkon Løvdal
2008-05-29  8:45 ` Ingo Molnar
2008-05-29 11:14   ` Ilpo Järvinen
2008-05-29 11:22     ` Ingo Molnar
2008-05-29 13:05       ` Evgeniy Polyakov
2008-05-29 13:43         ` Ingo Molnar
2008-05-29 13:08       ` Ingo Molnar
2008-05-29 13:48         ` Ilpo Järvinen
2008-05-30 11:09         ` Ingo Molnar
2008-05-30 21:12           ` Ilpo Järvinen
2008-05-30 18:18       ` Ingo Molnar
2008-05-31  6:09         ` Ingo Molnar
2008-05-31 11:46           ` Ilpo Järvinen
2008-05-31 12:18             ` Ilpo Järvinen
2008-05-31 12:54               ` Ingo Molnar
2008-05-31 12:58                 ` Ilpo Järvinen
2008-05-31 16:35                   ` Ingo Molnar
2008-05-31 22:46                     ` Patrick McManus
2008-06-01  5:51                       ` Ilpo Järvinen
2008-06-01  6:04                       ` Eric Dumazet [this message]
2008-06-02  9:23                         ` Ingo Molnar
2008-06-03  9:40                     ` [fixed] [patch] " Ingo Molnar
2008-06-03 14:41                       ` Patrick McManus
2008-06-03 21:46                       ` Ilpo Järvinen
2008-06-03 22:01                         ` Ilpo Järvinen
2008-06-03 22:03                           ` David Miller
2008-06-03 22:10                             ` Ilpo Järvinen
2008-06-03 23:22                             ` Ilpo Järvinen
2008-06-03 23:54                               ` Joe Perches
2008-06-04  6:25                                 ` Ilpo Järvinen
2008-06-04  2:54                               ` Patrick McManus
2008-06-04  6:42                                 ` Ilpo Järvinen
2008-06-05 14:22                               ` Ingo Molnar
2008-06-05 18:00                                 ` Ilpo Järvinen
2008-06-05 21:13                                   ` Ilpo Järvinen
2008-06-05 23:29                                     ` Patrick McManus
2008-06-06 10:03                                       ` Ilpo Järvinen
2008-06-06 17:11                                         ` Patrick McManus
2008-06-06 17:33                                           ` Ingo Molnar
2008-06-06 18:19                                             ` Ilpo Järvinen
2008-06-06 18:39                                               ` Ingo Molnar
2008-06-06 19:49                                                 ` Ilpo Järvinen
2008-06-06 20:08                                                 ` Patrick McManus
2008-06-06 21:12                                                   ` Ilpo Järvinen
2008-06-06 21:23                                                     ` Arjan van de Ven
2008-06-06 21:28                                                       ` Ilpo Järvinen
2008-06-10 22:49                                                   ` David Miller
2008-06-06 18:25                                           ` Ilpo Järvinen
2008-06-10 22:32                               ` David Miller
2008-06-11 13:10                                 ` Patrick McManus
2008-06-11 15:13                                 ` Ilpo Järvinen
2008-06-04  7:23                         ` Ingo Molnar
2008-06-04 18:24                           ` David Miller
2008-06-04 20:56                             ` Ilpo Järvinen
2008-06-04 21:55                               ` David Miller
  -- strict thread matches above, loose matches on Subject: below --
2008-05-30 16:31 Ray Lee
2008-05-30 21:11 ` Ilpo Järvinen
2008-05-31  6:03   ` Evgeniy Polyakov
2008-05-31 10:05     ` Ilpo Järvinen
2008-06-02  6:19       ` Herbert Xu
2008-06-02 11:53         ` Ilpo Järvinen
2008-06-02 14:08           ` Herbert Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48423C0A.1090104@cosmosbay.com \
    --to=dada1@cosmosbay.com \
    --cc=akpm@linux-foundation.org \
    --cc=davem@davemloft.net \
    --cc=ilpo.jarvinen@helsinki.fi \
    --cc=johnpol@2ka.mipt.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mcmanus@ducksong.com \
    --cc=mingo@elte.hu \
    --cc=netdev@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=rjw@sisk.pl \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.