* Unsafe TCP connection close handling
@ 2011-12-08 20:58 Josh Durgin
From: Josh Durgin @ 2011-12-08 20:58 UTC
  To: ceph-devel

Yesterday during an ffsb run on the ceph kernel client, both
the client and the osd processes hit the max open fds limit
(there was only one osd up at the time). There were 1006 sockets
in the CLOSING state on the client, and 1006 in the FIN_WAIT2
state on the osd.
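
Per-state socket counts like the ones above can be collected on
Linux from /proc/net/tcp. The following is a minimal sketch, not the
tool used for this report; count_tcp_states is a hypothetical helper
name, and it assumes the usual /proc/net/tcp layout in which the
fourth column holds the state as a hex code.

```python
from collections import Counter

# State codes as they appear in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Return a Counter mapping TCP state name to number of sockets.

    The argument is the full text of /proc/net/tcp (hypothetical helper,
    written for illustration only).
    """
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated rows
        code = fields[3].upper()
        # Unknown codes are counted under their raw hex value.
        counts[TCP_STATES.get(code, code)] += 1
    return counts
```

Calling count_tcp_states(open("/proc/net/tcp").read()) on each host
and looking at the CLOSING and FIN_WAIT2 entries would give numbers
comparable to those quoted above.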

From the TCP state machine [1], it seems that the
sequence of events was something like this, with both sides
initially in the ESTABLISHED state:

       Kernel Client            OSD
            |                    |
            |                   /| Send FIN, go to FIN_WAIT1
Send FIN,   |\                 / |
go to       | \               /  |
FIN_WAIT1   |  \             /   |
            |   \           /    |
Recv FIN    |<--------------     |
            |     \              |
Send ACK,   |------\------------>| Recv ACK, go to FIN_WAIT2
go to       |       \            |
CLOSING     |        -----------x| FIN not read
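
The transitions in the diagram can be traced with a small
table-driven model of the close-related part of the TCP state
machine. This is only a sketch for checking the reasoning: the event
names (send_fin, recv_fin, recv_ack_of_fin) and the run() helper are
made up for illustration, and sending the ACK for a received FIN is
treated as implicit in recv_fin.

```python
# Subset of the TCP state machine covering connection close (RFC 793).
# Keys are (current state, event); values are the next state.
# Receiving a FIN implies sending an ACK for it; that is not modelled
# as a separate event here.
TRANSITIONS = {
    ("ESTABLISHED", "send_fin"): "FIN_WAIT1",
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("FIN_WAIT1", "recv_ack_of_fin"): "FIN_WAIT2",
    ("FIN_WAIT1", "recv_fin"): "CLOSING",  # simultaneous close
    ("FIN_WAIT2", "recv_fin"): "TIME_WAIT",
    ("CLOSING", "recv_ack_of_fin"): "TIME_WAIT",
    ("CLOSE_WAIT", "send_fin"): "LAST_ACK",
    ("LAST_ACK", "recv_ack_of_fin"): "CLOSED",
}


def run(events, state="ESTABLISHED"):
    """Apply a sequence of events and return the resulting state."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state


# Events each side has seen according to the diagram.
client = run(["send_fin", "recv_fin"])
osd = run(["send_fin", "recv_ack_of_fin"])
print(client, osd)  # prints "CLOSING FIN_WAIT2"
```

In this model each side is stuck one event short of TIME_WAIT: the
osd is missing recv_fin and the client is missing recv_ack_of_fin.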

That is, after closing its half of the connection, the osd isn't
reading anything from the socket anymore, and thus ignores the
FIN from the client. We have bug #1803 to track this, but we
should make sure libceph in the kernel handles simultaneous
TCP connection close correctly as well.
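
As a userspace illustration of the pattern in question (keep reading
after closing your own half until the peer's close is seen), here is
a minimal Python sketch. It is not libceph or osd code and not a
proposed fix for the bug above; close_gracefully is a hypothetical
name, and the demo uses a local socketpair as a stand-in for the TCP
connection, so no real TCP states are involved.

```python
import socket


def close_gracefully(sock, timeout=5.0):
    """Half-close, then read until the peer's end-of-stream is seen.

    Returns True if EOF from the peer was read before closing,
    False if the wait timed out or the socket failed first.
    """
    sock.settimeout(timeout)
    try:
        sock.shutdown(socket.SHUT_WR)  # on TCP this sends our FIN
    except OSError:
        pass  # the peer may already have torn the connection down
    saw_eof = False
    try:
        while True:
            data = sock.recv(4096)
            if not data:  # b"" means the peer's close has been read
                saw_eof = True
                break
            # Any late data from the peer is discarded in this sketch.
    except OSError:  # covers timeouts as well as connection errors
        pass
    finally:
        sock.close()
    return saw_eof


# Demo: both ends close their write side before either one reads,
# mimicking a simultaneous close.
a, b = socket.socketpair()
a.shutdown(socket.SHUT_WR)
b.shutdown(socket.SHUT_WR)
print(close_gracefully(a), close_gracefully(b))
```

The point of the sketch is only that neither end releases its
descriptor until it has either seen the other side's close or given
up after a bounded wait.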

[1] http://www.tcpipguide.com/free/diagrams/tcpfsm.png
