* NFS Client causing SYN Flood
@ 2003-09-04 22:51 Paul Stewart
2003-09-04 23:01 ` Trond Myklebust
2003-09-04 23:24 ` Luciano Miguel Ferreira Rocha
0 siblings, 2 replies; 4+ messages in thread
From: Paul Stewart @ 2003-09-04 22:51 UTC
To: linux-fsdevel
We've been running into a problem where our RedHat 9 boxes are taking
down our NFS server with SYN packets. The NFS server runs FreeBSD and
is attached via gigabit Ethernet, and the clients are RedHat 9, running
the stock 2.4.20-8 uniprocessor kernel. They mount with the
"rw,vers=3,tcp" options and use autofs/automount to initiate mounts to
the server.
We found the problem for the first time when we had a network partition
that disconnected the clients from the server. When the connection
recovered, the FreeBSD machine crashed, and then wouldn't come up all
the way before dying again. We found that as the server was coming up,
it was inundated with TCP SYN packets from a few of the RedHat clients.
This proceeded until the FreeBSD server keeled over with the load. The
packets were sourced from port 800 (hard coded in net/sunrpc/xprt.c as
its source port) and sent to port 2049 (the NFS server port). Each SYN
packet from a host tended to come roughly 0.000141 seconds after it
received the RST frame for the previous SYN. SYN-to-SYN periods were
on the order of 0.000158 seconds.
Our current solution isn't quite satisfactory. If we move the NFS
server over to 100Mbit Ethernet, the connection is slow enough that the
load of SYN/RST turnarounds doesn't make the FreeBSD box keel over.
It appears to me that there may be a case where the timeouts in
net/sunrpc/xprt.c aren't working correctly, which causes these
quick-turnaround SYNs under certain conditions. Alternatively, the code
in xprt_reconnect() for performing an asynchronous TCP connect() may
not be set up well for receiving a RST that quickly (~0.000016 seconds
after the SYN).
I'm not an expert on the sunrpc code, although now that I'm feeling the
pinch, I may get a bit more knowledgeable. I have a pcap trace of the
gigabit link while the RedHat boxes were wailing on the server, if
anyone wants to see it (I'll put an excerpt below). The TCP sequence
numbers in the SYN frames are changing, so I don't think this is a
problem with frames being replicated inside the network or in the
FreeBSD stack. It doesn't seem like RedHat changed net/sunrpc/xprt.c or
fs/nfs/ much, so I'm assuming this is a vanilla kernel issue as well.
Does anyone have insight into this?
--
Paul
No.  Time      Source      Destination  Protocol
20   1.627623  13.2.18.28  13.2.18.88   TCP
     800 > nfs [SYN] Seq=3274288328 Ack=0 Win=5840 Len=0
21   1.627644  13.2.18.88  13.2.18.28   TCP
     nfs > 800 [RST, ACK] Seq=0 Ack=3274288329 Win=0 Len=0
22   1.627842  13.2.18.28  13.2.18.88   TCP
     800 > nfs [SYN] Seq=3277282606 Ack=0 Win=5840 Len=0
23   1.627860  13.2.18.88  13.2.18.28   TCP
     nfs > 800 [RST, ACK] Seq=0 Ack=3277282607 Win=0 Len=0
24   1.628003  13.2.18.28  13.2.18.88   TCP
     800 > nfs [SYN] Seq=3277282775 Ack=0 Win=5840 Len=0
25   1.628020  13.2.18.88  13.2.18.28   TCP
     nfs > 800 [RST, ACK] Seq=0 Ack=3277282776 Win=0 Len=0
26   1.628163  13.2.18.28  13.2.18.88   TCP
     800 > nfs [SYN] Seq=3277282934 Ack=0 Win=5840 Len=0
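
For anyone who wants to check the timing figures against the excerpt
above, here is a minimal Python sketch that recomputes the gaps. The
timestamps are copied from the seven frames shown; the names FRAMES,
gaps_us and syn_to_syn_us are made up for this illustration, and the
"RST, ACK" frames are simply labelled "RST".

```python
# Frame number, capture time in microseconds, and TCP flag, copied from
# the excerpt above (1.627623 s is written as 1627623 so that the
# arithmetic stays exact).
FRAMES = [
    (20, 1627623, "SYN"),
    (21, 1627644, "RST"),
    (22, 1627842, "SYN"),
    (23, 1627860, "RST"),
    (24, 1628003, "SYN"),
    (25, 1628020, "RST"),
    (26, 1628163, "SYN"),
]


def gaps_us(frames, first, second):
    """Gaps in microseconds between adjacent frames flagged (first, second)."""
    return [
        t1 - t0
        for (_, t0, f0), (_, t1, f1) in zip(frames, frames[1:])
        if f0 == first and f1 == second
    ]


def syn_to_syn_us(frames):
    """Gaps in microseconds between consecutive SYN frames."""
    syn_times = [t for _, t, flag in frames if flag == "SYN"]
    return [b - a for a, b in zip(syn_times, syn_times[1:])]


if __name__ == "__main__":
    periods = syn_to_syn_us(FRAMES)
    print("SYN -> RST (us):", gaps_us(FRAMES, "SYN", "RST"))
    print("RST -> SYN (us):", gaps_us(FRAMES, "RST", "SYN"))
    print("SYN -> SYN (us):", periods)
    # Rough per-client SYN rate implied by the mean SYN-to-SYN period.
    print("approx. SYNs per second:", round(1e6 / (sum(periods) / len(periods))))
```

On these seven frames the RST follows each SYN by 17 to 21
microseconds, and the next SYN goes out 143 to 198 microseconds after
the RST, which is in line with the figures quoted earlier. That works
out to several thousand SYNs per second from a single client.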
* Re: NFS Client causing SYN Flood
From: Trond Myklebust @ 2003-09-04 23:01 UTC
To: Paul Stewart; +Cc: linux-fsdevel
The RedHat 9 kernel does not appear to be in sync with the latest
official kernels w.r.t. the TCP reconnect code. The latter will delay
before attempting to reconnect.
Could you therefore retry using a stock Linux 2.4.22?
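
To make "delay before attempting to reconnect" concrete, here is a
minimal Python sketch of the idea. It is not the kernel code: the
function name reconnect_with_delay, its parameters, and the choice to
double the delay after every refused attempt are all assumptions made
for illustration, and the real xprt.c logic may well use a different
schedule.

```python
import time


def reconnect_with_delay(connect, sleep=time.sleep, initial_delay=1.0,
                         max_delay=60.0, max_attempts=5):
    """Call connect() until it succeeds, pausing after each refusal.

    connect is any callable that raises ConnectionRefusedError when the
    peer answers with a RST. The doubling schedule is an assumption for
    this sketch, not a description of the Linux sunrpc code.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionRefusedError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            sleep(delay)  # wait instead of sending the next SYN at once
            delay = min(delay * 2, max_delay)


if __name__ == "__main__":
    # Hypothetical server that refuses three times, then accepts.
    state = {"calls": 0}

    def fake_connect():
        state["calls"] += 1
        if state["calls"] <= 3:
            raise ConnectionRefusedError("RST from server")
        return "connected"

    waits = []
    print(reconnect_with_delay(fake_connect, sleep=waits.append), waits)
```

The point is only that a refused connect() is followed by a wait
rather than by an immediate new SYN, so a server that is still coming
up sees a handful of SYNs per client instead of thousands per second.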
Cheers,
Trond
* Re: NFS Client causing SYN Flood
From: Luciano Miguel Ferreira Rocha @ 2003-09-04 23:24 UTC
To: Paul Stewart; +Cc: linux-fsdevel
It seems to be a kernel bug; someone more knowledgeable than me will
have to speak to that.
I can only suggest using the timeo mount option on the clients and/or a
firewall rule on the server or the clients that limits the number of
new connections per second.
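
As a rough illustration of what limiting new connections per second
means, here is a minimal token-bucket sketch in Python. A token bucket
is one common way to implement such a limit; whether a given firewall
works this way is an assumption, and the class name TokenBucket and the
rate and burst values below are hypothetical.

```python
class TokenBucket:
    """Allow at most `rate` new connections per second, with bursts up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate            # tokens added per second
        self.burst = burst          # maximum number of stored tokens
        self.tokens = float(burst)  # start with a full bucket
        self.last = None            # time of the previous SYN, in seconds

    def allow(self, now):
        """Return True if a SYN arriving at time `now` should be let through."""
        if self.last is not None:
            refill = (now - self.last) * self.rate
            self.tokens = min(self.burst, self.tokens + refill)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    # Hypothetical limit: 10 new connections per second, burst of 5.
    bucket = TokenBucket(rate=10, burst=5)
    # One SYN every 158 microseconds for one second, as in the trace.
    arrivals = [i * 0.000158 for i in range(int(1 / 0.000158))]
    passed = sum(bucket.allow(t) for t in arrivals)
    print("SYNs offered:", len(arrivals), "SYNs passed:", passed)
```

With a limit like this in place, the server would only have to answer a
small, bounded number of SYNs per second from each misbehaving client.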
Regards,
Luciano Rocha
* Re: NFS Client causing SYN Flood
From: Steve Dickson @ 2003-09-05 13:27 UTC
To: Trond Myklebust; +Cc: Paul Stewart, linux-fsdevel
Trond Myklebust wrote:
>The RedHat 9 kernel does not appear to be in sync with the latest
>official kernels w.r.t. the TCP reconnect code. The latter will delay
>before attempting to reconnect.
>
This problem will be fixed in upcoming releases...
SteveD.