* conntrack and RSTs received during CLOSE_WAIT
@ 2009-05-15 22:10 Robert L Mathews
2009-05-16 21:57 ` Jozsef Kadlecsik
0 siblings, 1 reply; 22+ messages in thread
From: Robert L Mathews @ 2009-05-15 22:10 UTC (permalink / raw)
To: netfilter
I'm using Linux kernel 2.6.26 with conntrack/connlimit to prevent people
from DOSing our Web servers by opening up too many simultaneous
connections from one IP address. This is mostly for protection against
unintentional DOSes from broken proxy servers that try to open up
literally hundreds of simultaneous connections; we DROP their SYN
packets if they already have 40 connections open.
This is generally working well (and thanks to folks on this list for the
hard work that makes this possible).
However: Some clients send evil TCP RSTs that confuse conntrack and
break connlimit in a way that I'll detail below. First, here's a sample
recreation:
client > server [SYN] Seq=0 Len=0
server > client [SYN,ACK] Seq=0 Ack=1 Len=0
client > server [ACK] Seq=1 Ack=1 Len=0
client > server [PSH,ACK] Seq=1 Ack=1 Len=420 (HTTP GET request)
server > client [ACK] Seq=1 Ack=421 Len=0
server > client [ACK] Seq=1 Ack=421 Len=1448 (HTTP response)
server > client [ACK] Seq=1449 Ack=421 Len=1448 (more HTTP response)
server > client [ACK] Seq=2897 Ack=421 Len=1448 (more HTTP response)
client > server [FIN,ACK] Seq=421 Ack=1449 Len=0
server > client [ACK] Seq=4345 Ack=422 Len=1448 (more HTTP response)
server > client [ACK] Seq=5793 Ack=422 Len=1448 (more HTTP response)
client > server [RST] Seq=421 Len=0
server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
Everything up to and including the "RST" takes place in under a tenth of
a second. The remaining ten retransmits take place over 5 minutes.
As soon as the client received the first packet of the HTTP response, it
decided to close the connection. This appears to be due to a SonicWall
firewall on the client end, which examines the Content-Type of the HTTP
reply and immediately shuts down the connection if it's a "forbidden"
type. This is apparently common.
From the server's TCP stack point of view, this connection enters the
CLOSE_WAIT state when the FIN is received. The stack then waits for
Apache to close() the socket. However, Apache doesn't close the socket
for five minutes. That's because it's blocked waiting for a socket write
to complete, and it doesn't notice the end-of-input on the socket until
the write times out. (Yes, according to netstat, the connection remains
in CLOSE_WAIT even after the RST packet, which surprised me, but that's
how Linux works, apparently.)
If the client opens up hundreds of these connections within five
minutes, it can use up hundreds of Apache process slots. I want
connlimit to prevent that, and it looks like it should, because
conntrack should be tracking the CLOSE_WAIT connections just like any
other connections. To make sure it tracks them long enough, I've set
ip_conntrack_tcp_timeout_close_wait to 5 minutes.
However, the RST packet screws things up. As I said, the kernel ignores
the RST packet and leaves the connection in CLOSE_WAIT. But when
conntrack sees the RST packet, it marks the connection CLOSEd, and then
forgets about it 10 seconds later.
What happens next depends on whether nf_conntrack_tcp_loose is set. If
it's set to 1, the server's retransmitted packets cause a new, "fake"
connection to be ESTABLISHED in conntrack, which lingers for five
days(!). We originally had it set that way, but a couple of legitimate
customers were complaining about still being blocked from our servers
for five days after they'd actually closed all their connections.
So we set nf_conntrack_tcp_loose to 0. That solved the "blocked for five
days" problem... but now the CLOSE_WAIT connections quickly go to CLOSE
in conntrack when the RST arrives and are totally forgotten ten seconds
later. A rogue client can quickly get 40 connections into the CLOSE_WAIT
state, then wait ten seconds and open 40 more, etc., occupying up to
1200 Apache process slots within five minutes.
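The 1200 figure is plain arithmetic: conntrack forgets each batch after 10 seconds, while Apache holds each socket for about 5 minutes, so 30 batches of 40 connections can overlap. A minimal C sketch of that calculation follows; the function name and parameters are illustrative only and do not come from the kernel or from Apache.

```c
#include <assert.h>

/*
 * Illustrative helper (hypothetical name): upper bound on the server process
 * slots one client can occupy when conntrack forgets each batch of
 * connections after forget_secs, while the server keeps every socket open
 * for hold_secs.
 */
unsigned int max_slots_held(unsigned int per_batch_limit,
                            unsigned int forget_secs,
                            unsigned int hold_secs)
{
    assert(forget_secs > 0);
    /* Number of batches that can overlap within one hold period. */
    unsigned int batches = hold_secs / forget_secs;
    return per_batch_limit * batches;
}
```

With the values from this message, max_slots_held(40, 10, 300) gives 1200.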
What we really want is for conntrack to match what the kernel does: to
ignore the RST packet for CLOSE_WAIT connections, leaving the connection
to remain in the conntrack CLOSE_WAIT state until
ip_conntrack_tcp_timeout_close_wait expires. That looks easy to do with
a change to nf_conntrack_proto_tcp.c:
-/*rst*/ { sIV, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sIV },
+/*rst*/ { sIV, sCL, sCL, sCL, sCL, sCW, sCL, sCL, sCL, sIV },
... but I'd rather not maintain a custom compiled kernel just for that.
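To make the one-entry change above easier to read, here is a small standalone C sketch of the lookup it affects. It is not the kernel code: the column order (sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sLI) is my assumption about how the table in nf_conntrack_proto_tcp.c is laid out in this kernel version, and the names rst_row_stock, rst_row_patched and next_state_on_rst are made up for illustration.

```c
#include <assert.h>

/* Assumed column order of the conntrack TCP state table; sIV means "invalid". */
enum ct_state { sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sLI, sIV };
#define CT_NUM_STATES 10 /* sNO..sLI are valid column indexes */

/* The RST row before and after the proposed one-entry change. */
const enum ct_state rst_row_stock[CT_NUM_STATES] =
    { sIV, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sIV };
const enum ct_state rst_row_patched[CT_NUM_STATES] =
    { sIV, sCL, sCL, sCL, sCL, sCW, sCL, sCL, sCL, sIV };

/* Look up the state conntrack moves to when an RST arrives in state cur. */
enum ct_state next_state_on_rst(const enum ct_state *row, enum ct_state cur)
{
    assert(cur < CT_NUM_STATES);
    return row[cur];
}
```

Under that assumption, only the CLOSE_WAIT column changes: the stock row sends CLOSE_WAIT to CLOSE, the patched row leaves it in CLOSE_WAIT, and every other state behaves as before.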
So I've considered other solutions:
1. Set nf_conntrack_tcp_loose to 1, but change
ip_conntrack_tcp_timeout_established to 1 hour (instead of 5 days). This
would make sure that people aren't blocked for more than an hour after
they close all their connections. However, that's still not ideal -- and
it would also allow someone to intentionally bypass connlimit by opening
40 connections, then leaving them idle for an hour, then opening 40
more, and so on.
2. Set nf_conntrack_tcp_loose to 0, and change
nf_conntrack_tcp_timeout_close to 5 minutes (instead of 10 seconds).
This would only block people for the 5 minutes that they're still taking
up an Apache process slot, but would also block anyone who sends 40 TCP
RSTs within 5 minutes for any reason. You wouldn't think that this would
be a problem, but RSTs actually seem quite common on a busy Web server
with a fairly low HTTP keepalive value.
Does anyone have any other suggestions about how to make conntrack
remember these connections during (and only during) the five-minute
period netstat shows them as CLOSE_WAIT?
--
Robert L Mathews, Tiger Technologies http://www.tigertech.net/
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-15 22:10 conntrack and RSTs received during CLOSE_WAIT Robert L Mathews
@ 2009-05-16 21:57 ` Jozsef Kadlecsik
2009-05-17 3:09 ` Robert L Mathews
0 siblings, 1 reply; 22+ messages in thread
From: Jozsef Kadlecsik @ 2009-05-16 21:57 UTC (permalink / raw)
To: Robert L Mathews; +Cc: netfilter
Hi,
On Fri, 15 May 2009, Robert L Mathews wrote:
> I'm using Linux kernel 2.6.26 with conntrack/connlimit to prevent people from
> DOSing our Web servers by opening up too many simultaneous connections from
> one IP address. This is mostly for protection against unintentional DOSes from
> broken proxy servers that try to open up literally hundreds of simultaneous
> connections; we DROP their syn packets if they already have 40 connections
> open.
>
> This is generally working well (and thanks to folks on this list for the hard
> work that makes this possible).
>
> However: Some clients send evil TCP RSTs that confuse conntrack and break
> connlimit in a way that I'll detail below. First, here's a sample recreation:
>
> client > server [SYN] Seq=0 Len=0
> server > client [SYN,ACK] Seq=0 Ack=1 Len=0
> client > server [ACK] Seq=1 Ack=1 Len=0
> client > server [PSH,ACK] Seq=1 Ack=1 Len=420 (HTTP GET request)
> server > client [ACK] Seq=1 Ack=421 Len=0
> server > client [ACK] Seq=1 Ack=421 Len=1448 (HTTP response)
> server > client [ACK] Seq=1449 Ack=421 Len=1448 (more HTTP response)
> server > client [ACK] Seq=2897 Ack=421 Len=1448 (more HTTP response)
> client > server [FIN,ACK] Seq=421 Ack=1449 Len=0
> server > client [ACK] Seq=4345 Ack=422 Len=1448 (more HTTP response)
> server > client [ACK] Seq=5793 Ack=422 Len=1448 (more HTTP response)
> client > server [RST] Seq=421 Len=0
> server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
> server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
> server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
> server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
> server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
> server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
> server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
> server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
> server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
> server > client [ACK] Seq=2897 Ack=421 Len=1448 (retr HTTP response)
>
> Everything up to and including the "RST" takes place in under a tenth of a
> second. The remaining ten retransmits take place over 5 minutes.
The TCP session seems to be totally broken. After the client sends
client > server [FIN,ACK] Seq=421 Ack=1449 Len=0
it should send the RST packet with Seq=422 and not Seq=421. The RST
segment won't be accepted by the server.
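The reason Seq=422 is expected is that a FIN (like a SYN) consumes one sequence number even though it carries no data. A minimal sketch of that rule in C; the function name is illustrative and not taken from any TCP stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Sequence number that follows a segment: the payload bytes plus one each
 * for the SYN and FIN flags. (Illustrative helper, hypothetical name.)
 */
uint32_t seq_after_segment(uint32_t seq, uint32_t payload_len, bool syn, bool fin)
{
    /* Unsigned arithmetic wraps modulo 2^32, as TCP sequence numbers do. */
    return seq + payload_len + (syn ? 1u : 0u) + (fin ? 1u : 0u);
}
```

For the trace above, the FIN at Seq=421 with Len=0 gives 422, so anything the client sends afterwards, including an RST, should carry Seq=422.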
And I don't get the server either: after sending Ack=422 it can't send
Ack=421. Or there is an active device between the firewall and the server
which reorders the packets.
Is it a real TCP session recording or a mistyped one?
Best regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-16 21:57 ` Jozsef Kadlecsik
@ 2009-05-17 3:09 ` Robert L Mathews
2009-05-20 5:16 ` Robert L Mathews
0 siblings, 1 reply; 22+ messages in thread
From: Robert L Mathews @ 2009-05-17 3:09 UTC (permalink / raw)
To: netfilter
Jozsef Kadlecsik wrote:
> The TCP session seems to be totally broken. After the client sends
>
> client > server [FIN,ACK] Seq=421 Ack=1449 Len=0
>
> it should send the RST packet with Seq=422 and not Seq=421. The RST
> segment won't be accepted by the server.
Okay. The client is definitely sending exactly that (I'm pretty sure
it's a SonicWall firewall). That explains why the connection stays in
the CLOSE_WAIT state according to netstat.
So the problem can be described as:
Some buggy clients send an out-of-sequence RST. When that happens,
conntrack forgets about the connection ten seconds later, even though
the TCP stack doesn't.
If nf_conntrack_tcp_loose is set to 0, this gives clients a trivial way
to bypass connlimit, because the client then has open connections that
aren't counted.
If nf_conntrack_tcp_loose is set to 1, subsequent packets sent more than
ten seconds later will result in conntrack seeing a new ESTABLISHED
connection. Unfortunately, if the subsequent packets were merely TCP
retransmits (which is likely), the "new connection" will not really
exist. Connlimit counts a nonexistent connection as being open for five
days until it times out.
Both of these outcomes are obviously undesirable. Any suggestions how to
avoid this, or to minimize the impact?
> And I don't get the server either: after sending Ack=422 it can't send
> Ack=421.
>
> Is it a real TCP session recording or a mistyped one?
You're right; that was a typo on my part, for which I apologize. I had
to retype it from Wireshark, and I copied the wrong line. The ten
retransmitted packets at the end do indeed send Ack=422, just as you say
they should.
(However, the client problem is not a typo. The client definitely did
send Seq=421 in the RST, which explains why netstat shows the connection
remaining in CLOSE_WAIT and why the server continues to retransmit packets.)
--
Robert L Mathews, Tiger Technologies http://www.tigertech.net/
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-17 3:09 ` Robert L Mathews
@ 2009-05-20 5:16 ` Robert L Mathews
2009-05-20 7:19 ` Jozsef Kadlecsik
0 siblings, 1 reply; 22+ messages in thread
From: Robert L Mathews @ 2009-05-20 5:16 UTC (permalink / raw)
To: netfilter
I haven't received any more followups on this, so I'll try to hack some
solution around the problem.
But I will mention that I'm surprised that this didn't generate more
discussion. Unless I'm confused (which is possible), sending
out-of-sequence RST packets appears to be a trivial way to bypass connlimit.
It seems that all an attacker needs to do is send invalid RST packets
with a sequence number one less than the last ACK received from the
server. Then conntrack will forget about the connection, allowing the
attacker to open as many connections as desired, regardless of connlimit
limits.
I wrote a little perl script that I can leave running in the background
on the client to send the necessary RST packets. In my testing, it does
allow me to bypass connlimit restrictions on a server:
http://www.tigertech.net/patches/rawip.pl
This seems to make connlimit less useful than I'd previously believed.
Am I just misunderstanding something?
--
Robert L Mathews, Tiger Technologies http://www.tigertech.net/
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-20 5:16 ` Robert L Mathews
@ 2009-05-20 7:19 ` Jozsef Kadlecsik
2009-05-20 7:31 ` Philip Craig
2009-05-20 20:24 ` Robert L Mathews
0 siblings, 2 replies; 22+ messages in thread
From: Jozsef Kadlecsik @ 2009-05-20 7:19 UTC (permalink / raw)
To: Robert L Mathews; +Cc: netfilter
On Tue, 19 May 2009, Robert L Mathews wrote:
> But I will mention that I'm surprised that this didn't generate more
> discussion. Unless I'm confused (which is possible), sending out-of-sequence
> RST packets appears to be a trivial way to bypass connlimit.
>
> It seems that all an attacker needs to do is send invalid RST packets with a
> sequence number one less than the last ACK received from the server. Then
> conntrack will forget about the connection, allowing the attacker to open as
> many connections as desired, regardless of connlimit limits.
Without TCP window tracking in conntrack, *any* RST segment (with proper
src/dst ip/port, of course) would destroy the conntrack entry. With window
tracking enabled (the default) we can maintain a window of the sequence
numbers which are accepted and processed by conntrack. Due to the fact
that the firewall sits in the middle and packets which have been seen by
the firewall may get lost or even reordered in transit to the destination,
it is impossible to calculate the *exact* window sizes of the two end
points. Therefore the window in conntrack is wider, and conntrack may process
packets which otherwise are outside of the window of the receiver.
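As a rough illustration of the point about the wider window, the following C sketch compares an exact receiver window with one that has extra slack on the left edge. This is a deliberately simplified model and not the actual conntrack window-tracking algorithm; the function names, the slack parameter and the sample numbers are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "a is before b" comparison for 32-bit sequence numbers. */
bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/*
 * True if seq lies in [left - slack, left + win). slack = 0 models the
 * receiver itself; slack > 0 models an observer in the middle that has to
 * tolerate loss and reordering it cannot see. (Simplified model only.)
 */
bool seq_acceptable(uint32_t seq, uint32_t left, uint32_t win, uint32_t slack)
{
    uint32_t lo = left - slack;
    uint32_t hi = left + win;
    return !seq_before(seq, lo) && seq_before(seq, hi);
}
```

With left = 422 and a window of 5840 bytes, an RST with Seq=421 is rejected when slack is 0 (the endpoint's view) but accepted when slack is, say, 1448 (a middlebox-style view).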
> I wrote a little perl script that I can leave running in the background on the
> client to send the necessary RST packets. In my testing, it does allow me to
> bypass connlimit restrictions on a server:
>
> http://www.tigertech.net/patches/rawip.pl
>
> This seems to make connlimit less useful than I'd previously believed. Am I
> just misunderstanding something?
No, you are correct. If you want to eliminate the possibility of bypassing
connlimit with properly crafted RST segments, you should probably use the
recent match and count the created NEW connections.
Best regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-20 7:19 ` Jozsef Kadlecsik
@ 2009-05-20 7:31 ` Philip Craig
2009-05-20 7:42 ` Jozsef Kadlecsik
2009-05-20 20:24 ` Robert L Mathews
1 sibling, 1 reply; 22+ messages in thread
From: Philip Craig @ 2009-05-20 7:31 UTC (permalink / raw)
To: Jozsef Kadlecsik; +Cc: Robert L Mathews, netfilter
Jozsef Kadlecsik wrote:
> Without TCP window tracking in conntrack, *any* RST segment (with proper
> src/dst ip/port, of course) would destroy the conntrack entry. With window
> tracking enabled (the default) we can maintain a window of the sequence
> numbers which are accepted and processed by conntrack. Due to the fact
> that the firewall sits in the middle and packets which have been seen by
> the firewall may get lost or even reordered in transit to the destination,
> it is impossible to calculate the *exact* window sizes of the two end
> points. Therefore the window in conntrack is wider, and conntrack may process
> packets which otherwise are outside of the window of the receiver.
Is this the same reason why the window tracking accepts pure acks
without checking the sequence?
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-20 7:31 ` Philip Craig
@ 2009-05-20 7:42 ` Jozsef Kadlecsik
2009-05-20 8:06 ` Philip Craig
0 siblings, 1 reply; 22+ messages in thread
From: Jozsef Kadlecsik @ 2009-05-20 7:42 UTC (permalink / raw)
To: Philip Craig; +Cc: Robert L Mathews, netfilter
On Wed, 20 May 2009, Philip Craig wrote:
> Jozsef Kadlecsik wrote:
> > Without TCP window tracking in conntrack, *any* RST segment (with proper
> > src/dst ip/port, of course) would destroy the conntrack entry. With window
> > tracking enabled (the default) we can maintain a window of the sequence
> > numbers which are accepted and processed by conntrack. Due to the fact
> > that the firewall sits in the middle and packets which have been seen by
> > the firewall may get lost or even reordered in transit to the destination,
> > it is impossible to calculate the *exact* window sizes of the two end
> > points. Therefore the window in conntrack is wider, and conntrack may process
> > packets which otherwise are outside of the window of the receiver.
>
> Is this the same reason why the window tracking accepts pure acks
> without checking the sequence?
You mean, when the ack flag is not set in the packet, we handle it as if it
were set and had a proper ack field? What else could be done? :-)
Best regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-20 7:42 ` Jozsef Kadlecsik
@ 2009-05-20 8:06 ` Philip Craig
2009-05-20 8:43 ` Jozsef Kadlecsik
0 siblings, 1 reply; 22+ messages in thread
From: Philip Craig @ 2009-05-20 8:06 UTC (permalink / raw)
To: Jozsef Kadlecsik; +Cc: Robert L Mathews, netfilter
Jozsef Kadlecsik wrote:
> You mean, when the ack flag is not set in the packet, we handle it as it
> was set and had a proper ack field? What else could be done? :-)
No, I mean when the ack flag is set, but there is no data,
as handled by this code:
	if (seq == end
	    && (!tcph->rst
		|| (seq == 0 && state->state == TCP_CONNTRACK_SYN_SENT)))
		/*
		 * Packets contains no data: we assume it is valid
		 * and check the ack value only.
		 * However RST segments are always validated by their
		 * SEQ number, except when seq == 0 (reset sent answering
		 * SYN.
		 */
		seq = end = sender->td_end;
We've encountered this in practice where a 'tcp accelerator' was
creating a new tcp connection with all the same port numbers, but
a different sequence number, and the tcp conntrack was accepting
a pure ack packet as part of the old connection, even though the
sequence number was wrong. This setup won't work no matter what
tcp conntrack does of course, but it did complicate working out
what was going on.
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-20 8:06 ` Philip Craig
@ 2009-05-20 8:43 ` Jozsef Kadlecsik
0 siblings, 0 replies; 22+ messages in thread
From: Jozsef Kadlecsik @ 2009-05-20 8:43 UTC (permalink / raw)
To: Philip Craig; +Cc: Robert L Mathews, netfilter
On Wed, 20 May 2009, Philip Craig wrote:
> Jozsef Kadlecsik wrote:
> > You mean, when the ack flag is not set in the packet, we handle it as it
> > was set and had a proper ack field? What else could be done? :-)
>
> No, I mean when the ack flag is set, but there is no data,
> as handled by this code:
>
> if (seq == end
> && (!tcph->rst
> || (seq == 0 && state->state == TCP_CONNTRACK_SYN_SENT)))
> /*
> * Packets contains no data: we assume it is valid
> * and check the ack value only.
> * However RST segments are always validated by their
> * SEQ number, except when seq == 0 (reset sent answering
> * SYN.
> */
> seq = end = sender->td_end;
>
> We've encountered this in practice where a 'tcp accelerator' was
> creating a new tcp connection with all the same port numbers, but
> a different sequence number, and the tcp conntrack was accepting
> a pure ack packet as part of the old connection, even though the
> sequence number was wrong. This setup won't work no matter what
> tcp conntrack does of course, but it did complicate working out
> what was going on.
I see. The rationale behind not checking the sequence number in this case
is that there's no data in the packet. If the packet is out of the window
of the receiver, it'll answer with an ack with the proper seq, ack values.
But it can be argued that conntrack should still check the sequence number
of dataless packets too :-).
Best regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-20 7:19 ` Jozsef Kadlecsik
2009-05-20 7:31 ` Philip Craig
@ 2009-05-20 20:24 ` Robert L Mathews
2009-05-20 21:40 ` Jozsef Kadlecsik
1 sibling, 1 reply; 22+ messages in thread
From: Robert L Mathews @ 2009-05-20 20:24 UTC (permalink / raw)
To: netfilter
Jozsef Kadlecsik wrote:
> No, you are correct.
Hmmm, okay. I must say I'm a little surprised by that. I've seen plenty
of people using connlimit and connbytes (for example) to protect against
all kinds of things, and I don't think it's widely known that it's
trivial for an attacker to bypass those restrictions.
Anyway, though:
> If you want to eliminate the possibility to bypass
> connlimit with properly crafted RST segments, probably you should use the
> recent match and count the created NEW connections.
My goal with connlimit is to limit simultaneous connections so that it
prevents a single client from using up all the Apache process slots.
However, I don't want to limit how many connections they can open in a
period of time.
For example, it's perfectly fine for someone to open, say, 500
connections per minute, as long as they don't open more than 40 at a
time. But I do need to block the 41st simultaneous connection even from
people who open up connections very slowly, such as someone who opens up
just five connections per hour and never closes them.
Is that something the "recent" feature can help with? I'm not seeing how
that's possible, but perhaps I'm missing something.
Thanks again for the help!
--
Robert L Mathews, Tiger Technologies http://www.tigertech.net/
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-20 20:24 ` Robert L Mathews
@ 2009-05-20 21:40 ` Jozsef Kadlecsik
2009-05-21 8:17 ` Anatoly Muliarski
2009-05-21 15:31 ` Jozsef Kadlecsik
0 siblings, 2 replies; 22+ messages in thread
From: Jozsef Kadlecsik @ 2009-05-20 21:40 UTC (permalink / raw)
To: Robert L Mathews; +Cc: netfilter, netfilter-devel
[Cc-ing netfilter-devel]
On Wed, 20 May 2009, Robert L Mathews wrote:
> > It seems that all an attacker needs to do is send invalid RST packets
> > with a sequence number one less than the last ACK received from the
> > server. Then conntrack will forget about the connection, allowing the
> > attacker to open as many connections as desired, regardless of
> > connlimit limits.
> >
> > I wrote a little perl script that I can leave running in the
> > background on the client to send the necessary RST packets. In my
> > testing, it does allow me to bypass connlimit restrictions on a
> > server:
> >
> > http://www.tigertech.net/patches/rawip.pl
> >
> > This seems to make connlimit less useful than I'd previously believed.
> > Am I just misunderstanding something?
>
> > No, you are correct.
>
> Hmmm, okay. I must say I'm a little surprised by that. I've seen plenty
> of people using connlimit and connbytes (for example) to protect against
> all kinds of things, and I don't think it's widely known that it's
> trivial for an attacker to bypass those restrictions.
I think that is because it is *not* widely known. The credit is yours for
discovering how to bypass connlimit/connbytes.
> > If you want to eliminate the possibility to bypass connlimit with
> > properly crafted RST segments, probably you should use the recent
> > match and count the created NEW connections.
>
> My goal with connlimit is to limit simultaneous connections so that it
> prevents a single client from using up all the Apache process slots.
>
> However, I don't want to limit how many connections they can open in a
> period of time.
>
> For example, it's perfectly fine for someone to open, say, 500
> connections per minute, as long as they don't open more than 40 at a
> time. But I do need to block the 41st simultaneous connection even from
> people who open up connections very slowly, such as someone who opens up
> just five connections per hour and never closes them.
>
> Is that something the "recent" feature can help with? I'm not seeing how
> that's possible, but perhaps I'm missing something.
No, that's not possible with "recent".
Because connlimit/connbytes rely on conntrack, the latter should be
"fixed". However I do not see any way to make it resistant against such
attacks: if we shrink the window (by which algorithm?) we may block valid
RST segments and thus cause connections to hang instead of terminating.
Best regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-20 21:40 ` Jozsef Kadlecsik
@ 2009-05-21 8:17 ` Anatoly Muliarski
2009-05-21 9:11 ` Jozsef Kadlecsik
2009-05-21 15:31 ` Jozsef Kadlecsik
1 sibling, 1 reply; 22+ messages in thread
From: Anatoly Muliarski @ 2009-05-21 8:17 UTC (permalink / raw)
To: netfilter
I would like to add a few words.
Obviously the problem is in the conntrack code.
IMHO, to solve this issue the code should track the TCP sequence number
and check its correctness when an RST packet is received, before
deciding whether to remove the conntrack entry.
2009/5/21, Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>:
>
> Because connlimit/connbytes rely on conntrack, the latter should be
> "fixed". However I do not see any way to make it resistant against such
> attacks: if we shrink the window (by which algorithm?) we may block valid
> RST segments and thus cause connections to hang instead of terminating.
>
> Best regards,
> Jozsef
--
Best regards
Anatoly Muliarski
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-21 8:17 ` Anatoly Muliarski
@ 2009-05-21 9:11 ` Jozsef Kadlecsik
2009-05-21 18:07 ` Robert L Mathews
0 siblings, 1 reply; 22+ messages in thread
From: Jozsef Kadlecsik @ 2009-05-21 9:11 UTC (permalink / raw)
To: Anatoly Muliarski; +Cc: netfilter
On Thu, 21 May 2009, Anatoly Muliarski wrote:
> > Because connlimit/connbytes rely on conntrack, the latter should be
> > "fixed". However I do not see any way to make it resistant against such
> > attacks: if we shrink the window (by which algorithm?) we may block valid
> > RST segments and thus cause connections to hang instead of terminating.
>
> I would like to put in some words.
> Obviously the problem is in conntrack code.
> IMHO, to solve this issue the code should track tcp sequence number
> and check it correctness on receiving RST packet and on the following
> decision about removing the conntrack entry.
The TCP sequence numbers *are* tracked and checked - but with the limit of
a node being in the middle of the two communicating endpoints. That limit
is physical and cannot be discarded.
Best regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-20 21:40 ` Jozsef Kadlecsik
2009-05-21 8:17 ` Anatoly Muliarski
@ 2009-05-21 15:31 ` Jozsef Kadlecsik
2009-05-21 18:45 ` Robert L Mathews
1 sibling, 1 reply; 22+ messages in thread
From: Jozsef Kadlecsik @ 2009-05-21 15:31 UTC (permalink / raw)
To: Robert L Mathews; +Cc: netfilter, netfilter-devel
On Wed, 20 May 2009, Jozsef Kadlecsik wrote:
> Because connlimit/connbytes rely on conntrack, the latter should be
> "fixed". However I do not see any way to make it resistant against such
> attacks: if we shrink the window (by which algorithm?) we may block valid
> RST segments and thus cause connections to hang instead of terminating.
OK, here is a patch. Could you test it with your script and in your
environment?
The patch below introduces a new flag for TCP conntrack to mark that an RST
segment was seen. If retransmitted packets are detected from the other
direction after the RST segment was seen, the timeout of the conntrack
entry is increased linearly up to a hardcoded maximum. Thus we can both
catch the retransmitted packets and preserve the effectiveness of
connlimit/connbytes.
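The timeout rule described here can be sketched in isolation as follows. This is a standalone illustration working in plain seconds, whereas the kernel stores timeouts in jiffies; the function name timeout_after_rst and the two macro names are hypothetical, with the values 10 seconds and 2 minutes taken from the patch.

```c
#include <assert.h>

#define CLOSE_TIMEOUT_SECS 10u                 /* tcp_timeouts[sCL] in the patch */
#define RETRANS_AFTER_RST_MAX_SECS (2u * 60u)  /* TCP_TIMEOUT_RETRANS_AFTER_RST */

/*
 * Timeout in seconds for an entry whose peer sent an RST, given the number
 * of retransmissions counted so far. The patch takes this branch only when
 * retrans > 1, so callers are expected to check that first.
 */
unsigned int timeout_after_rst(unsigned int retrans)
{
    assert(retrans > 1);
    unsigned int scaled = CLOSE_TIMEOUT_SECS * retrans;
    return scaled < RETRANS_AFTER_RST_MAX_SECS ? scaled
                                               : RETRANS_AFTER_RST_MAX_SECS;
}
```

So 2 retransmissions give 20 seconds, 5 give 50 seconds, and anything from 12 upwards is capped at 120 seconds.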
---
diff --git a/include/linux/netfilter/nf_conntrack_tcp.h b/include/linux/netfilter/nf_conntrack_tcp.h
index 3066789..465d346 100644
--- a/include/linux/netfilter/nf_conntrack_tcp.h
+++ b/include/linux/netfilter/nf_conntrack_tcp.h
@@ -35,6 +35,9 @@ enum tcp_conntrack {
 /* Has unacknowledged data */
 #define IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED 0x10
 
+/* Has seen RST */
+#define IP_CT_TCP_FLAG_RST_SEEN 0x20
+
 struct nf_ct_tcp_flags {
 	__u8 flags;
 	__u8 mask;
diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index b5ccf2b..fca7caa 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -84,6 +84,10 @@ static unsigned int tcp_timeouts[TCP_CONNTRACK_MAX] __read_mostly = {
 	[TCP_CONNTRACK_CLOSE] = 10 SECS,
 };
 
+/* Max timeout when retransmitted packets detected after RST was seen
+   from the other direction */
+#define TCP_TIMEOUT_RETRANS_AFTER_RST 2 MINS
+
 #define sNO TCP_CONNTRACK_NONE
 #define sSS TCP_CONNTRACK_SYN_SENT
 #define sSR TCP_CONNTRACK_SYN_RECV
@@ -918,6 +922,8 @@ static int tcp_packet(struct nf_conn *ct,
 				  "nf_ct_tcp: invalid state ");
 		return -NF_ACCEPT;
 	case TCP_CONNTRACK_CLOSE:
+		if (index == TCP_RST_SET)
+			ct->proto.tcp.seen[dir].flags |= IP_CT_TCP_FLAG_RST_SEEN;
 		if (index == TCP_RST_SET
 		    && ((test_bit(IPS_SEEN_REPLY_BIT, &ct->status)
 			 && ct->proto.tcp.last_index == TCP_SYN_SET)
@@ -963,7 +969,12 @@ static int tcp_packet(struct nf_conn *ct,
 	    && new_state == TCP_CONNTRACK_FIN_WAIT)
 		ct->proto.tcp.seen[dir].flags |= IP_CT_TCP_FLAG_CLOSE_INIT;
 
-	if (ct->proto.tcp.retrans >= nf_ct_tcp_max_retrans &&
+	if (ct->proto.tcp.seen[!dir].flags & IP_CT_TCP_FLAG_RST_SEEN
+	    && ct->proto.tcp.retrans > 1)
+		timeout = min_t(unsigned int,
+				tcp_timeouts[sCL] * ct->proto.tcp.retrans,
+				TCP_TIMEOUT_RETRANS_AFTER_RST);
+	else if (ct->proto.tcp.retrans >= nf_ct_tcp_max_retrans &&
 	    tcp_timeouts[new_state] > nf_ct_tcp_timeout_max_retrans)
 		timeout = nf_ct_tcp_timeout_max_retrans;
 	else if ((ct->proto.tcp.seen[0].flags | ct->proto.tcp.seen[1].flags) &
Best regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
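A minimal sketch of the timeout rule in the patch above, assuming times in seconds rather than the kernel's jiffies. The function name `timeout_after_rst` and the two macros are hypothetical stand-ins; only the values (10 seconds for CLOSE, a 2 minute cap) are taken from the patch.

```c
/* Hypothetical stand-ins for the kernel values, expressed in seconds. */
#define CLOSE_TIMEOUT_SECS       10u   /* tcp_timeouts[TCP_CONNTRACK_CLOSE] */
#define RETRANS_AFTER_RST_SECS  120u   /* TCP_TIMEOUT_RETRANS_AFTER_RST (2 MINS) */

/*
 * Timeout picked once an RST was seen from the other direction and more
 * than one retransmission was counted (the patch only takes this branch
 * when retrans > 1): it grows linearly with the retransmission count and
 * is capped at the hardcoded maximum.
 */
static inline unsigned int timeout_after_rst(unsigned int retrans)
{
    unsigned int scaled = CLOSE_TIMEOUT_SECS * retrans;

    return scaled < RETRANS_AFTER_RST_SECS ? scaled : RETRANS_AFTER_RST_SECS;
}
```

With these values the entry lives 20 seconds after the second retransmission, 30 after the third, and stops growing at 120 seconds from the twelfth onward.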
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-21 9:11 ` Jozsef Kadlecsik
@ 2009-05-21 18:07 ` Robert L Mathews
0 siblings, 0 replies; 22+ messages in thread
From: Robert L Mathews @ 2009-05-21 18:07 UTC (permalink / raw)
To: netfilter
Jozsef Kadlecsik wrote:
> The TCP sequence numbers *are* tracked and checked - but with the limit of
> a node being in the middle of the two communicating endpoints. That limit
> is physical and cannot be discarded.
That said, would it be safe to say that if conntrack has already seen a
certain sequence number in one direction, then a RST packet with less
than sequence number + 1 is automatically invalid?
To take my original example, if the client sends a legitimate packet
with sequence number 421, then sends a following RST packet that also
has sequence number 421, the RST packet *must* be bogus, and will
(hopefully) be ignored by the server.
I realize that in practice, when our node-in-the-middle forwards the two
seq=421 packets to the server, the first ("real") packet could be lost.
Or perhaps the two packets arrive at the server in the other order.
Either case will cause the server to see the bogus RST packet before it
sees the "real" packet, and the server will "incorrectly" accept the
bogus RST. And I don't see a way for conntrack to detect whether this
happened.
However, it seems much more likely that the server will "correctly"
reject the RST. Perhaps conntrack should assume that, instead? It would
be correct more often, and it's also less harmful if we're wrong: when
we receive data that violates the TCP standard, it seems better for
conntrack to decide that a connection is still open when it's not, than
to decide that a connection is closed when it isn't.
--
Robert L Mathews, Tiger Technologies http://www.tigertech.net/
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-21 15:31 ` Jozsef Kadlecsik
@ 2009-05-21 18:45 ` Robert L Mathews
2009-05-22 4:32 ` Anatoly Muliarski
2009-05-22 7:42 ` Jozsef Kadlecsik
0 siblings, 2 replies; 22+ messages in thread
From: Robert L Mathews @ 2009-05-21 18:45 UTC (permalink / raw)
To: netfilter; +Cc: netfilter-devel
Jozsef Kadlecsik wrote:
> On Wed, 20 May 2009, Jozsef Kadlecsik wrote:
>
>> Because connlimit/connbytes rely on conntrack, the latter should be
>> "fixed". However I do not see any way to make it resistant against such
>> attacks: if we shrink the window (by which algorithm?) we may block valid
>> RST segments and thus cause connections to hang instead of terminating.
>
> OK, here is a patch. Could you test it with your script and in your
> environment?
>
> The patch below introduces a new flag for TCP conntrack to mark that an RST
> segment was seen. If retransmitted packets are detected from the other
> direction after the RST segment was detected, the timeout of the conntrack
> entry is linearly increased up to a hardcoded value. Thus we can both
> catch the retransmitted packets and preserve the effectiveness of
> connlimit/connbytes.
Perhaps I'm misunderstanding, but I don't think this will fix it.
Although the original example I gave involved TCP retransmits that
"reanimated" the connection, the problem is unfortunately not limited to
that case, and there don't need to be any TCP retransmits involved. (The
subject of this thread is now a little misleading and overly-specific,
unfortunately.)
Here's the opening of a telnet connection that I injected a bogus RST
packet into (the packet has sequence 522209353 instead of the correct
522209354):
client.52665 > server.23: S 522209223:522209223(0) win 5840 <mss
1460,sackOK,nop,wscale 7>
server.23 > client.52665: S 3233007698:3233007698(0) ack 522209224 win
5792 <mss 1460,sackOK,nop,wscale 6>
client.52665 > server.23: . ack 3233007699 win 46 <nop,nop>
server.23 > client.52665: P 3233007699:3233007711(12) ack 522209224 win
91 <nop,nop>
client.52665 > server.23: . ack 3233007711 win 46 <nop,nop>
client.52665 > server.23: P 522209224:522209236(12) ack 3233007711 win
46 <nop,nop>
server.23 > client.52665: . ack 522209236 win 91 <nop,nop>
server.23 > client.52665: P 3233007711:3233007735(24) ack 522209236 win
91 <nop,nop>
client.52665 > server.23: . ack 3233007735 win 46 <nop,nop>
client.52665 > server.23: P 522209236:522209327(91) ack 3233007735 win
46 <nop,nop>
server.23 > client.52665: P 3233007735:3233007750(15) ack 522209327 win
91 <nop,nop>
client.52665 > server.23: P 522209327:522209351(24) ack 3233007750 win
46 <nop,nop>
server.23 > client.52665: P 3233007750:3233007753(3) ack 522209351 win
91 <nop,nop>
client.52665 > server.23: P 522209351:522209354(3) ack 3233007753 win 46
<nop,nop>
server.23 > client.52665: P 3233007753:3233007947(194) ack 522209354 win
91 <nop,nop>
client.52665 > server.23: R 522209353:522209353(0) win 65535
client.52665 > server.23: . ack 3233007947 win 54 <nop,nop>
At this point, the telnet connection with the server is fully
established and waiting for me to type something (because the server
ignored the bogus RST). But conntrack incorrectly considers it CLOSEd
(because conntrack didn't ignore the RST). No retransmits were involved,
though.
Even if nf_conntrack_tcp_loose is true, and conntrack treats subsequent
packets as a new connection when I type something in telnet, I can make
it forget about the connection again by sending another bogus RST. If I
send a bogus RST after every legitimate packet, conntrack will almost
always think the open connection is actually closed.
Since no retransmits are necessary, I don't think a solution that looks
for retransmits will help, unfortunately.
--
Robert L Mathews, Tiger Technologies http://www.tigertech.net/
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-21 18:45 ` Robert L Mathews
@ 2009-05-22 4:32 ` Anatoly Muliarski
2009-05-22 7:21 ` Jozsef Kadlecsik
2009-05-22 7:42 ` Jozsef Kadlecsik
1 sibling, 1 reply; 22+ messages in thread
From: Anatoly Muliarski @ 2009-05-22 4:32 UTC (permalink / raw)
To: Robert L Mathews; +Cc: netfilter
2009/5/21 Robert L Mathews <lists@tigertech.com>:
>
> Perhaps I'm misunderstanding, but I don't think this will fix it.
>
> Although the original example I gave involved TCP retransmits that
> "reanimated" the connection, the problem is unfortunately not limited to
> that case, and there don't need to be any TCP retransmits involved. (The
> subject of this thread is now a little misleading and overly-specific,
> unfortunately.)
>
> Here's the opening of a telnet connection that I injected a bogus RST packet
> into (the packet has sequence 522209353 instead of the correct 522209354):
>
> client.52665 > server.23: S 522209223:522209223(0) win 5840 <mss
> 1460,sackOK,nop,wscale 7>
> server.23 > client.52665: S 3233007698:3233007698(0) ack 522209224 win 5792
> <mss 1460,sackOK,nop,wscale 6>
> client.52665 > server.23: . ack 3233007699 win 46 <nop,nop>
> server.23 > client.52665: P 3233007699:3233007711(12) ack 522209224 win 91
> <nop,nop>
> client.52665 > server.23: . ack 3233007711 win 46 <nop,nop>
> client.52665 > server.23: P 522209224:522209236(12) ack 3233007711 win 46
> <nop,nop>
> server.23 > client.52665: . ack 522209236 win 91 <nop,nop>
> server.23 > client.52665: P 3233007711:3233007735(24) ack 522209236 win 91
> <nop,nop>
> client.52665 > server.23: . ack 3233007735 win 46 <nop,nop>
> client.52665 > server.23: P 522209236:522209327(91) ack 3233007735 win 46
> <nop,nop>
> server.23 > client.52665: P 3233007735:3233007750(15) ack 522209327 win 91
> <nop,nop>
> client.52665 > server.23: P 522209327:522209351(24) ack 3233007750 win 46
> <nop,nop>
> server.23 > client.52665: P 3233007750:3233007753(3) ack 522209351 win 91
> <nop,nop>
> client.52665 > server.23: P 522209351:522209354(3) ack 3233007753 win 46
> <nop,nop>
> server.23 > client.52665: P 3233007753:3233007947(194) ack 522209354 win 91
> <nop,nop>
> client.52665 > server.23: R 522209353:522209353(0) win 65535
> client.52665 > server.23: . ack 3233007947 win 54 <nop,nop>
>
> At this point, the telnet connection with the server is fully established
> and waiting for me to type something (because the server ignored the bogus
> RST). But conntrack incorrectly considers it CLOSEd (because conntrack
> didn't ignore the RST). No retransmits were involved, though.
>
> Even if nf_conntrack_tcp_loose is true, and conntrack treats subsequent
> packets as a new connection when I type something in telnet, I can make it
> forget about the connection again by sending another bogus RST. If I send a
> bogus RST after every legitimate packet, conntrack will almost always think
> the open connection is actually closed.
>
> Since no retransmits are necessary, I don't think a solution that looks for
> retransmits will help, unfortunately.
>
> --
> Robert L Mathews, Tiger Technologies http://www.tigertech.net/
The problem is that an RST packet with a wrong sequence number
must be treated as INVALID.
IMHO, the conntrack code should be changed accordingly.
I wrote a fix (sorry, I'm not experienced with the kernel TCP stack;
correct me if I'm wrong somewhere).
*** nf_conntrack_proto_tcp.c 2009-05-22 07:22:43.000000000 +0300
--- /usr/src/linux/1/nf_conntrack_proto_tcp.c 2008-09-08 13:20:51.000000000 +0300
*************** static int tcp_packet(struct nf_conn *ct
*** 969,979 ****
problem case, so we can delete the conntrack
immediately. --RR */
if (th->rst) {
! if (ntohl(th->seq) < ct->proto.tcp.last_seq) {
! ct->proto.tcp.state = old_state;
! return -NF_DROP;
! }
! else if (del_timer(&ct->timeout))
ct->timeout.function((unsigned long)ct);
return NF_ACCEPT;
}
--- 969,975 ----
problem case, so we can delete the conntrack
immediately. --RR */
if (th->rst) {
! if (del_timer(&ct->timeout))
ct->timeout.function((unsigned long)ct);
return NF_ACCEPT;
}
--
Best regards
Anatoly Muliarski
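One detail of the fix above, shown as a sketch: `ntohl(th->seq) < ct->proto.tcp.last_seq` is a plain unsigned comparison, so it misorders sequence numbers once the 32-bit sequence space wraps. The helpers below use the signed-difference idea behind the kernel's `before()`/`after()` macros; the names `seq_before` and `seq_after` are hypothetical, chosen for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if sequence number a comes before b, modulo 2^32. */
static inline bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* True if sequence number a comes after b, modulo 2^32. */
static inline bool seq_after(uint32_t a, uint32_t b)
{
    return seq_before(b, a);
}
```

For example, `seq_before(0xFFFFFFF0u, 0x10u)` is true because 0x10 lies 32 bytes past the wrap point, whereas the plain test `0xFFFFFFF0u < 0x10u` is false.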
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-22 4:32 ` Anatoly Muliarski
@ 2009-05-22 7:21 ` Jozsef Kadlecsik
2009-05-22 8:26 ` Anatoly Muliarski
0 siblings, 1 reply; 22+ messages in thread
From: Jozsef Kadlecsik @ 2009-05-22 7:21 UTC (permalink / raw)
To: Anatoly Muliarski; +Cc: Robert L Mathews, netfilter
On Fri, 22 May 2009, Anatoly Muliarski wrote:
> The problem is that an RST packet with a wrong sequence number
> must be treated as INVALID.
> IMHO, the conntrack code should be changed accordingly.
>
> I wrote a fix (sorry, I'm not experienced with the kernel TCP stack;
> correct me if I'm wrong somewhere).
>
> *** nf_conntrack_proto_tcp.c 2009-05-22 07:22:43.000000000 +0300
> --- /usr/src/linux/1/nf_conntrack_proto_tcp.c 2008-09-08
> 13:20:51.000000000 +0300
> *************** static int tcp_packet(struct nf_conn *ct
> *** 969,979 ****
> problem case, so we can delete the conntrack
> immediately. --RR */
> if (th->rst) {
> ! if (ntohl(th->seq) < ct->proto.tcp.last_seq) {
> ! ct->proto.tcp.state = old_state;
> ! return -NF_DROP;
> ! }
> ! else if (del_timer(&ct->timeout))
> ct->timeout.function((unsigned long)ct);
> return NF_ACCEPT;
> }
> --- 969,975 ----
> problem case, so we can delete the conntrack
> immediately. --RR */
> if (th->rst) {
> ! if (del_timer(&ct->timeout))
> ct->timeout.function((unsigned long)ct);
> return NF_ACCEPT;
> }
That won't work: the packet whose sequence number was recorded in last_seq
may get lost in transit to the destination, and we may receive a *valid*
RST with a sequence number less than the one in last_seq.
This is the main problem: we can never be sure that the packets seen
by the firewall really reach the destination or that their order is preserved.
Best regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
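The objection above can be illustrated with a small sketch. It assumes the classic RFC 793 rule that a receiver accepts an RST whose sequence number falls inside its receive window; the struct, the function names, and the numbers in the usage note are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* What the real endpoint knows; the firewall can only guess at this. */
struct receiver {
    uint32_t rcv_nxt;   /* next sequence number the endpoint expects */
    uint32_t rcv_wnd;   /* size of its receive window */
};

/* RFC 793 style check: the RST is valid if it lies in the receive window. */
static inline bool receiver_accepts_rst(const struct receiver *r, uint32_t seq)
{
    return (uint32_t)(seq - r->rcv_nxt) < r->rcv_wnd;
}

/* The check from the proposed fix, as seen from the firewall. */
static inline bool last_seq_check_drops(uint32_t last_seq, uint32_t rst_seq)
{
    return rst_seq < last_seq;
}
```

Suppose the firewall recorded a segment with sequence number 521 that was then lost, so the endpoint still expects 421. An RST with sequence number 421 is dropped by the firewall check (`last_seq_check_drops(521, 421)` is true) even though the endpoint with `rcv_nxt = 421` would accept it.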
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-21 18:45 ` Robert L Mathews
2009-05-22 4:32 ` Anatoly Muliarski
@ 2009-05-22 7:42 ` Jozsef Kadlecsik
1 sibling, 0 replies; 22+ messages in thread
From: Jozsef Kadlecsik @ 2009-05-22 7:42 UTC (permalink / raw)
To: Robert L Mathews; +Cc: netfilter, netfilter-devel
On Thu, 21 May 2009, Robert L Mathews wrote:
> Jozsef Kadlecsik wrote:
> > On Wed, 20 May 2009, Jozsef Kadlecsik wrote:
> >
> > > Because connlimit/connbytes rely on conntrack, the latter should be
> > > "fixed". However I do not see any way to make it resistant against such
> > > attacks: if we shrink the window (by which algorithm?) we may block valid
> > > RST segments and thus cause connections to hang instead of terminating.
> >
> > OK, here is a patch. Could you test it with your script and in your
> > environment?
> >
> > The patch below introduces a new flag for TCP conntrack to mark that an RST
> > segment was seen. If retransmitted packets are detected from the other direction
> > after the RST segment was detected, the timeout of the conntrack entry is
> > linearly increased up to a hardcoded value. Thus we can both catch the
> > retransmitted packets and preserve the effectiveness of connlimit/connbytes.
>
> Perhaps I'm misunderstanding, but I don't think this will fix it.
>
> Although the original example I gave involved TCP retransmits that
> "reanimated" the connection, the problem is unfortunately not limited to that
> case, and there don't need to be any TCP retransmits involved. (The subject of
> this thread is now a little misleading and overly-specific, unfortunately.)
Yes, I was concentrating on the original example.
> Here's the opening of a telnet connection that I injected a bogus RST packet
> into (the packet has sequence 522209353 instead of the correct 522209354):
>
> client.52665 > server.23: S 522209223:522209223(0) win 5840 <mss
> 1460,sackOK,nop,wscale 7>
> server.23 > client.52665: S 3233007698:3233007698(0) ack 522209224 win 5792
> <mss 1460,sackOK,nop,wscale 6>
> client.52665 > server.23: . ack 3233007699 win 46 <nop,nop>
> server.23 > client.52665: P 3233007699:3233007711(12) ack 522209224 win 91
> <nop,nop>
> client.52665 > server.23: . ack 3233007711 win 46 <nop,nop>
> client.52665 > server.23: P 522209224:522209236(12) ack 3233007711 win 46
> <nop,nop>
> server.23 > client.52665: . ack 522209236 win 91 <nop,nop>
> server.23 > client.52665: P 3233007711:3233007735(24) ack 522209236 win 91
> <nop,nop>
> client.52665 > server.23: . ack 3233007735 win 46 <nop,nop>
> client.52665 > server.23: P 522209236:522209327(91) ack 3233007735 win 46
> <nop,nop>
> server.23 > client.52665: P 3233007735:3233007750(15) ack 522209327 win 91
> <nop,nop>
> client.52665 > server.23: P 522209327:522209351(24) ack 3233007750 win 46
> <nop,nop>
> server.23 > client.52665: P 3233007750:3233007753(3) ack 522209351 win 91
> <nop,nop>
> client.52665 > server.23: P 522209351:522209354(3) ack 3233007753 win 46
> <nop,nop>
> server.23 > client.52665: P 3233007753:3233007947(194) ack 522209354 win 91
> <nop,nop>
> client.52665 > server.23: R 522209353:522209353(0) win 65535
> client.52665 > server.23: . ack 3233007947 win 54 <nop,nop>
>
> At this point, the telnet connection with the server is fully established and
> waiting for me to type something (because the server ignored the bogus RST).
> But conntrack incorrectly considers it CLOSEd (because conntrack didn't ignore
> the RST). No retransmits were involved, though.
>
> Even if nf_conntrack_tcp_loose is true, and conntrack treats subsequent
> packets as a new connection when I type something in telnet, I can make it
> forget about the connection again by sending another bogus RST. If I send a
> bogus RST after every legitimate packet, conntrack will almost always think
> the open connection is actually closed.
But if nf_conntrack_tcp_loose is disabled, then the conntrack entry is
destroyed and the connection won't be "picked up" when the normal
packets are seen after the bogus RST.
I do not think that it's a problem that an attacker can destroy conntrack
entries by forged RST segments: it cannot be prevented. However the
attacker *can* open up a new connection again and thus overcome the
limitation of connlimit.
Best regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-22 7:21 ` Jozsef Kadlecsik
@ 2009-05-22 8:26 ` Anatoly Muliarski
2009-05-22 8:54 ` Jozsef Kadlecsik
0 siblings, 1 reply; 22+ messages in thread
From: Anatoly Muliarski @ 2009-05-22 8:26 UTC (permalink / raw)
To: Jozsef Kadlecsik; +Cc: netfilter
2009/5/22, Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>:
> On Fri, 22 May 2009, Anatoly Muliarski wrote:
>
> > The problem is that an RST packet with a wrong sequence number
> > must be treated as INVALID.
> > IMHO, the conntrack code should be changed accordingly.
> >
> > I wrote a fix (sorry, I'm not experienced with the kernel TCP stack;
> > correct me if I'm wrong somewhere).
> >
> > *** nf_conntrack_proto_tcp.c 2009-05-22 07:22:43.000000000 +0300
> > --- /usr/src/linux/1/nf_conntrack_proto_tcp.c 2008-09-08
> > 13:20:51.000000000 +0300
> > *************** static int tcp_packet(struct nf_conn *ct
> > *** 969,979 ****
> > problem case, so we can delete the conntrack
> > immediately. --RR */
> > if (th->rst) {
> > ! if (ntohl(th->seq) < ct->proto.tcp.last_seq) {
> > ! ct->proto.tcp.state = old_state;
> > ! return -NF_DROP;
> > ! }
> > ! else if (del_timer(&ct->timeout))
> > ct->timeout.function((unsigned long)ct);
> > return NF_ACCEPT;
> > }
> > --- 969,975 ----
> > problem case, so we can delete the conntrack
> > immediately. --RR */
> > if (th->rst) {
> > ! if (del_timer(&ct->timeout))
> > ct->timeout.function((unsigned long)ct);
> > return NF_ACCEPT;
> > }
>
> That won't work: the packet whose sequence number was recorded in last_seq
> may get lost in transit to the destination, and we may receive a *valid*
> RST with a sequence number less than the one in last_seq.
>
> This is the main problem: we can never be sure that the packets seen
> by the firewall really reach the destination or that their order is preserved.
>
> Best regards,
> Jozsef
OK.
We could save the LAST sequence number as the current one.
That way we keep the connection, mark the current RST as invalid, and
react correctly to the following ones. Unfortunately this does not
solve the main problem: we are unable to know whether the received sequence
number is valid or not. As a vague idea, we could track the ack
number from the other direction and so keep the last delivered sequence
number. What do you think about it?
I added a line to my patch.
*** nf_conntrack_proto_tcp.c 2009-05-22 11:00:29.000000000 +0300
--- /usr/src/linux/1/nf_conntrack_proto_tcp.c 2008-09-08 13:20:51.000000000 +0300
*************** static int tcp_packet(struct nf_conn *ct
*** 969,980 ****
problem case, so we can delete the conntrack
immediately. --RR */
if (th->rst) {
! if (ntohl(th->seq) < ct->proto.tcp.last_seq) {
! ct->proto.tcp.state = old_state;
! ct->proto.tcp.last_seq = ntohl(th->seq);
! return -NF_DROP;
! }
! else if (del_timer(&ct->timeout))
ct->timeout.function((unsigned long)ct);
return NF_ACCEPT;
}
--- 969,975 ----
problem case, so we can delete the conntrack
immediately. --RR */
if (th->rst) {
! if (del_timer(&ct->timeout))
ct->timeout.function((unsigned long)ct);
return NF_ACCEPT;
}
--
Best regards
Anatoly Muliarski
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-22 8:26 ` Anatoly Muliarski
@ 2009-05-22 8:54 ` Jozsef Kadlecsik
2009-05-22 11:27 ` Jozsef Kadlecsik
0 siblings, 1 reply; 22+ messages in thread
From: Jozsef Kadlecsik @ 2009-05-22 8:54 UTC (permalink / raw)
To: Anatoly Muliarski; +Cc: netfilter, netfilter-devel
On Fri, 22 May 2009, Anatoly Muliarski wrote:
> > This is the main problem: we can never be sure that the packets seen
> > by the firewall really reach the destination or that their order is preserved.
> We could save the LAST sequence number as the current one.
> That way we keep the connection, mark the current RST as invalid, and
> react correctly to the following ones. Unfortunately this does not
> solve the main problem: we are unable to know whether the received sequence
> number is valid or not. As a vague idea, we could track the ack
> number from the other direction and so keep the last delivered sequence
> number. What do you think about it?
Relying on the last ACK received from the other direction looks promising.
We record the last (highest) ACK sent by each endpoint, which guarantees
that the endpoint did indeed receive the data it acknowledged. And we accept
an RST segment only if it's in the window we calculate (wider than the
destination's) AND equal to or higher than the saved last ACK from the other
direction.
The only downside is that new fields must be added to struct ip_ct_tcp.
Best regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
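The acceptance rule described above can be sketched as a single predicate. This is an illustration under stated assumptions, not the conntrack code: `struct rst_check`, its fields, and `rst_acceptable` are hypothetical names, and the window is simplified to a start and a length.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "a is before b" for 32-bit sequence numbers. */
static inline bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Hypothetical summary of what the firewall tracks for the RST sender. */
struct rst_check {
    uint32_t win_start;    /* lowest sequence number still treated as in-window */
    uint32_t win_len;      /* length of the window the firewall calculates */
    bool     have_ack;     /* has the other endpoint sent any ACK yet? */
    uint32_t peer_maxack;  /* highest ACK seen from the other endpoint */
};

/* Accept the RST only if it is in the window AND not below the peer's ACK. */
static inline bool rst_acceptable(const struct rst_check *c, uint32_t rst_seq)
{
    bool in_window = (uint32_t)(rst_seq - c->win_start) < c->win_len;
    bool not_stale = !c->have_ack || !seq_before(rst_seq, c->peer_maxack);

    return in_window && not_stale;
}
```

With `peer_maxack` set to 522209354, the bogus RST from the telnet trace (sequence 522209353) is rejected although it is well inside the window, while an RST carrying 522209354 passes both tests.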
* Re: conntrack and RSTs received during CLOSE_WAIT
2009-05-22 8:54 ` Jozsef Kadlecsik
@ 2009-05-22 11:27 ` Jozsef Kadlecsik
0 siblings, 0 replies; 22+ messages in thread
From: Jozsef Kadlecsik @ 2009-05-22 11:27 UTC (permalink / raw)
To: Robert L Mathews, Anatoly Muliarski; +Cc: netfilter, netfilter-devel
On Fri, 22 May 2009, Jozsef Kadlecsik wrote:
> On Fri, 22 May 2009, Anatoly Muliarski wrote:
>
> > > This is the main problem: we can never be sure that the packets seen
> > > by the firewall really reach the destination or that their order is preserved.
>
> > We could save the LAST sequence number as the current one.
> > That way we keep the connection, mark the current RST as invalid, and
> > react correctly to the following ones. Unfortunately this does not
> > solve the main problem: we are unable to know whether the received sequence
> > number is valid or not. As a vague idea, we could track the ack
> > number from the other direction and so keep the last delivered sequence
> > number. What do you think about it?
>
> Relying on the last ACK received from the other direction looks promising.
> We record the last (highest) ACK sent by each endpoint, which guarantees
> that the endpoint did indeed receive the data it acknowledged. And we accept
> an RST segment only if it's in the window we calculate (wider than the
> destination's) AND equal to or higher than the saved last ACK from the other
> direction.
>
> The only downside is that new fields must be added to struct ip_ct_tcp.
So here is the patch which adds checking RST segments against the highest
ack we have seen from the other direction. I tested it with your script and
conntrack withstood the bogus RST segments.
diff --git a/include/linux/netfilter/nf_conntrack_tcp.h b/include/linux/netfilter/nf_conntrack_tcp.h
index 3066789..b2f384d 100644
--- a/include/linux/netfilter/nf_conntrack_tcp.h
+++ b/include/linux/netfilter/nf_conntrack_tcp.h
@@ -35,6 +35,9 @@ enum tcp_conntrack {
/* Has unacknowledged data */
#define IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED 0x10
+/* The field td_maxack has been set */
+#define IP_CT_TCP_FLAG_MAXACK_SET 0x20
+
struct nf_ct_tcp_flags {
__u8 flags;
__u8 mask;
@@ -46,6 +49,7 @@ struct ip_ct_tcp_state {
u_int32_t td_end; /* max of seq + len */
u_int32_t td_maxend; /* max of ack + max(win, 1) */
u_int32_t td_maxwin; /* max(win) */
+ u_int32_t td_maxack; /* max of ack */
u_int8_t td_scale; /* window scale factor */
u_int8_t flags; /* per direction options */
};
diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index b5ccf2b..97a6e93 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -634,6 +634,14 @@ static bool tcp_in_window(const struct nf_conn *ct,
sender->td_end = end;
sender->flags |= IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED;
}
+ if (tcph->ack) {
+ if (!(sender->flags & IP_CT_TCP_FLAG_MAXACK_SET)) {
+ sender->td_maxack = ack;
+ sender->flags |= IP_CT_TCP_FLAG_MAXACK_SET;
+ } else if (after(ack, sender->td_maxack))
+ sender->td_maxack = ack;
+ }
+
/*
* Update receiver data.
*/
@@ -919,6 +927,16 @@ static int tcp_packet(struct nf_conn *ct,
return -NF_ACCEPT;
case TCP_CONNTRACK_CLOSE:
if (index == TCP_RST_SET
+ && (ct->proto.tcp.seen[!dir].flags & IP_CT_TCP_FLAG_MAXACK_SET)
+ && before(ntohl(th->seq), ct->proto.tcp.seen[!dir].td_maxack)) {
+ /* Invalid RST */
+ write_unlock_bh(&tcp_lock);
+ if (LOG_INVALID(net, IPPROTO_TCP))
+ nf_log_packet(pf, 0, skb, NULL, NULL, NULL,
+ "nf_ct_tcp: invalid RST ");
+ return -NF_ACCEPT;
+ }
+ if (index == TCP_RST_SET
&& ((test_bit(IPS_SEEN_REPLY_BIT, &ct->status)
&& ct->proto.tcp.last_index == TCP_SYN_SET)
|| (!test_bit(IPS_ASSURED_BIT, &ct->status)
Best regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
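As a rough illustration of how the patch above behaves on the telnet trace posted earlier in the thread, the sketch below replays the server's ACK numbers through a simplified tracker. It is a user-space model, not the kernel code: `struct dir_state`, `note_ack`, and `rst_is_stale` are hypothetical names, with a boolean standing in for `IP_CT_TCP_FLAG_MAXACK_SET`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "a is before b" for 32-bit sequence numbers. */
static inline bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Simplified per-direction state: the highest ACK this endpoint has sent. */
struct dir_state {
    uint32_t td_maxack;
    bool     maxack_set;
};

/* Record an ACK sent by this endpoint, keeping only the highest one. */
static inline void note_ack(struct dir_state *sender, uint32_t ack)
{
    if (!sender->maxack_set) {
        sender->td_maxack = ack;
        sender->maxack_set = true;
    } else if (seq_before(sender->td_maxack, ack)) {
        sender->td_maxack = ack;
    }
}

/* An RST is stale if the peer has already acknowledged past its sequence. */
static inline bool rst_is_stale(const struct dir_state *peer, uint32_t rst_seq)
{
    return peer->maxack_set && seq_before(rst_seq, peer->td_maxack);
}
```

Feeding in the server's ACKs from the trace (522209224, 522209236, 522209327, 522209351, 522209354) leaves `td_maxack` at 522209354, so the injected RST with sequence 522209353 is classified as stale, while an RST with the correct sequence 522209354 is not.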
Thread overview: 22+ messages
2009-05-15 22:10 conntrack and RSTs received during CLOSE_WAIT Robert L Mathews
2009-05-16 21:57 ` Jozsef Kadlecsik
2009-05-17 3:09 ` Robert L Mathews
2009-05-20 5:16 ` Robert L Mathews
2009-05-20 7:19 ` Jozsef Kadlecsik
2009-05-20 7:31 ` Philip Craig
2009-05-20 7:42 ` Jozsef Kadlecsik
2009-05-20 8:06 ` Philip Craig
2009-05-20 8:43 ` Jozsef Kadlecsik
2009-05-20 20:24 ` Robert L Mathews
2009-05-20 21:40 ` Jozsef Kadlecsik
2009-05-21 8:17 ` Anatoly Muliarski
2009-05-21 9:11 ` Jozsef Kadlecsik
2009-05-21 18:07 ` Robert L Mathews
2009-05-21 15:31 ` Jozsef Kadlecsik
2009-05-21 18:45 ` Robert L Mathews
2009-05-22 4:32 ` Anatoly Muliarski
2009-05-22 7:21 ` Jozsef Kadlecsik
2009-05-22 8:26 ` Anatoly Muliarski
2009-05-22 8:54 ` Jozsef Kadlecsik
2009-05-22 11:27 ` Jozsef Kadlecsik
2009-05-22 7:42 ` Jozsef Kadlecsik