Linux Netfilter discussions
* Weird nat/conntrack Problem with PASV FTP upload
@ 2008-06-05  9:02 Thomas Bätzler
  2008-06-05 13:59 ` Patrick McHardy
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Bätzler @ 2008-06-05  9:02 UTC (permalink / raw)
  To: netfilter

Hi & thank you for taking the time to have a look at this.

The basic setup is like this:

(ftp client)<=>(my nat box)<=big pipe=>(their nat box)<=>(ftp server)

The FTP client is PHP5's FTP library on a Debian Etch box with kernel 2.6.23 built from a Debian source package.
My NAT box is also Debian Etch, recently upgraded to 2.6.25 using the current Debian source package.
I don't know much about the remote side, except that their FTP server is supposedly ProFTPd on Debian Etch.

We use PASV FTP transfers for our uploads and that's been working o.k. for us most of the time.

I say "most of the time" because we lose the data connection in about 1% of the transfers (mostly files in the 100 kB to 5 MB range).

I've tcpdump'ed some of those transfers on the external interface of my NAT box and on the client, and I don't understand what's going on. Let me give you an example:

tcpdump -rtttS on myclient:

000000 IP myclient.56785 > server.39790: SWE 427872165:427872165(0) win 5840 <mss 1460,sackOK,timestamp 481846634 0,nop,wscale 7>
015646 IP server.39790 > myclient.56785: SE 2283192455:2283192455(0) ack 427872166 win 5792 <mss 1460,sackOK,timestamp 16317902 481846634,nop,wscale 7>
000010 IP myclient.56785 > server.39790: . ack 2283192456 win 46 <nop,nop,timestamp 481846636 16317902>
[...]
000002 IP myclient.56785 > server.39790: . 428128803:428131699(2896) ack 2283192456 win 46 <nop,nop,timestamp 481846649 16317934>
000784 IP server.39790 > myclient.56785: . ack 428044995 win 696 <nop,nop,timestamp 16317935 481846647,nop,nop,sack 1 {428046443:428047891}>
000006 IP myclient.56785 > server.39790: . 428131699:428134083(2384) ack 2283192456 win 46 <nop,nop,timestamp 481846649 16317935>
000002 IP server.39790 > myclient.56785: . ack 428047891 win 686 <nop,nop,timestamp 16317935 481846647>
000003 IP myclient.56785 > server.39790: . 428134083:428135699(1616) ack 2283192456 win 46 <nop,nop,timestamp 481846649 16317935>
000004 IP server.39790 > myclient.56785: . ack 428049339 win 675 <nop,nop,timestamp 16317935 481846647>

tcpdump -rtttS on natbox:

000000 IP mynatbox.56785 > server.39790: SWE 427872165:427872165(0) win 5840 <mss 1460,sackOK,timestamp 481846634 0,nop,wscale 7>
015564 IP server.39790 > mynatbox.56785: SE 2283192455:2283192455(0) ack 427872166 win 5792 <mss 1460,sackOK,timestamp 16317902 481846634,nop,wscale 7>
000062 IP mynatbox.56785 > server.39790: . ack 2283192456 win 46 <nop,nop,timestamp 481846636 16317902>
[...]
000004 IP mynatbox.56785 > server.39790: . 428128803:428130251(1448) ack 2283192456 win 46 <nop,nop,timestamp 481846649 16317934>
000034 IP mynatbox.56785 > server.39790: . 428130251:428131699(1448) ack 2283192456 win 46 <nop,nop,timestamp 481846649 16317934>
000560 IP server.39790 > mynatbox.56785: . ack 428042099 win 700 <nop,nop,timestamp 16317935 481846647,nop,nop,sack 1 {428046443:428047891}>
000020 IP mynatbox.56785 > server.39790: R 428042099:428042099(0) win 0
000002 IP server.39790 > mynatbox.56785: . ack 428042099 win 700 <nop,nop,timestamp 16317935 481846647,nop,nop,sack 2 {428043547:428044995}{428046443:428047891}>
000005 IP mynatbox.56785 > server.39790: R 428042099:428042099(0) win 0
000002 IP server.39790 > mynatbox.56785: . ack 428044995 win 696 <nop,nop,timestamp 16317935 481846647,nop,nop,sack 1 {428046443:428047891}>
000006 IP server.39790 > mynatbox.56785: . ack 428047891 win 686 <nop,nop,timestamp 16317935 481846647>
000005 IP server.39790 > mynatbox.56785: . ack 428049339 win 675 <nop,nop,timestamp 16317935 481846647>


Now I don't know why myclient thinks it's sending 2k+ byte segments, since its interface MTU is definitely 1500 and it also agreed on an MSS of 1460. Since myclient's NIC is an e1000, it might be TCP segmentation offload at work.
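
The 2896-byte segment in the client trace is at least consistent with segmentation offload: with TCP timestamps in use, each segment carries 12 bytes of options, so an MSS of 1460 leaves 1448 bytes of payload, and the NAT box sees that same data as two 1448-byte segments. A small sketch of the arithmetic follows. The interface name eth0 in the commented ethtool lines is an assumption, not something taken from the setup above.

```shell
#!/bin/sh
# Payload per segment when the TCP timestamp option (10 bytes, padded to 12) is present.
mss=1460
ts_option=12
payload=$((mss - ts_option))

# Length of the large segment reported by tcpdump on myclient.
client_seg=2896
wire_segs=$((client_seg / payload))
remainder=$((client_seg % payload))

echo "payload=$payload wire_segs=$wire_segs remainder=$remainder"

# To check or disable offload on the client (interface name eth0 is assumed):
#   ethtool -k eth0 | grep -i segmentation
#   ethtool -K eth0 tso off
```

If offload is the cause, the oversized segments should disappear from the client-side capture once TSO is switched off, while the capture on the NAT box stays the same.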

What's really scaring me, though, is that natbox tries to tear down the data connection for no apparent reason. As in the example shown, it seems to happen mostly when server sends a selective ack for an out-of-order segment. Sometimes server just shrugs the RST off and keeps on acking data, but at other times it gives in and tears down the connection.

I'm grateful for any pointer or explanation you might have for me. Right now I'm at my wit's end.

TIA,
Thomas
-- 
BRINGE Informationstechnik GmbH
Zur Seeplatte 12
D-76228 Karlsruhe
Germany

Fon: +49 721 94246-0
Fon: +49 171 5438457
Fax: +49 721 94246-66
Web: http://www.bringe.de/

Geschäftsführer: Dipl.-Ing. (FH) Martin Bringe
Ust.Id: DE812936645, HRB 108943 Mannheim 

* RE: Weird nat/conntrack Problem with PASV FTP upload
@ 2008-06-09  8:58 Thomas Bätzler
  2008-06-09  9:14 ` Jozsef Kadlecsik
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Bätzler @ 2008-06-09  8:58 UTC (permalink / raw)
  To: netfilter

Patrick McHardy wrote:

> Thomas Bätzler wrote:
> > iptables -t nat -A PREROUTING -m state --state INVALID \
> >   -j  LOG
> > iptables -t nat -A PREROUTING -m state --state INVALID \
> >   -j DROP
> 
> These rules need to go in mangle; the nat table is only 
> traversed for the first packet of a connection.

I've changed my ruleset as you suggested, and now I'm seeing
packets being filtered. I'll wait and see how that's affecting
stability and throughput of the connection.
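
For reference, a minimal sketch of what the changed rules might look like: the same two rules as quoted above, with only the table switched from nat to mangle. The dry-run wrapper `run` is a hypothetical helper added here so the commands can be printed without root. It is not part of iptables.

```shell
#!/bin/sh
# Dry-run by default: prints the commands; set RUN=1 (as root) to apply them.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi; }

# Log, then drop, every packet that conntrack classifies as INVALID.
# The mangle table is traversed for every packet, not only the first
# packet of a connection as is the case for the nat table.
run iptables -t mangle -A PREROUTING -m state --state INVALID -j LOG
run iptables -t mangle -A PREROUTING -m state --state INVALID -j DROP
```

The LOG rule has to come before the DROP rule, since DROP ends rule traversal for the packet.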

In any case I'm wondering why netfilter doesn't consider these
packets to be part of a connection. Is there a known problem
with netfilter and TCP SACK? Or did I miss something while
looking at the rejected packets? I've enabled logging via ulogd
now and will have a look at what's being filtered.

In the meantime thanks a lot for your help!

Cheers,
Thomas

* RE: Weird nat/conntrack Problem with PASV FTP upload
@ 2008-06-09 10:36 Thomas Bätzler
  2008-06-09 11:21 ` Jozsef Kadlecsik
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Bätzler @ 2008-06-09 10:36 UTC (permalink / raw)
  To: netfilter

Jozsef Kadlecsik wrote:
> On Mon, 9 Jun 2008, Thomas Bätzler wrote:
>> In any case I'm wondering why netfilter doesn't consider
>> these packets  to be part of a connection. Is there a
>> known problem with netfilter and TCP SACK?
> 
> In these cases usually there is a device sitting between the 
> firewall running netfilter and the server/client machine, 
> which randomizes the TCP sequence numbers but fails to 
> propagate the changes to the SACK fields. 
> Thus the SACK values are totally bogus and therefore 
> netfilter marks them as INVALID.

I thought of that, too, but it doesn't seem to be the case.
Let's have a look at an excerpt (tcpdump -S):

22:37:44.830784 IP gateway.41803 > server.37890: SWE 1599996997:1599996997(0) win 5840 <mss 1460,sackOK,timestamp 495055487 0,nop,wscale 7>
22:37:44.846411 IP server.37890 > gateway.41803: SE 23582050:23582050(0) ack 1599996998 win 5792 <mss 1460,sackOK,timestamp 49338617 495055487,nop,wscale 7>
22:37:44.846533 IP gateway.41803 > server.37890: . ack 23582051 win 46 <nop,nop,timestamp 495055488 49338617>

[...]

22:37:44.974253 IP gateway.41803 > server.37890: . 1600282667:1600284115(1448) ack 23582051 win 46 <nop,nop,timestamp 495055501 49338649>

[...]

22:37:44.989794 IP server.37890 > gateway.41803: . ack 1600281219 win 892 <nop,nop,timestamp 49338653 495055501>
22:37:44.990228 IP gateway.41803 > server.37890: . 1600358775:1600360223(1448) ack 23582051 win 46 <nop,nop,timestamp 495055502 49338653>

[...]

22:37:44.990397 IP server.37890 > gateway.41803: . ack 1600281219 win 892 <nop,nop,timestamp 49338653 495055501,nop,nop,sack 1 {1600282667:1600284115}>
22:37:44.990417 IP gateway.41803 > server.37890: R 1600281219:1600281219(0) win 0

As you can see, the SACK data matches a previously sent segment,
so it's not scrambled.
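
The comparison can be spelled out with the numbers from the excerpt. This is only a sketch of the check, with the values copied by hand from the trace above:

```shell
#!/bin/sh
# Values copied from the tcpdump excerpt above.
seg_start=1600282667    # data segment sent by gateway at 22:37:44.974253
seg_end=1600284115
ack=1600281219          # cumulative ack in the server's SACK packet
sack_left=1600282667    # SACK block reported by server
sack_right=1600284115

sack_len=$((sack_right - sack_left))
gap=$((sack_left - ack))

# The SACK block is plausible if it covers exactly a segment that was sent.
if [ "$sack_left" -eq "$seg_start" ] && [ "$sack_right" -eq "$seg_end" ]; then
    match=yes
else
    match=no
fi

echo "match=$match sack_len=$sack_len gap=$gap"
```

The block boundaries coincide with the segment sent earlier, and the gap between the cumulative ack and the left edge of the block is 1448 bytes, which is the size of one segment.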

HTH,
Thomas

* RE: Weird nat/conntrack Problem with PASV FTP upload
@ 2008-06-09 12:35 Thomas Bätzler
  2008-06-09 12:53 ` Jozsef Kadlecsik
  2008-06-10  8:28 ` Jozsef Kadlecsik
  0 siblings, 2 replies; 21+ messages in thread
From: Thomas Bätzler @ 2008-06-09 12:35 UTC (permalink / raw)
  To: netfilter

Jozsef Kadlecsik wrote:
> Then the best were if you could capture a full TCP session by 
> tcpdump and send it so that we could replay and analyze the traffic.

I've uploaded an archive to http://baetzler.de/sandbox/dump.tar.bz2.
There is a complete tcp session of a file upload and a second dump
that contains a segment from that connection that was IMHO erroneously
logged/dropped by a rule that filters by state INVALID in the
PREROUTING chain of the mangle table.

Initiator is my NAT box, target is the FTP server.
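
A capture like this can be taken roughly as follows. This is a sketch, not the exact commands used for the archive: the interface name, server name and file name are placeholders, and `run` is a hypothetical dry-run helper so the commands can be shown without root.

```shell
#!/bin/sh
# Dry-run by default: prints the commands; set RUN=1 (as root) to execute them.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi; }

IFACE=eth0                # assumed external interface of the NAT box
SERVER=ftp.example.net    # hypothetical name for the FTP server
OUT=upload.pcap           # hypothetical capture file name

# Capture whole packets (-s 0) without name resolution (-n) into a file.
# With RUN=1 this keeps running until interrupted with Ctrl-C.
run tcpdump -n -i "$IFACE" -s 0 -w "$OUT" host "$SERVER"

# Read the capture back with delta timestamps (-ttt) and
# absolute sequence numbers (-S), as in the traces in this thread.
run tcpdump -n -ttt -S -r "$OUT"
```

Filtering on the server's address rather than on a port keeps both the control connection and the PASV data connection, whose port is negotiated per transfer.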

TIA,
Thomas



Thread overview: 21+ messages
2008-06-05  9:02 Weird nat/conntrack Problem with PASV FTP upload Thomas Bätzler
2008-06-05 13:59 ` Patrick McHardy
2008-06-06 13:56   ` Thomas Bätzler
2008-06-06 15:02     ` Patrick McHardy
2008-06-09  9:06       ` Jan Engelhardt
2008-06-09  9:09         ` Patrick McHardy
2008-06-09 12:38           ` Jan Engelhardt
  -- strict thread matches above, loose matches on Subject: below --
2008-06-09  8:58 Thomas Bätzler
2008-06-09  9:14 ` Jozsef Kadlecsik
2008-06-09 10:36 Thomas Bätzler
2008-06-09 11:21 ` Jozsef Kadlecsik
2008-06-09 12:35 Thomas Bätzler
2008-06-09 12:53 ` Jozsef Kadlecsik
2008-06-10  8:28 ` Jozsef Kadlecsik
2008-06-11  8:50   ` Thomas Bätzler
2008-06-23 10:49     ` Jozsef Kadlecsik
2008-06-23 13:46       ` Thomas Bätzler
2008-06-25  9:47       ` Thomas Bätzler
2008-06-25  9:50         ` Jozsef Kadlecsik
2008-06-23 12:50   ` Thomas Bätzler
2008-06-23 13:15     ` Jozsef Kadlecsik
