From: John Hughes <john@atlantech.com>
To: netdev@vger.kernel.org
Subject: When a TCP segment is split up (to be sent through a TUN device with a small MTU) who should recalculate the checksum?
Date: Fri, 15 Nov 2013 09:52:12 +0100
Message-ID: <5285E0BC.4090402@atlantech.com>
I have two offices, joined by an OpenVPN tunnel. I've upgraded the
kernels in the machines running the tunnel to 3.10. All of a sudden I'm
getting horrible transmission delays between the two offices.
office1 LAN--------office1 tunnel machine
                          |
                          | openvpn tunnel
                          |
                   office2 tunnel machine------office2 LAN
What seems to be happening is that packets are arriving at the LAN
interface of the machine running the tunnel and being combined by
generic-receive-offload. These packets then have to be split up again
as they are too big for the tunnel's MTU.
But when the packets are split the TCP checksum doesn't seem to be
recalculated, so the systems on the other end of the tunnel ignore them,
forcing many retransmissions and causing the observed delays.
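For reference, the TCP checksum covers a pseudo-header (addresses, protocol, TCP length), the TCP header and the payload, so each piece of a split segment needs its own value. The following is only a sketch of that calculation in Python, not the kernel's code. The helper names (build_segment, fill_checksum, checksum_ok), the addresses and the payload are made up for illustration, and the header carries no options, unlike the real segments in the traces.

```python
import struct

def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum (RFC 1071), padding odd input with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total

def pseudo_header(src_ip: bytes, dst_ip: bytes, tcp_len: int) -> bytes:
    """IPv4 pseudo-header: source, destination, zero, protocol 6, TCP length."""
    return src_ip + dst_ip + struct.pack("!BBH", 0, 6, tcp_len)

def tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    """Checksum of a segment whose checksum field (bytes 16..17) is zero."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ~ones_complement_sum(data) & 0xFFFF

def build_segment(seq: int, payload: bytes) -> bytes:
    """Hypothetical 20-byte TCP header (ACK flag, no options) plus payload."""
    header = struct.pack("!HHIIHHHH", 33232, 22, seq, 2233,
                         (5 << 12) | 0x10, 148, 0, 0)
    return header + payload

def fill_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bytes:
    """Return the segment with its checksum field filled in."""
    csum = tcp_checksum(src_ip, dst_ip, segment)
    return segment[:16] + struct.pack("!H", csum) + segment[18:]

def checksum_ok(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bool:
    """A receiver sums everything, checksum included, and expects 0xFFFF."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ones_complement_sum(data) == 0xFFFF

# Illustrative documentation addresses and payload, not taken from the traces.
SRC = bytes([192, 0, 2, 1])
DST = bytes([198, 51, 100, 1])
payload = bytes(i % 251 for i in range(6770))

# Split the 6770-byte payload into 1354-byte pieces, each with its own checksum.
pieces = [
    fill_checksum(SRC, DST, build_segment(3073 + off, payload[off:off + 1354]))
    for off in range(0, len(payload), 1354)
]
```

Every piece has a different sequence number and payload, so a single checksum value copied across all of them cannot be expected to verify.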
For example, here is a large packet coming in on the NIC of the machine
running the tunnel, followed by a smaller packet ("caronia" is on the
office 1 LAN, "olympic" is on the office 2 LAN):
11:59:23.020426 IP caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], seq 3073:9843, ack 2233, win 148, options [nop,nop,TS val 215919291 ecr 1199882508], length 6770
11:59:23.041072 IP caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], seq 9843:11197, ack 2233, win 148, options [nop,nop,TS val 215919297 ecr 1199882508], length 1354
Then the packet gets sent out on the tunnel as 5 smaller packets:
11:59:23.020449 IP caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], seq 3073:4427, ack 2233, win 148, options [nop,nop,TS val 215919291 ecr 1199882508], length 1354
11:59:23.020534 IP caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], seq 4427:5781, ack 2233, win 148, options [nop,nop,TS val 215919291 ecr 1199882508], length 1354
11:59:23.020536 IP caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], seq 5781:7135, ack 2233, win 148, options [nop,nop,TS val 215919291 ecr 1199882508], length 1354
11:59:23.020539 IP caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], seq 7135:8489, ack 2233, win 148, options [nop,nop,TS val 215919291 ecr 1199882508], length 1354
11:59:23.020543 IP caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], seq 8489:9843, ack 2233, win 148, options [nop,nop,TS val 215919291 ecr 1199882508], length 1354
11:59:23.041086 IP caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], seq 9843:11197, ack 2233, win 148, options [nop,nop,TS val 215919297 ecr 1199882508], length 1354
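The arithmetic of the split can be checked with a few lines of Python. This is only a sketch; split_seq_range is a hypothetical helper, and the 1354-byte piece size is simply read off the trace above.

```python
from typing import List, Tuple

def split_seq_range(start: int, end: int, size: int) -> List[Tuple[int, int]]:
    """Cut the sequence range [start, end) into consecutive pieces of at most `size` bytes."""
    return [(s, min(s + size, end)) for s in range(start, end, size)]

# The large segment seen on the LAN side, cut into 1354-byte pieces.
pieces = split_seq_range(3073, 9843, 1354)
for first, last in pieces:
    print(f"seq {first}:{last}, length {last - first}")
```

The five ranges it prints are the same ones that appear on the tunnel interface, 3073:4427 through 8489:9843.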
And this is what the receiving system sees:
11:59:23.025658 IP (tos 0x8, ttl 62, id 42831, offset 0, flags [DF], proto TCP (6), length 1406)
caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], cksum 0x1003 (incorrect -> 0xb1b9), seq 3073:4427, ack 2233, win 148, options [nop,nop,TS val 215919291 ecr 1199882508], length 1354
11:59:23.025907 IP (tos 0x8, ttl 62, id 42832, offset 0, flags [DF], proto TCP (6), length 1406)
caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], cksum 0x1003 (incorrect -> 0x871c), seq 4427:5781, ack 2233, win 148, options [nop,nop,TS val 215919291 ecr 1199882508], length 1354
11:59:23.025990 IP (tos 0x8, ttl 62, id 42833, offset 0, flags [DF], proto TCP (6), length 1406)
caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], cksum 0x1003 (incorrect -> 0x97dd), seq 5781:7135, ack 2233, win 148, options [nop,nop,TS val 215919291 ecr 1199882508], length 1354
11:59:23.026183 IP (tos 0x8, ttl 62, id 42834, offset 0, flags [DF], proto TCP (6), length 1406)
caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], cksum 0x1003 (incorrect -> 0x9961), seq 7135:8489, ack 2233, win 148, options [nop,nop,TS val 215919291 ecr 1199882508], length 1354
11:59:23.026231 IP (tos 0x8, ttl 62, id 42835, offset 0, flags [DF], proto TCP (6), length 1406)
caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], cksum 0x1003 (incorrect -> 0x6a2a), seq 8489:9843, ack 2233, win 148, options [nop,nop,TS val 215919291 ecr 1199882508], length 1354
11:59:23.046163 IP (tos 0x8, ttl 62, id 42836, offset 0, flags [DF], proto TCP (6), length 1406)
caronia.CalvaEDI.COM.33232 > olympic.calvaedi.com.ssh: Flags [.], cksum 0xd237 (correct), seq 9843:11197, ack 2233, win 148, options [nop,nop,TS val 215919297 ecr 1199882508], length 1354
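To pick the bad segments out of a longer capture, something like the following could be run over the verbose tcpdump output. It is a rough sketch: the regular expression is an assumption based only on the line format shown above, and the sample lines are abbreviated copies of three of those lines.

```python
import re

# Matches e.g. "cksum 0x1003 (incorrect -> 0xb1b9), seq 3073:4427"
CKSUM_RE = re.compile(
    r"cksum (0x[0-9a-f]+) \((correct|incorrect -> (0x[0-9a-f]+))\), seq (\d+):(\d+)"
)

def bad_checksums(lines):
    """Yield (seq_start, seq_end, seen, expected) for segments flagged as incorrect."""
    for line in lines:
        m = CKSUM_RE.search(line)
        if m and m.group(2).startswith("incorrect"):
            yield int(m.group(4)), int(m.group(5)), m.group(1), m.group(3)

# Abbreviated lines from the receiving system's trace.
sample = [
    "... Flags [.], cksum 0x1003 (incorrect -> 0xb1b9), seq 3073:4427, ack 2233, ...",
    "... Flags [.], cksum 0x1003 (incorrect -> 0x871c), seq 4427:5781, ack 2233, ...",
    "... Flags [.], cksum 0xd237 (correct), seq 9843:11197, ack 2233, ...",
]
flagged = list(bad_checksums(sample))
```

In the trace above all five split segments carry the same value, 0x1003, while the expected values differ for each one.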
The receiving system is, of course, unhappy about that and complains
that it hasn't got 3073:9843:
11:59:23.045040 IP olympic.calvaedi.com.ssh > caronia.CalvaEDI.COM.33232: Flags [.], ack 3073, win 1933, options [nop,nop,TS val 1199882514 ecr 215919290,nop,nop,sack 1 {9843:11197}], length 0
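The missing range follows directly from the cumulative ack and the SACK block in that packet. A small sketch of the reasoning (missing_ranges is a hypothetical helper, not part of any tool):

```python
from typing import List, Tuple

def missing_ranges(ack: int, sack_blocks: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the gaps between the cumulative ack and the SACKed blocks."""
    gaps = []
    cursor = ack  # everything below this has been received
    for start, end in sorted(sack_blocks):
        if start > cursor:
            gaps.append((cursor, start))  # bytes the receiver is still waiting for
        cursor = max(cursor, end)
    return gaps

# Values from the ACK above: ack 3073, sack 1 {9843:11197}
gaps = missing_ranges(3073, [(9843, 11197)])
```

With the values from the ACK, the single gap is 3073:9843, which is exactly the segment that was split.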
So, when the 6770-byte segment is split up into five 1354-byte segments,
who is supposed to recalculate the checksums?
(This is Debian bug 729567).
--
John Hughes, SARL Atlantic Technologies.