* strange routing issue--packets stop getting forwarded for a live connection
From: Corey Hickey @ 2011-08-21 2:15 UTC (permalink / raw)
To: Linux Netdev List
Hi,
Please forgive me for asking a user question on a dev list; does the
linux-net list no longer exist? Majordomo wouldn't subscribe me and I
see no recent history in the archives. If there's a better place for
this question, please tell me. Anyway:
I have a strange issue where, reliably, certain conditions cause my
Linux router to stop forwarding packets for a connection.
----------------------------------------------------------------------
This is my setup:
client      --> linux router         --> vpn --> work desktop
198.18.0.3      198.18.0.1 (eth0)                192.168.10.88
                192.168.6.230 (tun0)
All hosts are running Debian Sid with the stock Debian 3.0.0-1-amd64
kernel. tun0 is set up by openconnect (open-source client for Cisco
AnyConnect), which has been historically reliable for me.
I noticed this problem happening when I replaced the router with a new
host. The old host was 32-bit, running Linux 2.6.38, and configured
identically (I think) with respect to routing and iptables. I didn't
have a problem then.
----------------------------------------------------------------------
I have seen this problem happen with http, sometimes, but the easiest
way to reproduce the issue every time is to use SSH with X11 forwarding
(I have no idea why). I can SSH, through my router and VPN connection,
to my desktop at work. I can log in, poke around, do whatever; as soon
as I run some particular X11 programs, the connection hangs. xlogo and
xeyes are fine, but rxvt and jconsole are not.
So, my baseline test is to run rxvt directly. This command always hangs:
$ ssh -X chickey@192.168.10.88 rxvt
I have run simultaneous tcpdumps on the router: one on eth0 and the
other on tun0. I see the tcp connection and ssh sessions get set up,
then many encrypted packets go back and forth. At a certain, reliably
reproducible point, a 1368 byte packet comes in on eth0 and does not
leave tun0; the retransmissions do not get forwarded either.
I have not been able to figure out the cause of this. Here's what I have
investigated:
1. Number of packets on the connection; doesn't seem to matter, because
I can use SSH for other purposes just fine.
2. Transmission rate; doesn't seem to matter, because I can do
$ ssh -X chickey@192.168.10.88 cat /dev/zero > /dev/null
3. MTU size; 1500 on eth0 and 1406 on tun0. Bigger packets have been
transferred fine.
4. VPN client bug; maybe, but I don't think so yet. The same command
works if I SSH from the router itself. This is fine:
$ ssh -X 198.18.0.1 "ssh -X chickey@192.168.10.88 rxvt"
5. Connection tracking issue; conntrack shows no change in state for the
connection when it hangs.
6. Some firewall rule. Stripping down my iptables setup to the minimum
does not help. I have also removed all qdiscs.
----------------------------------------------------------------------
Can anybody please suggest something else I should try here? This is
very confusing to me.
I am attaching a tarball of tcpdumps and other pertinent information.
Thank you,
Corey
[-- Attachment #2: problem.tar.bz2 --]
[-- Type: application/octet-stream, Size: 23175 bytes --]
* Re: strange routing issue--packets stop getting forwarded for a live connection
From: Julian Anastasov @ 2011-08-21 6:35 UTC (permalink / raw)
To: Corey Hickey; +Cc: Linux Netdev List
Hello,
On Sat, 20 Aug 2011, Corey Hickey wrote:
> Hi,
>
> Please forgive me for asking a user question on a dev list; does the
> linux-net list no longer exist? Majordomo wouldn't subscribe me and I
> see no recent history in the archives. If there's a better place for
> this question, please tell me. Anyway:
>
> I have a strange issue where, reliably, certain conditions cause my
> Linux router to stop forwarding packets for a connection.
>
> ----------------------------------------------------------------------
>
> This is my setup:
>
> client      --> linux router         --> vpn --> work desktop
> 198.18.0.3      198.18.0.1 (eth0)                192.168.10.88
>                 192.168.6.230 (tun0)
>
> All hosts are running Debian Sid with the stock Debian 3.0.0-1-amd64
> kernel. tun0 is set up by openconnect (open-source client for Cisco
> AnyConnect), which has been historically reliable for me.
>
> I noticed this problem happening when I replaced the router with a new
> host. The old host was 32-bit, running Linux 2.6.38, and configured
> identically (I think) with respect to routing and iptables. I didn't
> have a problem then.
>
> ----------------------------------------------------------------------
>
> I have seen this problem happen with http, sometimes, but the easiest
> way to reproduce the issue every time is to use SSH with X11 forwarding
> (I have no idea why). I can SSH, through my router and VPN connection,
> to my desktop at work. I can log in, poke around, do whatever; as soon
> as I run some particular X11 programs, the connection hangs. xlogo and
> xeyes are fine, but rxvt and jconsole are not.
>
> So, my baseline test is to run rxvt directly. This command always hangs:
>
> $ ssh -X chickey@192.168.10.88 rxvt
>
> I have run simultaneous tcpdumps on the router: one on eth0 and the
> other on tun0. I see the tcp connection and ssh sessions get set up,
> then many encrypted packets go back and forth. At a certain, reliably
> reproducible point, a 1368 byte packet comes in on eth0 and does not
> leave tun0; the retransmissions do not get forwarded either.
>
> I have not been able to figure out the cause of this. Here's what I have
> investigated:
>
> 1. Number of packets on the connection; doesn't seem to matter, because
> I can use SSH for other purposes just fine.
>
> 2. Transmission rate; doesn't seem to matter, because I can do
> $ ssh -X chickey@192.168.10.88 cat /dev/zero > /dev/null
>
> 3. MTU size; 1500 on eth0 and 1406 on tun0. Bigger packets have been
> transferred fine.
Lower MTU, it can be PMTUD problem. At 04:50:24.112658
I see 7801:9169 is 1420 bytes and no ICMP FRAG NEEDED is generated.
May be these two regressions explain it:
http://marc.info/?l=linux-netdev&m=131342172722536&w=2
There are 2 fixes you can try or more recent kernel
tree, for example 3.1-rc2 has the fixes.
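
The size check behind this can be sketched in Python. This is only an illustration, not the kernel's forwarding logic. The header sizes are assumptions (a 20-byte IPv4 header without options and a 32-byte TCP header, i.e. 20 bytes plus a 12-byte timestamp option), chosen because they make the 1368-byte segment add up to the 1420 bytes seen in the capture. The DF bit is assumed to be set, and the function names are hypothetical.

```python
# Assumed header sizes; not taken from the capture itself.
IPV4_HEADER = 20  # IPv4 header without options
TCP_HEADER = 32   # 20-byte base header + 12 bytes of timestamp option


def ip_packet_size(seq_start: int, seq_end: int) -> int:
    """Total IP packet size for a TCP segment covering seq_start:seq_end."""
    payload = seq_end - seq_start
    return payload + IPV4_HEADER + TCP_HEADER


def router_must_send_frag_needed(packet_size: int, egress_mtu: int,
                                 df_set: bool = True) -> bool:
    """With DF set, a packet larger than the egress MTU cannot be
    fragmented, so the router is expected to drop it and answer with
    ICMP "fragmentation needed"."""
    return df_set and packet_size > egress_mtu


size = ip_packet_size(7801, 9169)              # the segment from the capture
print(size)                                    # 1368 payload + 52 header bytes
print(router_must_send_frag_needed(size, 1406))  # tun0 MTU
print(router_must_send_frag_needed(size, 1500))  # eth0 MTU
```

If the router drops such a packet without sending the ICMP error, the sender keeps retransmitting the same oversized segment, which would match the hang described above.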
Regards
--
Julian Anastasov <ja@ssi.bg>
* Re: strange routing issue--packets stop getting forwarded for a live connection
From: Corey Hickey @ 2011-08-21 8:12 UTC (permalink / raw)
To: Julian Anastasov; +Cc: Linux Netdev List
On 2011-08-20 23:35, Julian Anastasov wrote:
>> I have a strange issue where, reliably, certain conditions cause my
>> Linux router to stop forwarding packets for a connection.
[...]
>> 3. MTU size; 1500 on eth0 and 1406 on tun0. Bigger packets have been
>> transferred fine.
>
> Lower MTU, it can be PMTUD problem. At 04:50:24.112658
> I see 7801:9169 is 1420 bytes and no ICMP FRAG NEEDED is generated.
> May be these two regressions explain it:
>
> http://marc.info/?l=linux-netdev&m=131342172722536&w=2
>
> There are 2 fixes you can try or more recent kernel
> tree, for example 3.1-rc2 has the fixes.
Many thanks for your reply--it looks like you're on to something. You
didn't specify which interface to lower the MTU on, so I tried them each
in turn, and found that lowering the MTU on the client machine to 1406
(matching tun0 on the router) did indeed solve the problem. That makes
sense in retrospect.
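
A rough sketch of why the workaround helps, in Python. The header sizes (20-byte IPv4 and 20-byte base TCP header, no options) are assumptions, the helper names are hypothetical, and the list of hops is simplified to the two MTUs mentioned in this thread.

```python
# Assumed header sizes for the simplest case (no IP or TCP options).
IPV4_HEADER = 20
TCP_BASE_HEADER = 20


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one packet on a link with this MTU."""
    return mtu - IPV4_HEADER - TCP_BASE_HEADER


def path_mtu(hop_mtus: list[int]) -> int:
    """The usable MTU of a path is the smallest MTU along it."""
    return min(hop_mtus)


hops = [1500, 1406]  # eth0, then tun0 on the router

# Client left at MTU 1500: it may send packets larger than tun0 accepts
# and depends on the router's ICMP error to learn the smaller path MTU.
print(mss_for_mtu(1500), 1500 > path_mtu(hops))

# Client lowered to MTU 1406: nothing it sends exceeds the path MTU,
# so the missing ICMP error no longer matters.
print(mss_for_mtu(1406), 1406 > path_mtu(hops))
```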
It's a bit late at night for me to be patching my kernel, but I'll see
if I can do it tomorrow.
Thanks again, and I'll let you know how it turns out.
-Corey
* Re: strange routing issue--packets stop getting forwarded for a live connection
From: Julian Anastasov @ 2011-08-21 9:02 UTC (permalink / raw)
To: Corey Hickey; +Cc: Linux Netdev List
Hello,
On Sun, 21 Aug 2011, Corey Hickey wrote:
> >> 3. MTU size; 1500 on eth0 and 1406 on tun0. Bigger packets have been
> >> transferred fine.
> >
> > Lower MTU, it can be PMTUD problem. At 04:50:24.112658
> > I see 7801:9169 is 1420 bytes and no ICMP FRAG NEEDED is generated.
> > May be these two regressions explain it:
> >
> > http://marc.info/?l=linux-netdev&m=131342172722536&w=2
> >
> > There are 2 fixes you can try or more recent kernel
> > tree, for example 3.1-rc2 has the fixes.
>
> Many thanks for your reply--it looks like you're on to something. You
> didn't specify which interface to lower the MTU on, so I tried them each
> in turn, and found that lowering the MTU on the client machine to 1406
> (matching tun0 on the router) did indeed solve the problem. That makes
> sense in retrospect.
I just wanted to note the difference in MTUs
as a possible cause that triggers the problem. And after
your confirmation I think the new/patched kernel should work
without playing with MTUs.
> It's a bit late at night for me to be patching my kernel, but I'll see
> if I can do it tomorrow.
>
> Thanks again, and I'll let you know how it turns out.
>
> -Corey
Regards
--
Julian Anastasov <ja@ssi.bg>
* Re: strange routing issue--packets stop getting forwarded for a live connection
From: Corey Hickey @ 2011-08-26 3:50 UTC (permalink / raw)
To: Julian Anastasov; +Cc: Linux Netdev List
On 2011-08-21 02:02, Julian Anastasov wrote:
>>>> 3. MTU size; 1500 on eth0 and 1406 on tun0. Bigger packets have been
>>>> transferred fine.
>>>
>>> Lower MTU, it can be PMTUD problem. At 04:50:24.112658
>>> I see 7801:9169 is 1420 bytes and no ICMP FRAG NEEDED is generated.
>>> May be these two regressions explain it:
>>>
>>> http://marc.info/?l=linux-netdev&m=131342172722536&w=2
>>>
>>> There are 2 fixes you can try or more recent kernel
>>> tree, for example 3.1-rc2 has the fixes.
>>
>> Many thanks for your reply--it looks like you're on to something. You
>> didn't specify which interface to lower the MTU on, so I tried them each
>> in turn, and found that lowering the MTU on the client machine to 1406
>> (matching tun0 on the router) did indeed solve the problem. That makes
>> sense in retrospect.
>
> I just wanted to note the difference in MTUs
> as a possible cause that triggers the problem. And after
> your confirmation I think the new/patched kernel should work
> without playing with MTUs.
I finally got a chance to test this, and the patched kernel works fine.
Thank you!
-Corey