* Re: [LARTC] Question re: multi-homed access
2002-03-21 17:50 [LARTC] Question re: multi-homed access Thomas Vander Stichele
@ 2002-03-21 21:35 ` Julian Anastasov
2002-03-21 23:02 ` Thomas Vander Stichele
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Julian Anastasov @ 2002-03-21 21:35 UTC
To: lartc
Hello,
On Thu, 21 Mar 2002, Thomas Vander Stichele wrote:
> Then I started writing the firewall script.
> I start by applying the iptables rules for statefulness (are these
> necessary ? exactly what do they do). I removed the interface
They are independent from the routing stuff.
> configuration commands, since that is handled by redhat.
> Then I remove the default route, and add the three tables which together
> implement the load balancing.
>
> For outgoing connections, this mostly works : I can tell from traceroutes
> that I get alternating outgoing gateways.
>
> Now for the problems I'm having :
>
> * before, when only using the ADSL as gateway, I could ssh to other boxes
> on the internet without problems. With the new setup, when I ssh to one
> of them (and the route goes over the second interface), the connection
> hangs at the moment ssh starts up the X port forwarding. I suppose this
Some users report this problem with openssh, where a TOS
change in the established state causes a different route to be
used (selected by the multipath scheduler). The new cached
route differs in the TOS field and so can lead to a new gw/outdev.
In short, the problem is that the multipath scheduler, when used
for NAT, is consulted for all packets, not only at connection setup.
The details are explained in the docs.
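The effect described above can be pictured with a toy model. This is only a sketch under stated assumptions, not kernel code: the class name, the round-robin stand-in for the multipath scheduler, and the addresses are all hypothetical. It shows a route cache keyed on (src, dst, tos), so a TOS change inside one connection misses the cache and asks the scheduler again.

```python
import itertools


class ToyRouteCache:
    """Toy route cache keyed by (src, dst, tos); not real kernel behaviour."""

    def __init__(self, nexthops):
        # Assumption: plain round-robin stands in for the multipath scheduler.
        self._scheduler = itertools.cycle(nexthops)
        self._cache = {}

    def lookup(self, src, dst, tos):
        key = (src, dst, tos)
        if key not in self._cache:
            # Cache miss: the scheduler picks the next outgoing device.
            self._cache[key] = next(self._scheduler)
        return self._cache[key]


cache = ToyRouteCache(["eth1", "eth2"])
# Connection setup, default TOS (hypothetical addresses).
first = cache.lookup("192.168.254.10", "203.0.113.9", 0x00)
# Same connection later, after the application changed the TOS.
later = cache.lookup("192.168.254.10", "203.0.113.9", 0x10)
print(first, later)  # eth1 eth2
```

In this model the NAT mapping would have been created for the first device, while later packets of the same connection leave through the second one.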
> is because (IIRC) ssh tries to set up a connection from that box to my
> current machine, which somehow fails. If the route happens to go over the
> first interface, everything is ok.
>
> * When trying to access the firewall from the outside, connections only
> get established when coming in over the ADSL interface. When coming in
> over the cable interface, the connection hangs, indicating the route back
> is failing. This seems to me like another symptom of the same problem as
> the other.
Maybe. To summarize, the rule is: "the plain kernel
_only seems_ to work correctly for setups with NAT and multipath
routes".
> So here is a set of questions ;) You knew this was coming ...
>
> a) nano.txt only mentions outgoing connections. Does this document apply
> to incoming connections as well or not ? Should it work as outlined there,
Yes, the routing is considered symmetric.
> should I infer different iptables and ip rules to handle incoming traffic,
> or does it work in another way entirely ?
No, it will work using only the routing rules.
> b) Since I don't have a default gateway and the gateway alternation works
> on outgoing routes, I suppose that my gateway setup is correct. So the
> fact that it cannot make incoming connections over eth2 is not due to eth1
> being the default gateway as was the case before.
> But what else could cause this behaviour ? Is it possible I might have my
> SNAT/MASQUERADE set up wrong to get this effect ?
>
> c) do I need to apply julian's patches in order for this basic setup
> (incoming traffic on both interfaces) to work ? It is my understanding
You should apply the patches if you expect the router
to NAT packets correctly when using a multipath route. I hope
the ssh problem will disappear, because with the patches the
multipath scheduler selects a new route only for the first packet
of each connection; established connections are considered bound to
the masquerade address, for which we usually don't have a multipath route.
> from browsing through the archive that, for this basic functionality, it's
> not necessary. I will of course apply these patches later on to have
> gateway failure detection, but my question is if applying these patches
> now or not will have any effect on my current setup.
I hope there will be an effect.
> Here is a list of output of various commands :
They look good.
> (I have to note here that using redhat's network configuration initialized
> the 192.168.254.0/24 to be "scope link" only, so no proto kernel and no
> src address. I thought that this might have been wrong so I changed it
> manually but it had no effect as far as I could tell)
In any case, make sure the preferred source IP is valid,
or is autoselected to be valid.
The routing rules look good; I didn't check the iptables rules.
> I hope this is enough information to help me debug the situation. Any
> help is MUCH appreciated.
>
> Thanks in advance,
> Thomas
Regards
--
Julian Anastasov <ja@ssi.bg>
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/
* Re: [LARTC] Question re: multi-homed access
From: Thomas Vander Stichele @ 2002-03-21 23:02 UTC
To: lartc
Hi,
thanks for your quick response ;)
> > Then I started writing the firewall script.
> > I start by applying the iptables rules for statefulness (are these
> > necessary ? exactly what do they do). I removed the interface
>
> They are independent from the routing stuff.
Ok, just out of curiosity, should I reread the docs to find out what they
are good for ? It was not really clear to me on the first couple of reads.
> > * before, when only using the ADSL as gateway, I could ssh to other boxes
> > on the internet without problems. With the new setup, when I ssh to one
> > of them (and the route goes over the second interface), the connection
> > hangs at the moment ssh starts up the X port forwarding. I suppose this
>
> Some users report this problem with openssh, where a TOS
> change in the established state causes a different route to be
> used (selected by the multipath scheduler). The new cached
> route differs in the TOS field and so can lead to a new gw/outdev.
> In short, the problem is that the multipath scheduler, when used
> for NAT, is consulted for all packets, not only at connection setup.
> The details are explained in the docs.
OK, I'll read them again. But basically you're saying it's NOT because
the X port forwarding opens up a second connection ? Am I wrong in my
assumption about how ssh X port forwarding works, or is this not an issue
here ?
> > * When trying to access the firewall from the outside, connections only
> > get established when coming in over the ADSL interface. When coming in
> > over the cable interface, the connection hangs, indicating the route back
> > is failing. This seems to me like another symptom of the same problem as
> > the other.
>
> Maybe. To summarize, the rule is: "the plain kernel
> _only seems_ to work correctly for setups with NAT and multipath
> routes".
So you're saying it looks like it works right, but it doesn't ? Hm, ok.
So are the times when it doesn't work totally random, or is there some
logic in it ?
> You should apply the patches if you expect the router
> to NAT packets correctly when using a multipath route. I hope
> the ssh problem will disappear, because with the patches the
> multipath scheduler selects a new route only for the first packet
> of each connection; established connections are considered bound to
> the masquerade address, for which we usually don't have a multipath route.
Meanwhile, I recompiled my kernel with the patches and at first glance I
get the same behaviour. I will look into it some more tomorrow when I'm
back at work. What I wanted to ask re: masquerading is, if I add an
iptables rule to do masquerading without specifying a device, is that ok ?
or should I have one for each device specifically ?
The nano doc says that for fixed IP addresses you need an SNAT rule, while
for PPPoE devices you need MASQUERADE. Which should I be using for a DHCP
device ?
> > from browsing through the archive that, for this basic functionality, it's
> > not necessary. I will of course apply these patches later on to have
> > gateway failure detection, but my question is if applying these patches
> > now or not will have any effect on my current setup.
>
> I hope there will be an effect.
On first inspection, no. Is there some way I can debug incoming packets ?
What else can I give as feedback ?
Thanks again for your help, it is much appreciated.
Thomas
--
The Dave/Dina Project : future TV today ! - http://davedina.apestaart.org/
<-*- -*->
I'm alive.
I can tell because of the pain.
<-*- thomas@apestaart.org -*->
URGent, the best radio on the Internet - 24/7 ! - http://urgent.rug.ac.be/
* Re: [LARTC] Question re: multi-homed access
From: Julian Anastasov @ 2002-03-21 23:55 UTC
To: lartc
Hello,
On Fri, 22 Mar 2002, Thomas Vander Stichele wrote:
> > They are independent from the routing stuff.
>
> Ok, just out of curiosity, should I reread the docs to find out what they
:)
> are good for ? It was not really clear to me on the first couple of reads.
> OK, I'll read them again. But basically you're saying it's NOT because
> the X port forwarding opens up a second connection ? Am I wrong in my
> assumption about how ssh X port forwarding works, or is this not an issue
> here ?
No, there are issues where packets from one connection
can use different paths and that causes problems on NAT.
> > Maybe. To summarize, the rule is: "the plain kernel
> > _only seems_ to work correctly for setups with NAT and multipath
> > routes".
>
> So you're saying it looks like it works right, but it doesn't ? Hm, ok.
> So are the times when it doesn't work totally random, or is there some
> logic in it ?
Nothing is random :) The problem comes when the cached
route entry expires (/proc/sys/net/ipv4/route/gc_timeout) or the
cache is flushed as a result of a user command, such as adding/deleting
an IP address or flushing the routes explicitly with "ip route flush cache".
The result is clear: the routing cache forgets the path and the NAT
code does not care.
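A small simulation may make the expiry and flush cases concrete. It is a sketch only: the class, the idle-timeout rule, the round-robin scheduler and the numbers are assumptions chosen for illustration, not how the kernel garbage collector really works.

```python
import itertools


class ExpiringRouteCache:
    """Toy cache: entries idle longer than `timeout` seconds are forgotten."""

    def __init__(self, nexthops, timeout):
        # Assumption: round-robin stands in for the multipath scheduler.
        self._scheduler = itertools.cycle(nexthops)
        self._timeout = timeout
        self._entries = {}  # dst -> (nexthop, time of last use)

    def lookup(self, dst, now):
        nexthop, last_used = self._entries.get(dst, (None, None))
        if nexthop is None or now - last_used > self._timeout:
            # Entry missing or expired: the path is chosen again.
            nexthop = next(self._scheduler)
        self._entries[dst] = (nexthop, now)
        return nexthop

    def flush(self):
        # Models an explicit cache flush or an address change.
        self._entries.clear()


cache = ExpiringRouteCache(["eth1", "eth2"], timeout=300)
print(cache.lookup("203.0.113.9", now=0))    # eth1: NAT mapping is created here
print(cache.lookup("203.0.113.9", now=100))  # eth1: entry still cached
print(cache.lookup("203.0.113.9", now=500))  # eth2: entry expired, path changed
cache.flush()
print(cache.lookup("203.0.113.9", now=510))  # eth1: flushed, path changed again
```

The NAT state in this picture still belongs to the device chosen at time 0, which is why a changed path breaks the connection.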
> > You should apply the patches if you expect the router
> > to NAT packets correctly when using a multipath route. I hope
> > the ssh problem will disappear, because with the patches the
> > multipath scheduler selects a new route only for the first packet
> > of each connection; established connections are considered bound to
> > the masquerade address, for which we usually don't have a multipath route.
>
> Meanwhile, I recompiled my kernel with the patches and at first glance I
> get the same behaviour. I will look into it some more tomorrow when I'm
> back at work. What I wanted to ask re: masquerading is, if I add an
> iptables rule to do masquerading without specifying a device, is that ok ?
> or should I have one for each device specifically ?
If you use the SNAT target then you must have one rule for
each device. The MASQUERADE target takes the masquerade address
from the resolved route, so specifying the output device is optional.
> The nano doc says that for fixed IP addresses you need an SNAT rule, while
> for PPPoE devices you need MASQUERADE. Which should I be using for a DHCP
> device ?
Not a must. The only differences between MASQUERADE and SNAT are:
- SNAT specifies the masquerade address, MASQUERADE uses the preferred
src ip from the route
- For MASQUERADE all masquerade connections are removed on device
down event (sometimes useful, sometimes not)
So, nothing stops you from using MASQUERADE for static IPs
or SNAT for dynamic IPs; it depends on when you create the NAT rules.
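As a rough illustration of the difference, the helper below only builds and prints iptables command lines; it does not run them. The function names and the addresses (taken from documentation ranges) are hypothetical, and the eth1/eth2 assignment is an assumption about this setup.

```python
def snat_rule(out_dev, source_ip):
    """One SNAT rule per device: the source address is given explicitly."""
    return ["iptables", "-t", "nat", "-A", "POSTROUTING",
            "-o", out_dev, "-j", "SNAT", "--to-source", source_ip]


def masquerade_rule(out_dev=None):
    """MASQUERADE takes the address from the route, so -o is optional."""
    rule = ["iptables", "-t", "nat", "-A", "POSTROUTING"]
    if out_dev is not None:
        rule += ["-o", out_dev]
    return rule + ["-j", "MASQUERADE"]


rules = [
    snat_rule("eth1", "192.0.2.10"),     # hypothetical ADSL address
    snat_rule("eth2", "198.51.100.20"),  # hypothetical cable address
    masquerade_rule(),                   # single rule, no device given
]
for rule in rules:
    print(" ".join(rule))
```

With SNAT the rules have to be regenerated when an address changes, which is the "when you create the NAT rules" point above.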
> > > from browsing through the archive that, for this basic functionality, it's
> > > not necessary. I will of course apply these patches later on to have
> > > gateway failure detection, but my question is if applying these patches
> > > now or not will have any effect on my current setup.
> >
> > I hope there will be an effect.
>
> On first inspection, no. Is there some way I can debug incoming packets ?
> What else can I give as feedback ?
You have to be able to analyze with tcpdump what is going on.
> Thomas
Regards
--
Julian Anastasov <ja@ssi.bg>
* Re: [LARTC] Question re: multi-homed access
From: Thomas Vander Stichele @ 2002-03-22 9:58 UTC
To: lartc
Hello Julian,
> > So you're saying it looks like it works right, but it doesn't ? Hm, ok.
> > So are the times when it doesn't work totally random, or is there some
> > logic in it ?
>
> Nothing is random :) The problem comes when the cached
> route entry expires (/proc/sys/net/ipv4/route/gc_timeout) or the
> cache is flushed as a result of a user command, such as adding/deleting
> an IP address or flushing the routes explicitly with "ip route flush cache".
> The result is clear: the routing cache forgets the path and the NAT
> code does not care.
Hm, OK. I checked the value on my system and it's set to 300, which IIRC
would mean 5 minutes (it's in seconds, right ?). The ssh connection set
up takes only about 2 seconds. Isn't it highly unlikely that the cache
would have a timeout in that interval, *reliably*, every time ?
> > > I hope there will be an effect.
> >
> > On first inspection, no. Is there some way I can debug incoming packets ?
> > What else can I give as feedback ?
>
> You have to be able to analyze with tcpdump what is going on.
OK, I used both iptraf and tcpdump to check what is going on, and I find
something very odd. Traffic does indeed go out on two interfaces, as
traceroutes and output of last on the boxes I ssh to show. But using both
tcpdump and iptraf, I only see non-tcp data going over eth2. None of the
tcp-connections show up on eth2. They do on eth1.
The only difference I can see is that ifconfig shows eth2 to be
"UP BROADCAST NOTRAILERS RUNNING", while eth1 is "UP BROADCAST RUNNING
MULTICAST". Other than that, the total received traffic is roughly the
same on both interfaces (which is weird, since I still cannot connect from
the outside over eth2), while the transmitted data for eth1 is 10 times
larger than for eth2. I'm not sure if this is because somehow traffic
going out over eth2 is not "registered" right (as seen by tcpdump and
iptraf) or because of something else. I'm not that experienced using
tcpdump, so I can't tell. Also, I don't know enough about the lesser-used
output from ifconfig since I never needed it before ;)
Is there something that could explain this weird behaviour ? Or are there
some hints or guides to use tcpdump correctly in debugging this particular
problem ?
All of this, btw, is done with my patched kernel including your patches
(just to make sure ;) )
Thanks in advance,
Thomas
* Re: [LARTC] Question re: multi-homed access
From: Julian Anastasov @ 2002-03-22 10:34 UTC
To: lartc
Hello,
On Fri, 22 Mar 2002, Thomas Vander Stichele wrote:
> Hm, OK. I checked the value on my system and it's set to 300, which IIRC
> would mean 5 minutes (it's in seconds, right ?). The ssh connection set
> up takes only about 2 seconds. Isn't it highly unlikely that the cache
> would have a timeout in that interval, *reliably*, every time ?
With the patched kernel it is 0 seconds: the route is selected
on the first packet. With the unpatched kernel the multipath route is
consulted for much longer: the whole life of the connection.
> OK, I used both iptraf and tcpdump to check what is going on, and I find
> something very odd. Traffic does indeed go out on two interfaces, as
> traceroutes and output of last on the boxes I ssh to show. But using both
> tcpdump and iptraf, I only see non-tcp data going over eth2. None of the
> tcp-connections show up on eth2. They do on eth1.
I don't remember whether you have a script that does active
gateway monitoring; the patched kernel does only passive
detection on route resolution, as explained in nano.txt. I'm not
sure whether only the first alive nexthop from the multipath route
is used in your case. Check the status of both gateways with
"ip neigh"; they should be in the reachable state. Maybe I'll soon
tune the detection to treat more states as valid. Currently, only
gateways in the reachable state are considered valid.
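The "ip neigh" check can be scripted roughly as below. This is a sketch that assumes the usual one-entry-per-line output with the state as the last word; the function names, gateway addresses and sample text are hypothetical.

```python
def neighbour_states(ip_neigh_output):
    """Map each neighbour address to its state (last word of the line)."""
    states = {}
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            states[fields[0]] = fields[-1]
    return states


def unreachable_gateways(ip_neigh_output, gateways):
    """Return the gateways that are not currently in REACHABLE state."""
    states = neighbour_states(ip_neigh_output)
    return [gw for gw in gateways if states.get(gw) != "REACHABLE"]


# Hypothetical sample in the style of "ip neigh show" output.
SAMPLE = """\
192.0.2.1 dev eth1 lladdr 00:00:5e:00:53:01 REACHABLE
198.51.100.1 dev eth2 lladdr 00:00:5e:00:53:02 STALE
"""

print(unreachable_gateways(SAMPLE, ["192.0.2.1", "198.51.100.1"]))
# ['198.51.100.1']
```

On a live router the text would come from running "ip neigh show" instead of the sample string; a gateway missing from the table is reported as not reachable too.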
> The only difference I can see is that ifconfig shows eth2 to be
> "UP BROADCAST NOTRAILERS RUNNING", while eth1 is "UP BROADCAST RUNNING
> MULTICAST". Other than that, the total received traffic is roughly the
> same on both interfaces (which is weird, since I still cannot connect from
> the outside over eth2), while the transmitted data for eth1 is 10 times
> larger than for eth2. I'm not sure if this is because somehow traffic
> going out over eth2 is not "registered" right (as seen by tcpdump and
> iptraf) or because of something else. I'm not that experienced using
> tcpdump, so I can't tell. Also, I don't know enough about the lesser-used
> output from ifconfig since I never needed it before ;)
The old 2.4 kernels don't show the IP addresses of NAT-ed
packets correctly in tcpdump (a copy-on-write problem), but the
device should be correct. I'm not sure; maybe 2.4.17 has these fixes.
> Is there something that could explain this weird behaviour ? Or are there
> some hints or guides to use tcpdump correctly in debugging this particular
> problem ?
It is enough to see the addresses. With a healthchecking
script you should not see problems.
> Thanks in advance,
> Thomas
Regards
--
Julian Anastasov <ja@ssi.bg>
* Re: [LARTC] Question re: multi-homed access
From: Thomas Vander Stichele @ 2002-03-22 17:15 UTC
To: lartc
Hi,
I feel stupid in having to admit this, but it seems my ISP is at fault.
They set up the wrong type of connection. We were supposed to have a
static IP address with all ports enabled. The installers only installed
the modem and handed me some docs, but they are for the "lite" version.
So my guess is that I'm on the standard DHCP which serves regular
customers, instead of using a static IP address (which they failed to
provide, so I have to call them for that). Also, my guess is they block
incoming traffic to ports < 1024 (as do some other cable operators).
I could get connections to work with higher port numbers on the second
interface.
So, sorry for trying your patience ;) I should've realized something was
amiss because of being on a DHCP server.
Thanks again for your kind help,
Thomas