Do you know the TCP stack? (127.x.x.x routing)

netdev.vger.kernel.org archive mirror

* Do you know the TCP stack? (127.x.x.x routing)
@ 2005-03-06  2:20 Zdenek Radouch
  2005-03-06  9:56 ` Martin Mares
  0 siblings, 1 reply; 52+ messages in thread
From: Zdenek Radouch @ 2005-03-06  2:20 UTC (permalink / raw)
  To: netdev, linux-net

How can I disable the stack processing for the 127 net?
Can someone estimate the amount of work needed to do that,
and/or point me to the relevant piece of code?

That is, I'd like to treat the 127 net the same as all other network
numbers. Since that is not the case in the current stack,
I want to remove or disable whatever processing there is.
I am using 2.4 kernel.

Here is a long version and a rationale.
I need a truly private network on a device that serves as a router
in someone else's network.  (The device itself has an internal network).
As far as I know, there is no provision for this within the existing network
numbering scheme.  Obviously, the architects of the current numbering
scheme did not think one could build a router with more than a single
card. Unfortunately, routers are being built today with intelligent
line cards, and there is nothing simpler than the IP/socket based IPC
between Ethernet-connected cards.  The problem is immediately obvious:
one can't use any legal address for the internal network, since it may
collide with an external network the device is handling.  And since
the device can be routing non-Internet addresses, the "reserved"
numbers are as unusable as the normal ones. The only solution I've seen
on routers running BSD stack is to subnet the 127 net, and use one
of the subnets for the internal network.
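
The collision argument above can be made concrete with a small check.
This is only a sketch: the routed prefixes are made up for the example,
and colliding_prefixes is a hypothetical helper, not part of any tool
mentioned in this thread.

```python
import ipaddress

def colliding_prefixes(internal, routed):
    """Return the routed prefixes that overlap the internal subnet."""
    inner = ipaddress.ip_network(internal)
    return [p for p in routed if inner.overlaps(ipaddress.ip_network(p))]

# Hypothetical prefixes the device might be asked to route for a customer.
routed = ["10.0.0.0/8", "192.168.13.0/24", "198.51.100.0/24"]

# Internal net taken from RFC 1918 space: clashes with a routed prefix.
print(colliding_prefixes("10.1.0.0/16", routed))

# Internal net taken from the 127 net: no routed prefix can overlap it,
# as long as nobody routes 127.x addresses.
print(colliding_prefixes("127.1.0.0/16", routed))
```

Any address block a customer may legitimately route can end up in the
first situation; the appeal of 127.x is that it should never do so.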

Unfortunately, this does not work with the Linux stack, because the
127 net is treated (for good reasons I suppose) as a special net.
What I need is to remove whatever special processing there is,
so that the net can be treated as any other net.  Then I could, for
example, attach 127.0.0.1/16 to the "lo" device, and 127.1.0.0/16
would be my internal net, thus keeping the standard 127.0.0.1
address for the localhost, and having a truly private internal network.

So, that's all fine, except for the fact that I am not familiar with the
Linux stack code.  I do need this done, so as a first step I'd like
to get a feeling for the scope of the required modification and
an estimated effort to do this.  As with my previous problem,
if it turns out that this is a non-trivial effort,  I will gladly arrange
a short-term contract for someone in order to be adequately
compensated for the work.

Thanks.
-Zdenek


* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-06  2:20 Zdenek Radouch
@ 2005-03-06  9:56 ` Martin Mares
  2005-03-06 17:01   ` Zdenek Radouch
  0 siblings, 1 reply; 52+ messages in thread
From: Martin Mares @ 2005-03-06  9:56 UTC (permalink / raw)
  To: Zdenek Radouch; +Cc: netdev, linux-net

Hello!

> Unfortunately, this does not work with the Linux stack, because the
> 127 net is treated (for good reasons I suppose) as a special net.

Is it really?

I've just tried

	ip addr del 127.0.0.1/8 dev lo
	ip addr add 127.0.0.1/24 dev lo

and `ping 127.1.2.3' is then happily sent along the default route.
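
Why the narrower prefix has this effect can be sketched with a toy
longest-prefix-match lookup. This is a simplification that flattens the
kernel's routing tables into one list; the route lists and the lookup
helper are assumptions made for illustration only.

```python
import ipaddress

def lookup(dst, routes):
    """Longest-prefix match: device of the most specific matching route."""
    addr = ipaddress.ip_address(dst)
    matches = [(ipaddress.ip_network(prefix), dev)
               for prefix, dev in routes
               if addr in ipaddress.ip_network(prefix)]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

default = ("0.0.0.0/0", "eth0")               # assumed default route
before = [("127.0.0.0/8", "lo"), default]     # lo configured as 127.0.0.1/8
after = [("127.0.0.0/24", "lo"), default]     # lo configured as 127.0.0.1/24

print(lookup("127.1.2.3", before))  # covered by the /8 on lo
print(lookup("127.1.2.3", after))   # only the default route matches now
print(lookup("127.0.0.1", after))   # still covered by the /24 on lo
```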

				Have a nice fortnight
-- 
Martin `MJ' Mares   <mj@ucw.cz>   http://atrey.karlin.mff.cuni.cz/~mj/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Only dead fish swim with the stream.


* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-06  9:56 ` Martin Mares
@ 2005-03-06 17:01   ` Zdenek Radouch
  2005-03-06 17:12     ` alex
  2005-03-06 17:31     ` Thomas Graf
  0 siblings, 2 replies; 52+ messages in thread
From: Zdenek Radouch @ 2005-03-06 17:01 UTC (permalink / raw)
  To: Martin Mares; +Cc: netdev, linux-net

At 10:56 AM 3/6/05 +0100, Martin Mares wrote:
>Hello!
>
>> Unfortunately, this does not work with the Linux stack, because the
>> 127 net is treated (for good reasons I suppose) as a special net.
>
>Is it really?

Well, at least it looks that way to me:

    svfx:~# netstat -rn
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
    192.168.13.0    0.0.0.0         255.255.255.0   U         0 0          0 eth0
    0.0.0.0         192.168.13.254  0.0.0.0         UG        0 0          0 eth0
    svfx:~# 

Hmmm, I don't seem to have the loopback interface...
This table implies that 127.0.0.1 should go out via eth0
to a gateway 192.168.13.254.  That's hard to believe.

    svfx:~# ping -c 1 127.0.0.1
    PING 127.0.0.1 (127.0.0.1): 56 data bytes
    64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=0.2 ms

    --- 127.0.0.1 ping statistics ---
    1 packets transmitted, 1 packets received, 0% packet loss
    round-trip min/avg/max = 0.2/0.2/0.2 ms

Looks like I do have the loopback interface after all. It just
seems to be hidden, i.e., it actually is treated in a special way
by one of the entities I am perusing.
Let's see if I can delete the route anyway.

    svfx:~# route del -net 127.0.0.0 netmask 255.0.0.0 dev lo
    SIOCDELRT: No such process
    svfx:~# 

Looks like I can't, maybe it's not there?

>I've just tried
>
>	ip addr del 127.0.0.1/8 dev lo
>	ip addr add 127.0.0.1/24 dev lo
>
>and `ping 127.1.2.3' is then happily sent along the default route.
>

I don't have iproute around, so I will install it now.
...
and try your method:

    svfx:~# ip addr del 127.0.0.1/8 dev lo
    Cannot send dump request: Connection refused
    svfx:~# 

That actually looks like some compatibility issue if I had to guess.
I never used the iproute tools, so I'll ignore that for now.
[Does anyone know what this means?]

Something just crossed my mind - maybe the 127 processing
and/or the netstat/route/iproute tools are in flux, i.e., being
changed in a major way to the point that I really need to pay
attention to what kernel I am running.  I have done the above tests
on my "stable" machine, which runs 2.2.20 (common Debian stable
release). I'll go and retest everything on my embedded target
which is running the 2.4.25 kernel.

Can someone comment on the stability of the tools in question
or any implementation changes in this area that would explain
the above behavior?

But point well taken, perhaps I just need a bit more imagination
when I'm testing these things.  It may very well work, it just may
look like it does not.  Thanks for the suggestions!

Regards,
-Zdenek


* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-06 17:01   ` Zdenek Radouch
@ 2005-03-06 17:12     ` alex
  2005-03-06 17:31     ` Thomas Graf
  1 sibling, 0 replies; 52+ messages in thread
From: alex @ 2005-03-06 17:12 UTC (permalink / raw)
  To: Zdenek Radouch; +Cc: Martin Mares, netdev, linux-net

On Sun, 6 Mar 2005, Zdenek Radouch wrote:

> seems to be hidden, i.e., it actually is treated in a special way
> by one of the entities I am perusing.
> Let's see if I can delete the route anyway.
> 
>     svfx:~# route del -net 127.0.0.0 netmask 255.0.0.0 dev lo
>     SIOCDELRT: No such process
>     svfx:~# 
> 
> That actually looks like some compatibility issue if I had to guess.
> I never used the iproute tools, so I'll ignore that for now.
> [Does anyone know what this means?]
> and/or the netstat/route/iproute tools are in flux, i.e., being changed
> in a major way to the point that I really need to pay attention to what
> kernel I am running.  I have done the above tests on my "stable"
> machine, which runs 2.2.20 (common Debian stable release). I'll go and
> retest everything on my embedded target which is running the 2.4.25
> kernel.
That's like testing on a yugo. Make sure after upgrading to 2.4, you also 
get iproute2 toolchain.

> Can someone comment on the stability of the tools in question
> or any implementation changes in this area that would explain
> the above behavior?
On 2.4.27, once you delete 127.x address from the interface, traffic will 
go as expected to another route...



* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-06 17:01   ` Zdenek Radouch
  2005-03-06 17:12     ` alex
@ 2005-03-06 17:31     ` Thomas Graf
  2005-03-06 19:48       ` Zdenek Radouch
  1 sibling, 1 reply; 52+ messages in thread
From: Thomas Graf @ 2005-03-06 17:31 UTC (permalink / raw)
  To: Zdenek Radouch; +Cc: Martin Mares, netdev, linux-net

> Hmmm, I don't seem to have the loopback interface...
> This table implies that 127.0.0.1 should go out via eth0
> to a gateway 192.168.13.254.  That's hard to believe.

It's in the local table

tgr:axs ~ ip route list dev lo table local
broadcast 127.255.255.255  proto kernel  scope link  src 127.0.0.1 
broadcast 127.0.0.0  proto kernel  scope link  src 127.0.0.1 
local 127.0.0.1  proto kernel  scope host  src 127.0.0.1 
local 127.0.0.0/8  proto kernel  scope host  src 127.0.0.1 
tgr:axs ~ ip route del 127.0.0.1 dev lo table local
tgr:axs ~ ip route del 127.0.0.0/8 dev lo table local
tgr:axs ~ ip route get 127.0.0.1
127.0.0.1 via 192.168.23.13 dev eth0  src 192.168.23.1 
    cache  mtu 1500 advmss 1460 metric10 64
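
A rough model of what the commands above change: with the default
policy rules the "local" table is consulted before "main", which is why
the lo routes never show up in netstat -rn. The sketch below is not
kernel code; the table contents are reduced to the entries relevant
here, the main-table default route is assumed from the output above,
and resolve is a hypothetical helper.

```python
import ipaddress

def resolve(dst, tables, order=("local", "main")):
    """Walk the tables in rule order; longest-prefix match inside each."""
    addr = ipaddress.ip_address(dst)
    for name in order:
        best = None
        for prefix, target in tables[name]:
            net = ipaddress.ip_network(prefix)
            if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
                best = (net, target)
        if best is not None:
            return name, best[1]
    return None

main = [("0.0.0.0/0", "via 192.168.23.13 dev eth0")]  # assumed default route

# Before: both "local" routes for the 127 net are present.
tables = {
    "local": [("127.0.0.1/32", "dev lo"), ("127.0.0.0/8", "dev lo")],
    "main": main,
}
# After the two "ip route del ... table local" commands.
tables_after = {"local": [], "main": main}

print(resolve("127.0.0.1", tables))        # answered from the local table
print(resolve("127.0.0.1", tables_after))  # falls through to main
```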


* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-06 17:31     ` Thomas Graf
@ 2005-03-06 19:48       ` Zdenek Radouch
  2005-03-06 20:19         ` alex
  2005-03-06 20:19         ` Andi Kleen
  0 siblings, 2 replies; 52+ messages in thread
From: Zdenek Radouch @ 2005-03-06 19:48 UTC (permalink / raw)
  To: Thomas Graf; +Cc: Martin Mares, netdev, linux-net

OK, I think I am getting the picture.

1) looks like what I need may be possible, at least as far as
    some kernels are concerned.  It's not clear that 2.4.25 will work.

2) I only have to perform close to magic in locating the "right"
    tools that happen to work on a "right" kernel release.

3) Clearly the route processing is in flux, at least within the
    releases I am dealing with, so I need to be careful interpreting
    what I see, and I should avoid making any inferences.

There is no doubt that the 127.x net is treated in a special
way.  If I have to believe what I just learned, then the 127
routes are in a "local" table, a table on which the "route"
utility by definition does not operate!  On the 2.4.25 machine
I cannot get any of the "ip" commands to execute without
an error:

  $ ip route del 127.0.0.1 dev lo table local
  ip: either "to" is duplicate, or "table" is a garbage.

Since there was no "to" on the command line I suspect
the busybox crap to be doing something very bad.
I'll look at that.

To summarize, it appears that I can subnet the 127 net
by appropriately manipulating one or two kernel routing tables,
if I can find the right tools to do that.  If the tools don't work, then
getting the tools to work would be the necessary modifications
I would have to make on my machines to get the job done.

I'd like to thank everyone for their help.

Regards,
-Zdenek


* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-06 19:48       ` Zdenek Radouch
@ 2005-03-06 20:19         ` alex
  2005-03-06 20:19         ` Andi Kleen
  1 sibling, 0 replies; 52+ messages in thread
From: alex @ 2005-03-06 20:19 UTC (permalink / raw)
  To: Zdenek Radouch; +Cc: netdev, linux-net

On Sun, 6 Mar 2005, Zdenek Radouch wrote:

> 
> 1) looks like what I need may be possible, at least as far as
>     some kernels are concerned.  It's not clear that 2.4.25 will work.
It is clear it will.

> 2) I only have to perform close to magic in locating the "right"
>     tools that happen to work on a "right" kernel release.
Not really. Recent (as in, in past 3 years) tools and recent (as in, in 
past 3 years) kernel.
> 
> 3) Clearly the route processing is in flux, at least within the
>     releases I am dealing with, so I need to be careful interpreting
>     what I see, and I should avoid making any inferences.
No, not really.
> 
> There is no doubt that the 127.x net is treated in a special way.  If I
> have to believe what I just learned, then the 127 routes are in a
> "local" table, a table on which the "route" utility by definition does
> not operate!  On the 2.4.25 machine I cannot get any of the "ip"
> commands to execute without an error:
'Route' utility is by definition deprecated.

>   $ ip route del 127.0.0.1 dev lo table local
>   ip: either "to" is duplicate, or "table" is a garbage.
[root@bawx2 ~]# ip route del 127.0.0.1 dev lo table local
[root@bawx2 ~]#

And don't forget to delete the /8 route as well.

> Since there was no "to" on the command line I suspect the busybox crap
> to be doing something very bad. I'll look at that.
Don't try to use broken tools (busyboxed iproute2). Test with known-good
iproute2.

> To summarize, it appears that I can subnet the 127 net by appropriately
> manipulating one or two kernel routing tables, if I can find the right
> tools to do that.  If the tools don't work, then getting the tools to
> work would be the necessary modifications I would have to make on my
> machines to get the job done.
-alex



* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-06 19:48       ` Zdenek Radouch
  2005-03-06 20:19         ` alex
@ 2005-03-06 20:19         ` Andi Kleen
  2005-03-06 20:45           ` Thomas Graf
  1 sibling, 1 reply; 52+ messages in thread
From: Andi Kleen @ 2005-03-06 20:19 UTC (permalink / raw)
  To: Zdenek Radouch; +Cc: Martin Mares, netdev, linux-net

Zdenek Radouch <zdenek@rcn.com> writes:

> OK, I think I am getting the picture.
>
> 1) looks like what I need may be possible, at least as far as
>     some kernels are concerned.  It's not clear that 2.4.25 will work.
>
> 2) I only have to perform close to magic in locating the "right"
>     tools that happen to work on a "right" kernel release.

iproute2 has been the tool of choice since Linux 2.2.

ifconfig/route and the old ioctl interface have been only
there for compatibility and show only a small subset of 
the full functionality.

That has been true for many many years.

>
> 3) Clearly the route processing is in flux, at least within the
>     releases I am dealing with, so I need to be careful interpreting
>     what I see, and I should avoid making any inferences.

I don't think that's true. Routing hasn't changed much for a long time.

>
> There is no doubt that the 127.x net is treated in a special
> way.  If I have to believe what I just learned, then the 127

It is. 127.* is hardcoded in the routing engine and e.g.
it won't accept outside packets with a loopback address.

Most likely it's enough to change the "LOOPBACK" macro to allow
parts of the Class A to be used for other purposes.

-Andi 


* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-06 20:19         ` Andi Kleen
@ 2005-03-06 20:45           ` Thomas Graf
  2005-03-06 21:30             ` Andi Kleen
  2005-03-06 21:50             ` Zdenek Radouch
  0 siblings, 2 replies; 52+ messages in thread
From: Thomas Graf @ 2005-03-06 20:45 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Zdenek Radouch, Martin Mares, netdev, linux-net

* Andi Kleen <m1y8d0mss2.fsf@muc.de> 2005-03-06 21:19
> Zdenek Radouch <zdenek@rcn.com> writes:
> >
> > There is no doubt that the 127.x net is treated in a special
> > way.  If I have to believe what I just learned, then the 127
> 
> It is. 127.* is hardcoded in the routing engine and e.g.
> it won't accept outside packets with a loopback address.
> 
> Most likely it's enough to change the "LOOPBACK" macro to allow
> parts of the Class A to be used for other purposes.

Yes, it will work around the martian route and arp checks but
will probably break quite a few userspace applications.


* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-06 20:45           ` Thomas Graf
@ 2005-03-06 21:30             ` Andi Kleen
  2005-03-06 21:50               ` Thomas Graf
  2005-03-06 21:50             ` Zdenek Radouch
  1 sibling, 1 reply; 52+ messages in thread
From: Andi Kleen @ 2005-03-06 21:30 UTC (permalink / raw)
  To: Thomas Graf; +Cc: Zdenek Radouch, Martin Mares, netdev, linux-net

On Sun, Mar 06, 2005 at 09:45:16PM +0100, Thomas Graf wrote:
> * Andi Kleen <m1y8d0mss2.fsf@muc.de> 2005-03-06 21:19
> > Zdenek Radouch <zdenek@rcn.com> writes:
> > >
> > > There is no doubt that the 127.x net is treated in a special
> > > way.  If I have to believe what I just learned, then the 127
> > 
> > It is. 127.* is hardcoded in the routing engine and e.g.
> > it won't accept outside packets with a loopback address.
> > 
> > Most likely it's enough to change the "LOOPBACK" macro to allow
> > parts of the Class A to be used for other purposes.
> 
> Yes, it will work around the martian route and arp checks but
> will probably break quite a few userspace applications.

Unlikely. glibc has its own LOOPBACK() and all modern distributions
use separate kernel/user headers anyways.

-Andi


* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-06 21:30             ` Andi Kleen
@ 2005-03-06 21:50               ` Thomas Graf
  0 siblings, 0 replies; 52+ messages in thread
From: Thomas Graf @ 2005-03-06 21:50 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Zdenek Radouch, Martin Mares, netdev, linux-net

* Andi Kleen <20050306213047.GA65970@muc.de> 2005-03-06 22:30
> On Sun, Mar 06, 2005 at 09:45:16PM +0100, Thomas Graf wrote:
> > Yes, it will work around the martian route and arp checks but
> > will probably break quite a few userspace applications.
> 
> Unlikely. glibc has its own LOOPBACK() and all modern distributions
> use separate kernel/user headers anyways.

I was rather referring to the reduced loopback scope. I'm aware of
at least 3 applications that make extensive use of big portions of
the scope to multiplex streams and they depend on LOOPBACK() to make
sure the addresses they use will be looped back.

I agree that userspace has its own LOOPBACK macro in most cases but
this is exactly the problem, it may result in userspace assuming
certain addresses to be regarded as loopback by the kernel when they
won't. This of course heavily depends on how the LOOPBACK macro is
changed. I just wanted to point out that it may affect userspace
under certain circumstances.
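
To illustrate the mismatch: suppose, purely as an assumption, that the
kernel were narrowed to treat only 127.0.0.0/16 as loopback while
userspace keeps testing against the full /8. Python's own
ipaddress.is_loopback stands in for the userspace check here; the
narrowed prefix and the disagrees helper are hypothetical.

```python
import ipaddress

# Assumed result of narrowing the kernel's loopback test.
KERNEL_LOOPBACK = ipaddress.ip_network("127.0.0.0/16")

def disagrees(addr):
    """True if userspace calls addr loopback but the narrowed kernel would not."""
    a = ipaddress.ip_address(addr)
    return a.is_loopback and a not in KERNEL_LOOPBACK

# Number of addresses userspace would still assume are looped back.
STRANDED = (ipaddress.ip_network("127.0.0.0/8").num_addresses
            - KERNEL_LOOPBACK.num_addresses)

print(disagrees("127.0.0.1"))  # both sides agree
print(disagrees("127.1.0.1"))  # userspace says loopback, kernel would route it
print(STRANDED)
```

An application multiplexing streams over, say, 127.1.x.x would fall
into the second case.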


* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-06 20:45           ` Thomas Graf
  2005-03-06 21:30             ` Andi Kleen
@ 2005-03-06 21:50             ` Zdenek Radouch
  2005-03-07  7:01               ` Sumit Pandya
  2005-03-07  8:05               ` Eran Mann
  1 sibling, 2 replies; 52+ messages in thread
From: Zdenek Radouch @ 2005-03-06 21:50 UTC (permalink / raw)
  To: Thomas Graf, Andi Kleen; +Cc: Martin Mares, netdev, linux-net

OK.  We've come full circle, [except for a few digressions
along the lines of me not knowing that while the rest of the
world still uses 'route', under Linux it has long been deprecated]
you seem to be agreeing with my original guess that 
subnetting the 127 net may not be trivial, and that it may require
some kernel hacking.

So my original questions still stand:

1) How could one remove the special kernel treatment of the 127 net?
    [so that "lo" gets 127.0.0.1/16 and "foo" gets 127.1.0.1/16, and
    so that the "foo" interface can actually receive packets?]

2) If it does require kernel hacking, would you like to do it for me?
    (as I had said, as a contract)


>> it won't accept outside packets with a loopback address.

Not accepting packets with a loopback address is one
thing, not accepting any 127.0.0.0/8 packets is entirely something else.

Couldn't that whole 127 thing be ripped out of the kernel?
Why couldn't the "lo" interface be treated as any other interface?

-Zdenek





At 09:45 PM 3/6/05 +0100, Thomas Graf wrote:
>* Andi Kleen <m1y8d0mss2.fsf@muc.de> 2005-03-06 21:19
>> Zdenek Radouch <zdenek@rcn.com> writes:
>> >
>> > There is no doubt that the 127.x net is treated in a special
>> > way.  If I have to believe what I just learned, then the 127
>> 
>> It is. 127.* is hardcoded in the routing engine and e.g.
>> it won't accept outside packets with a loopback address.
>> 
>> Most likely it's enough to change the "LOOPBACK" macro to allow
>> parts of the Class A to be used for other purposes.
>
>Yes, it will work around the martian route and arp checks but
>will probably break quite a few userspace applications.
> 


* RE: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-06 21:50             ` Zdenek Radouch
@ 2005-03-07  7:01               ` Sumit Pandya
  2005-03-07  8:05               ` Eran Mann
  1 sibling, 0 replies; 52+ messages in thread
From: Sumit Pandya @ 2005-03-07  7:01 UTC (permalink / raw)
  To: Zdenek Radouch; +Cc: netdev, linux-net

Zdenek,
	I don't know how much help you can get from the "dummy" interface. Try to
work out your requirements with that special interface in mind.
-- Sumit

> -----Original Message-----
> From: linux-net-owner@vger.kernel.org
> [mailto:linux-net-owner@vger.kernel.org]On Behalf Of Zdenek Radouch
> Sent: Monday, March 07, 2005 3:21 AM

>
> OK.  We've come full circle, [except for a few digressions
> along the lines of me not knowing that while the rest of the
> world still uses 'route', under Linux it has long been deprecated]
> you seem to be agreeing with my original guess that
> subnetting the 127 net may not be trivial, and that it may require
> some kernel hacking.
>
> So my original questions still stand:
>
> 1) How could one remove the special kernel treatment of the 127 net?
>     [so that "lo" gets 127.0.0.1/16 and "foo" gets 127.1.0.1/16, and
>     so that the "foo" interface can actually receive packets?]
>
> 2) If it does require kernel hacking, would you like to do it for me?
>     (as I had said, as a contract)
>
>
> >> it won't accept outside packets with a loopback address.
>
> Not accepting packets with a loopback address is one
> thing, not accepting any 127.0.0.0/8 packets is entirely something else.
>
> Couldn't that whole 127 thing be ripped out of the kernel?
> Why couldn't the "lo" interface be treated as any other interface?



* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-06 21:50             ` Zdenek Radouch
  2005-03-07  7:01               ` Sumit Pandya
@ 2005-03-07  8:05               ` Eran Mann
  2005-03-07 12:14                 ` jamal
  2005-03-07 23:50                 ` jamal
  1 sibling, 2 replies; 52+ messages in thread
From: Eran Mann @ 2005-03-07  8:05 UTC (permalink / raw)
  To: Zdenek Radouch; +Cc: Thomas Graf, Andi Kleen, Martin Mares, netdev, linux-net

[-- Attachment #1: Type: text/plain, Size: 1383 bytes --]

Zdenek Radouch wrote:
...
> 
> 2) If it does require kernel hacking, would you like to do it for me?
>     (as I had said, as a contract)
I think what Andi Kleen was talking about below is something like the 
attached 5 minutes patch (applies cleanly to 2.4.2x kernels I have at 
hand, and to 2.6.11 with minor offset). Please donate the 5 minute wages 
to the OSDL or the FSF at your choice ;-)
...
> 
> Not accepting packets with a loopback address is one
> thing, not accepting any 127.0.0.0/8 packets is entirely something else.

Yes, however it seems to be required by the RFC (quoting RFC 3330 
"special use IPv4 addresses") :

"  127.0.0.0/8 - This block is assigned for use as the Internet host
    loopback address.  A datagram sent by a higher level protocol to an
    address anywhere within this block should loop back inside the host.
    This is ordinarily implemented using only 127.0.0.1/32 for loopback,
    but no addresses within this block should ever appear on any network
    anywhere [RFC1700, page 5]. "

>>* Andi Kleen <m1y8d0mss2.fsf@muc.de> 2005-03-06 21:19
>>
...
>>>
>>>It is. 127.* is hardcoded in the routing engine and e.g.
>>>it won't accept outside packets with a loopback address.
>>>
>>>Most likely it's enough to change the "LOOPBACK" macro to allow
>>>parts of the Class A to be used for other purposes.
...
-- 
Eran Mann
MRV International

[-- Attachment #2: lo_hack.patch --]
[-- Type: text/x-patch, Size: 969 bytes --]

--- 2.4.27/include/linux/in.h	2004-05-28 17:15:37.000000000 +0300
+++ 2.4.27.hacked/include/linux/in.h	2005-03-07 09:53:02.000000000 +0200
@@ -226,7 +226,7 @@
 
 /* Address to loopback in software to local host.  */
 #define	INADDR_LOOPBACK		0x7f000001	/* 127.0.0.1   */
-#define	IN_LOOPBACK(a)		((((long int) (a)) & 0xff000000) == 0x7f000000)
+#define	IN_LOOPBACK(a)		((((long int) (a)) & 0xffff0000) == 0x7f000000)
 
 /* Defines for Multicast INADDR */
 #define INADDR_UNSPEC_GROUP   	0xe0000000U	/* 224.0.0.0   */
@@ -240,7 +240,7 @@
 
 #ifdef __KERNEL__
 /* Some random defines to make it easier in the kernel.. */
-#define LOOPBACK(x)	(((x) & htonl(0xff000000)) == htonl(0x7f000000))
+#define LOOPBACK(x)	(((x) & htonl(0xffff0000)) == htonl(0x7f000000))
 #define MULTICAST(x)	(((x) & htonl(0xf0000000)) == htonl(0xe0000000))
 #define BADCLASS(x)	(((x) & htonl(0xf0000000)) == htonl(0xf0000000))
 #define ZERONET(x)	(((x) & htonl(0xff000000)) == htonl(0x00000000))
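
The effect of the mask change in the patch can be checked with a few
sample addresses. This is a sketch of the macro's arithmetic only: it
works on host-order integers, whereas the kernel macro applies htonl()
to its constants, and is_loopback is a hypothetical stand-in for
LOOPBACK().

```python
import ipaddress

OLD_MASK = 0xFF000000  # original macro: matches 127.0.0.0/8
NEW_MASK = 0xFFFF0000  # patched macro:  matches 127.0.0.0/16

def is_loopback(addr, mask):
    """Host-order equivalent of (x & mask) == 0x7f000000."""
    return (int(ipaddress.IPv4Address(addr)) & mask) == 0x7F000000

for addr in ("127.0.0.1", "127.0.255.254", "127.1.0.1", "10.0.0.1"):
    print(addr, is_loopback(addr, OLD_MASK), is_loopback(addr, NEW_MASK))
```

With the patched mask, 127.1.0.1 is no longer classified as loopback,
so 127.1.0.0/16 would no longer trip the checks that use the macro,
while 127.0.0.1 keeps its usual meaning.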


* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-07  8:05               ` Eran Mann
@ 2005-03-07 12:14                 ` jamal
  2005-03-07 23:50                 ` jamal
  1 sibling, 0 replies; 52+ messages in thread
From: jamal @ 2005-03-07 12:14 UTC (permalink / raw)
  To: Eran Mann
  Cc: Zdenek Radouch, Thomas Graf, Andi Kleen, Martin Mares, netdev,
	linux-net

On Mon, 2005-03-07 at 03:05, Eran Mann wrote:
> Zdenek Radouch wrote:
> ...
> > 
> > 2) If it does require kernel hacking, would you like to do it for me?
> >     (as I had said, as a contract)
> I think what Andi Kleen was talking about below is something like the 
> attached 5 minutes patch (applies cleanly to 2.4.2x kernels I have at 
> hand, and to 2.6.11 with minor offset). Please donate the 5 minute wages 
> to the OSDL or the FSF at your choice ;-)

That should do it. Or you can even return false in the macro always for
his case - since he will never have a lo device.

However, using these addresses is a BAD BAD idea. A lot of other
machines will be expecting 127.x to mean something special. I dont
think you should ask the poster for wages, he will suffer enough with
ARPs etc ;-> 

What is so wrong with RFC198 addresses??

cheers,
jamal



* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-07  8:05               ` Eran Mann
  2005-03-07 12:14                 ` jamal
@ 2005-03-07 23:50                 ` jamal
  2005-03-08  3:15                   ` Zdenek Radouch
  1 sibling, 1 reply; 52+ messages in thread
From: jamal @ 2005-03-07 23:50 UTC (permalink / raw)
  To: Steve Iribarne
  Cc: Eran Mann, Zdenek Radouch, Thomas Graf, Andi Kleen, Martin Mares,
	netdev, linux-net

BTW, please cc netdev or myself if you are addressing me. This email was
just forwarded by someone else to me - I am not on linux-net. You seem to
have trimmed down the CC list.

On Mon, 2005-03-07 at 18:02:18, Steve Iribarne wrote:
>-> 
>-> What is so wrong with RFC198 addresses??
>-> 

>Really RFC1918 you mean...

Indeed 1918

>Well if your product is placed behind a nat'd network, MOST if not ALL
> nat'd network addresses on the "inside" use the RFC1918 address space.  

I read this a few times and still didnt get it:
Why is it that people using 1918 addresses are affecting you?
Does using 127.x help you because you assume _nobody_ else would be using
127.x addresses? 
I am assuming you want this address for some internal network whereas the 
external contains some routable addresses?

> So I have this working in my products now.  I had to do something a bit
> different in that I want a "special" 127.xx.xx.xx range to be sent out
> on the wire.  So here is what I did.

[..]

Seems you did too much. Look at the 2 liner patch posted by Eran Mann
(which should work on 2.4 and 2.6 as well).

cheers,
jamal


* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-07 23:50                 ` jamal
@ 2005-03-08  3:15                   ` Zdenek Radouch
  2005-03-08 13:34                     ` jamal
  2005-03-08 14:02                     ` Thomas Graf
  0 siblings, 2 replies; 52+ messages in thread
From: Zdenek Radouch @ 2005-03-08  3:15 UTC (permalink / raw)
  To: hadi, Steve Iribarne
  Cc: Eran Mann, Thomas Graf, Andi Kleen, Martin Mares, netdev,
	linux-net

At 06:50 PM 3/7/05 -0500, jamal wrote:
>BTW, please cc netdev or myself if you are addressing me. This email was
>just forwarded by someone else to me - I am not on linux-net. You seem to
>have trimmed down the CC list.
>
>On Mon, 2005-03-07 at 18:02:18, Steve Iribarne wrote:
>>-> 
>>-> What is so wrong with RFC198 addresses??
>>-> 
>
>>Really RFC1918 you mean...
>
>Indeed 1918
>
>>Well if your product is placed behind a nat'd network, MOST if not ALL
>> nat'd network addresses on the "inside" use the RFC1918 address space.  
>
>I read this a few times and still didnt get it:
>Why is it that people using 1918 addresses are affecting you?

RFC 1918 trivializes the IP addressing by boxing
all hosts into either a "private" or "public" category,
based on their need to access the Internet.

The major thing the RFC misses is the fact that internal
to one of these "public" or "private" hosts, you may have
another, "even more private" network, for example one
that connects the cards within the chassis.  Such network
must be (for obvious reasons) completely hidden
from the outside, and thus cannot come from the
"outside" address space.  This "outside" space is a union
of the "public" and "private" IP addresses.
Guess what's left?  How 'bout 127.0.0.0.

-Z


* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-08  3:15                   ` Zdenek Radouch
@ 2005-03-08 13:34                     ` jamal
  2005-03-08 13:51                       ` Martin Mares
                                         ` (2 more replies)
  2005-03-08 14:02                     ` Thomas Graf
  1 sibling, 3 replies; 52+ messages in thread
From: jamal @ 2005-03-08 13:34 UTC (permalink / raw)
  To: Zdenek Radouch
  Cc: Steve Iribarne, Eran Mann, Thomas Graf, Andi Kleen, Martin Mares,
	netdev, linux-net

PS:- anyone not copying me in the responses while addressing me - i
didnt see your response.

On Mon, 2005-03-07 at 22:15, Zdenek Radouch wrote:

> RFC 1918 trivializes the IP addressing by boxing
> all hosts into either a "private" or "public" category,
> based on their need to access the Internet.
> 

sure. And the semantics are: dont route "private" addresses 
if they stray on the "public network". In other words, it is left to the
network setup to resolve this.

> The major thing the RFC misses is the fact that internal
> to one of these "public" or "private" hosts, you may have
> another, "even more private" network, for example one
> that connects the cards within the chassis.  

But why is this "even more private"?
Surely you can use 10.x addresses just fine within a chassis.
Just make sure the packets dont leak out (if thats what you so desire).
i.e set your routing properly.
Nothing makes 127.x addresses not usable in NATs or not be routable
once you start attaching them to non-hostlocal interfaces. 

> Such network
> must be (for obvious reasons) completely hidden
> from the outside, and thus cannot come from the
> "outside" address space.  This "outside" space is a union
> of the "public" and "private" IP addresses.
> Guess what's left?  How 'bout 127.0.0.0.
> 

Let's see, your requirements are:
a) packets within a chassis subnet shall stay within a chassis subnet
b) the outside (of the chassis) world shall never discover what's inside
the chassis (for example, ARPs will fail to resolve, etc.)

Did I miss anything else?

Seems to me you are relying on obscurity of 127.x to achieve goals which
you could achieve just as easily with a 10.x address or even a public
address. Is this correct? In other words, it doesn't matter what addresses
you use for the internal chassis network. What matters is how you set up the
route tables etc.
I respect your desire to use whatever address range, but show me one
thing I couldn't do with a 10.x in the chassis that you can now achieve
with a 127.x. I think this will bring some clarity for me.

cheers,
jamal

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-08 13:34                     ` jamal
@ 2005-03-08 13:51                       ` Martin Mares
  2005-03-08 13:58                         ` jamal
  2005-03-08 18:34                       ` Henrik Nordstrom
  2005-03-09  5:33                       ` Zdenek Radouch
  2 siblings, 1 reply; 52+ messages in thread
From: Martin Mares @ 2005-03-08 13:51 UTC (permalink / raw)
  To: jamal
  Cc: Zdenek Radouch, Steve Iribarne, Eran Mann, Thomas Graf,
	Andi Kleen, netdev, linux-net

Hello!

> Seems to me you are relying on obscurity of 127.x to achieve goals which
> you could achieve just as easily with a 10.x address or even a public
> address.

Using the same public block in all devices looks like the best solution.

				Have a nice fortnight
-- 
Martin `MJ' Mares   <mj@ucw.cz>   http://atrey.karlin.mff.cuni.cz/~mj/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
"Object orientation is in the mind, not in the compiler." -- Alan Cox

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-08 13:51                       ` Martin Mares
@ 2005-03-08 13:58                         ` jamal
  2005-03-08 14:03                           ` Martin Mares
  0 siblings, 1 reply; 52+ messages in thread
From: jamal @ 2005-03-08 13:58 UTC (permalink / raw)
  To: Martin Mares
  Cc: Zdenek Radouch, Steve Iribarne, Eran Mann, Thomas Graf,
	Andi Kleen, netdev, linux-net

On Tue, 2005-03-08 at 08:51, Martin Mares wrote:
> Hello!
> 
> > Seems to me you are relying on obscurity of 127.x to achieve goals which
> > you could achieve just as easily with a 10.x address or even a public
> > address.
> 
> Using the same public block in all devices looks like the best solution.
> 

People tend to use private blocks in a chassis (unique IP for each blade)
- which by default are not routed outside the chassis.

I have a feeling this is what you meant.

cheers,
jamal


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-08  3:15                   ` Zdenek Radouch
  2005-03-08 13:34                     ` jamal
@ 2005-03-08 14:02                     ` Thomas Graf
  1 sibling, 0 replies; 52+ messages in thread
From: Thomas Graf @ 2005-03-08 14:02 UTC (permalink / raw)
  To: Zdenek Radouch
  Cc: hadi, Steve Iribarne, Eran Mann, Andi Kleen, Martin Mares, netdev,
	linux-net

* Zdenek Radouch <3sp35g$7rsc1@smtp04.mrf.mail.rcn.net> 2005-03-07 22:15
> The major thing the RFC misses is the fact that internal
> to one of these "public" or "private" hosts, you may have
> another, "even more private" network, for example one
> that connects the cards within the chassis.  Such network
> must be (for obvious reasons) completely hidden
> from the outside, and thus cannot come from the
> "outside" address space.  This "outside" space is a union
> of the "public" and "private" IP addresses.
> Guess what's left?  How 'bout 127.0.0.0.

RFC 1918 is in no way related to 127/8; it simply suggests various
address spaces considered private, and the fact that its status
is only best practice makes it obvious that it has open issues
such as merging conflicts, so I'm not quite sure I understand
what you mean.

I think we all agree that having 127/8 fully routable in the local
table would be a good thing, although I haven't seen any use for it.

There are two major problems involved though:

  The kernel must know about its own local addresses for ARP, routing and
  various other reasons. This would not be a problem, because it could simply
  look up the route, but sometimes there is not enough information to do
  a full route lookup. This issue can be resolved with some effort though.
  It would get easier if policy routing were ignored for this purpose.

  Userspace must be told about the address and prefix of the loopback,
  which is done via the LOOPBACK() macro. Extracting parts of the
  address field is not a problem if userspace is recompiled, but making
  it dynamic is. It would mean changing all userspace applications
  relying on LOOPBACK() to use either netlink or ioctl. Even if this
  issue were resolved, it is still likely that certain
  userspace applications do not use LOOPBACK() and simply rely on the
  fact that 127/8 has host scope and is _always_ looped back.

Problem #2 can probably be ignored in some cases and left to the
operator to resolve.
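
To make the second problem concrete, here is a minimal sketch of the kind
of mask-and-compare test a loopback check performs, written in Python for
brevity. The name in_loopback_net and the two constants are illustrative
assumptions, not the kernel's definitions. The point is only that the
127/8 prefix is baked in as a constant, which is why making it configurable
would touch every caller.

```python
import ipaddress

# Assumed constants for illustration: the 127/8 prefix and its /8 mask.
LOOPBACK_PREFIX = 0x7F000000
LOOPBACK_MASK = 0xFF000000


def in_loopback_net(addr: str) -> bool:
    """Return True if the IPv4 address lies anywhere inside 127/8."""
    value = int(ipaddress.IPv4Address(addr))
    return (value & LOOPBACK_MASK) == LOOPBACK_PREFIX


if __name__ == "__main__":
    for a in ("127.0.0.1", "127.45.3.2", "10.0.0.1"):
        print(a, in_loopback_net(a))
```

Any application that hard-codes a test like this will keep treating a
subnetted 127.x internal network as host-local, whatever the routing table
says.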

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-08 13:58                         ` jamal
@ 2005-03-08 14:03                           ` Martin Mares
  2005-03-08 14:17                             ` jamal
  0 siblings, 1 reply; 52+ messages in thread
From: Martin Mares @ 2005-03-08 14:03 UTC (permalink / raw)
  To: jamal
  Cc: Zdenek Radouch, Steve Iribarne, Eran Mann, Thomas Graf,
	Andi Kleen, netdev, linux-net

Hello!

> People tend to use private blocks in a chassis (unique IP for each blade)
> - which by default are not routed outside the chassis.
> 
> I have a feeling this is what you meant

No, since this tends to interfere with the outside network using the
same private (RFC 1918) addresses. People generally don't expect network
equipment to collide with their perfectly legal addressing plan.

On the other hand, if the manufacturer gets a small block of public
addresses and uses it in all his devices (the same block everywhere)
for internal purposes only (no packet ever escapes), everything is
perfectly correct and no collisions can arise.

				Have a nice fortnight
-- 
Martin `MJ' Mares   <mj@ucw.cz>   http://atrey.karlin.mff.cuni.cz/~mj/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
"Computers are useless.  They can only give you answers."  -- Pablo Picasso

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-08 14:03                           ` Martin Mares
@ 2005-03-08 14:17                             ` jamal
  2005-03-08 14:20                               ` Martin Mares
  2005-03-08 18:40                               ` Henrik Nordstrom
  0 siblings, 2 replies; 52+ messages in thread
From: jamal @ 2005-03-08 14:17 UTC (permalink / raw)
  To: Martin Mares
  Cc: Zdenek Radouch, Steve Iribarne, Eran Mann, Thomas Graf,
	Andi Kleen, netdev, linux-net

On Tue, 2005-03-08 at 09:03, Martin Mares wrote:

> No, since this tends to interfere with the outside network using the
> same private (RFC 1918) addresses. People generally don't expect network
> equipment to collide with their perfectly legal addressing plan.
> 

Aha! Thanks for clarifying this. So the problem domain is:  "IP address
conflict" detection and somehow this is seen as a resolution to that
problem. 
So what happens when you put two or three of Zdenek's boxes in one
location? Back to square 1?

> On the other hand, if the manufacturer gets a small block of public
> addresses and uses it in all his devices (the same block everywhere)
> for internal purposes only (no packet ever escapes), everything is
> perfectly correct and no collisions can arise.

Yes, I see.
Except this wont be practical for IPV4 since those addresses are scarce.
May make sense for V6 though (becomes like MAC addresses on NICS).

cheers,
jamal


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-08 14:17                             ` jamal
@ 2005-03-08 14:20                               ` Martin Mares
  2005-03-08 18:40                               ` Henrik Nordstrom
  1 sibling, 0 replies; 52+ messages in thread
From: Martin Mares @ 2005-03-08 14:20 UTC (permalink / raw)
  To: jamal
  Cc: Zdenek Radouch, Steve Iribarne, Eran Mann, Thomas Graf,
	Andi Kleen, netdev, linux-net

Hello!

> So what happens when you put two or three of Zdenek's boxes in one
> location? Back to square 1?

No, if I understood Zdenek correctly, he wants to use the addresses
only internally inside the box, so multiple boxes should happily
co-exist. OTOH if the same address is used anywhere in the neighboring
network, it's going to break.

> Except this wont be practical for IPV4 since those addresses are scarce.

If the addresses are going to be used only internally, it suffices to
allocate only a small block of addresses and use this block for all
devices.

> May make sense for V6 though (becomes like MAC addresses on NICS).

Sure.

				Have a nice fortnight
-- 
Martin `MJ' Mares   <mj@ucw.cz>   http://atrey.karlin.mff.cuni.cz/~mj/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
"Mr. Worf, scan that ship."  "Aye, Captain... 600 DPI?"

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: Do you know the TCP stack? (127.x.x.x routing)
@ 2005-03-08 15:07 Steve Iribarne
  0 siblings, 0 replies; 52+ messages in thread
From: Steve Iribarne @ 2005-03-08 15:07 UTC (permalink / raw)
  To: hadi
  Cc: Eran Mann, Zdenek Radouch, Thomas Graf, Andi Kleen, Martin Mares,
	netdev, linux-net

-> BTW, please cc netdev or myself if you are addressing me. This email was
-> just forwarded by someone else to me - I am not on linux-net. You seem to
-> have trimmed down the CC list.
-> 

You should join the list and then quit when you are done.  Otherwise,
as with this email, I get multiple copies of it.

-> I read this a few times and still didnt get it:
-> Why is it that people using 1918 addresses are affecting you?
-> Does using 127.x help you because you assume _nobody_ else would be using
-> 127.x addresses?

I am in a chassis.  I need a way to do interface card communication.
Even if those cards are exposed to the outside world.  

-> I am assuming you want this address for some internal network whereas the
-> external contains some routable addresses?
->

Yep.

-> > So I have this working in my products now.  I had to do something a bit
-> > different in that I want a "special" 127.xx.xx.xx range to be sent out
-> > on the wire.  So here is what I did.
-> 
-> [..]
-> 
-> Seems you did too much. Look at the 2 liner patch posted by Eran Mann

Right. That works too.  But what I did was about 10 lines of code.  And
I refined it a bit better I believe.  That way packets destined for "my"
internal network got out the appropriate interface.  The rest go on
their merry way to the loopback world.




^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-08 13:34                     ` jamal
  2005-03-08 13:51                       ` Martin Mares
@ 2005-03-08 18:34                       ` Henrik Nordstrom
  2005-03-09  5:33                       ` Zdenek Radouch
  2 siblings, 0 replies; 52+ messages in thread
From: Henrik Nordstrom @ 2005-03-08 18:34 UTC (permalink / raw)
  To: jamal
  Cc: Zdenek Radouch, Steve Iribarne, Eran Mann, Thomas Graf,
	Andi Kleen, Martin Mares, netdev, linux-net

On Tue, 8 Mar 2005, jamal wrote:

> Lets see, your requirements are:
> a) packets within a chassis subnet shall stay within a chassis subnet
> b) the outside (of the chassis) world shall never discover whats inside
> the chassis (example ARPs will fail to resolve etc)

> Did i miss anything else?

Yes.

c) The chassis components must interact properly with other external 
equipment using public or RFC1918 addresses.

So yes, it may be possible to use RFC 1918 addresses, but only if the
network administrator connecting the chassis first enters an RFC 1918 network
he has reserved for "equipment internal" use into the chassis
configuration, to ensure the internal addressing within the chassis does
not conflict with the addressing used on his private network. This holds even
if these addresses are never seen anywhere outside of the chassis (except
possibly on diagnostics channels into the chassis).

Regards
Henrik

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-08 14:17                             ` jamal
  2005-03-08 14:20                               ` Martin Mares
@ 2005-03-08 18:40                               ` Henrik Nordstrom
  2005-03-08 21:17                                 ` jamal
  1 sibling, 1 reply; 52+ messages in thread
From: Henrik Nordstrom @ 2005-03-08 18:40 UTC (permalink / raw)
  To: jamal
  Cc: Martin Mares, Zdenek Radouch, Steve Iribarne, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

On Tue, 8 Mar 2005, jamal wrote:

> Aha! Thanks for clarifying this. So the problem domain is:  "IP address
> conflict" detection and somehow this is seen as a resolution to that
> problem.
> So what happens when you put two or three of Zdenek's boxes in one
> location? Back to square 1?

Not if the 127.X addresses never leave Zdenek's boxes, thinking
in terms of each set of boxes communicating using 127.X addresses as a
single chassis, seen as a single box by the network admin.

> Except this wont be practical for IPV4 since those addresses are scarce.
> May make sense for V6 though (becomes like MAC addresses on NICS).

IPv6 already has link-local addressing, IIRC.

Regards
Henrik

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-08 18:40                               ` Henrik Nordstrom
@ 2005-03-08 21:17                                 ` jamal
  2005-03-09  9:09                                   ` Henrik Nordstrom
  0 siblings, 1 reply; 52+ messages in thread
From: jamal @ 2005-03-08 21:17 UTC (permalink / raw)
  To: Henrik Nordstrom
  Cc: Martin Mares, Zdenek Radouch, Steve Iribarne, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

On Tue, 2005-03-08 at 13:40, Henrik Nordstrom wrote:
[..]
> Not if the 127.X addresses never leaves the Zdenek's boxes, when thinking 
> in terms that each set of boxes communicating using 127.X addresses is a 
> single chassis, seen as a single box to the network admin.
> 

Henrik, so what is the difference between this and using any random
block of addresses?;-> If the packets never leave the box i can use
IBM's block of addresses if i wanted - no need to sweat this far (with
hacking the kernel). 
If Zdenek is going to put in more than one box, then there's nothing magical;
he will have to sit down and configure one of the boxes manually - no
escape there.
If he puts in only a single box, then he may well get away with it.

> > Except this wont be practical for IPV4 since those addresses are scarce.
> > May make sense for V6 though (becomes like MAC addresses on NICS).
> 
> IPv6 already have link local addressing IIRC.
> 

Indeed, that is what is needed in this case if the problem is address
conflict resolution. An equivalent for v4 (called zeroconf) is at:
http://www.zeroconf.org/
It is unfortunate, though, because Apple has been claiming it has
patented this v4 link-local scheme - and if I recall, the person who wrote
the Linux code eventually took it off their web page (I can't even seem to
find the web page anymore).

cheers,
jamal

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-08 13:34                     ` jamal
  2005-03-08 13:51                       ` Martin Mares
  2005-03-08 18:34                       ` Henrik Nordstrom
@ 2005-03-09  5:33                       ` Zdenek Radouch
  2 siblings, 0 replies; 52+ messages in thread
From: Zdenek Radouch @ 2005-03-09  5:33 UTC (permalink / raw)
  To: hadi; +Cc: Eran Mann, Thomas Graf, Andi Kleen, Martin Mares, netdev,
	linux-net

At 08:34 AM 3/8/05 -0500, jamal wrote:
>PS:- anyone not copying me in the responses while addressing me - i
>didnt see your response.
>
>On Mon, 2005-03-07 at 22:15, Zdenek Radouch wrote:
>
>> RFC 1918 trivializes the IP addressing by boxing
>> all hosts into either a "private" or "public" category,
>> based on their need to access the Internet.
>> 
>
>sure. And the semantics are: dont route "private" addresses 
>if they stray on the "public network". In other words, it is left to the
>network setup to resolve this.
>
>> The major thing the RFC misses is the fact that internal
>> to one of these "public" or "private" hosts, you may have
>> another, "even more private" network, for example one
>> that connects the cards within the chassis.  
>
>But why is this more "even more private"?

Because the hosting device may be sitting on a "private" net,
with which you don't want to interfere.

>Surely you can use 10.x addresses just fine within a chassis.

Not if the admin's SNMP/CLI client machine  lives on a 10.x net.

>Nothing makes 127.x addresses not usable in NATs or not be routable
>once you start attaching them to non-hostlocal interfaces. 

That's true (if I got the multiple negatives right ;-))
But what's the point you're trying to make?

>
>> Such network
>> must be (for obvious reasons) completely hidden
>> from the outside, and thus cannot come from the
>> "outside" address space.  This "outside" space is a union
>> of the "public" and "private" IP addresses.
>> Guess what's left?  How 'bout 127.0.0.0.
>> 
>
>Lets see, your requirements are:
>a) packets within a chassis subnet shall stay within a chassis subnet
>b) the outside (of the chassis) world shall never discover whats inside 
>the chassis (example ARPs will fail to resolve etc)
>
>Did i miss anything else?

Yes, a fundamental point.  The "outside" of the chassis is your
customer's network. The only thing you know about that
network is that it is *not* 127.x.  Consequently, if you don't
want to interfere with the outside you must use 127.x.

>
>Seems to me you are relying on obscurity of 127.x

As I said previously and have shown above, 127 is the only one left;
it has not been randomly selected.

> to achieve goals which
>you could achieve just as easily with a 10.x address or even a public
>address. Is this correct?

No. 

> In otherwords it doesnt matter what addresses
>you use for internal chassis. What matters is how you set the route
>tables etc.
>I respect your desire to use whatever address range, but show me one
>thing I couldn't do with a 10.x in the chassis that you can now achieve
>with a 127.x .. I think this will bring some clarity for me.
>

You couldn't walk in the NOC and tell them: "You can't use the 10.x
net to manage your equipment - my box is already using that net".

As a few people already pointed out, subnetting the 127 net
is a common practice if you are making multi-card communication
equipment, especially routers.  

Often, these systems must be able to communicate with the
external world, either as "public" hosts, or as "private", i.e.,
NAT'd hosts.  Because of this, the internal networks may not
ever have either public or RFC 1918 addresses.
For the same reason, the internal network cannot ever be
"configurable", since the configured address/net would
become inaccessible on the outside (it would be routed
to the internal network). Note that this has nothing to
do with the fact that the 127 address "never leaves the box".

Hope this clarifies the issue.

-Z

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-08 21:17                                 ` jamal
@ 2005-03-09  9:09                                   ` Henrik Nordstrom
  2005-03-09 12:39                                     ` jamal
  0 siblings, 1 reply; 52+ messages in thread
From: Henrik Nordstrom @ 2005-03-09  9:09 UTC (permalink / raw)
  To: jamal
  Cc: Martin Mares, Zdenek Radouch, Steve Iribarne, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

On Tue, 8 Mar 2005, jamal wrote:

> Henrik, so what is the difference between this and using any random
> block of addresses?;-> If the packets never leave the box i can use
> IBM's block of addresses if i wanted - no need to sweat this far (with
> hacking the kernel).

Not if you want to maintain sane routing tables within the box and still
allow IBM to connect the box to their network. Some components of
the box will need to sit in both the external and internal environments.

> If Zdenek is going to put more than one box then theres nothing magical;
> he will have to sit down and configure one of the boxes manually - no
> escape there.

No, as the packets never leave his box in the first place, there is no
problem with multiple boxes. They will never share the internal network
segment where the addresses are seen.

He is building a multi-node box (single box, multiple internal nodes, some
external interfaces) using TCP/IP for the internal communication between
the nodes within the box. For this communication he proposes the use of a
part of the 127/8 address space, but only for the communication within his
multi-node box, not for communication visible outside of the box.

> If he puts only a single box then he may likely get away with it.
>
>>> Except this wont be practical for IPV4 since those addresses are scarce.
>>> May make sense for V6 though (becomes like MAC addresses on NICS).
>>
>> IPv6 already have link local addressing IIRC.
>>
>
> indeed that is what is needed in this case if the problem is address
> conflict resolution. An equivalent for v4 (called zeroconf) is at:
> http://www.zeroconf.org/

Unfortunately this does not apply to multihomed hosts, but it provides an
interesting address range which may be usable as an alternative to 127.X
for the discussed purpose, assuming hosts on the local network outside of
the box are not using IPv4LL addresses in communication involving the box.
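
The address ranges that keep coming up in this thread can be told apart
with a few prefix tests. A minimal sketch in Python follows. The classify
helper and its labels are illustrative assumptions. 169.254/16 is the IPv4
link-local (IPv4LL) range and the three RFC 1918 blocks are the "private"
ones.

```python
import ipaddress

LOOPBACK = ipaddress.ip_network("127.0.0.0/8")
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")  # IPv4LL
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def classify(addr: str) -> str:
    """Label an IPv4 address by the special-use block it falls in, if any."""
    ip = ipaddress.ip_address(addr)
    if ip in LOOPBACK:
        return "loopback"
    if ip in LINK_LOCAL:
        return "link-local"
    if any(ip in net for net in RFC1918):
        return "rfc1918"
    return "other"


if __name__ == "__main__":
    for a in ("127.1.0.5", "169.254.10.1", "10.1.2.3", "18.7.22.69"):
        print(a, classify(a))
```

Only the first two labels describe ranges that are not expected to be
forwarded by routers, which is what makes them candidates for an internal
interconnect in this discussion.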

Regards
Henrik

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09  9:09                                   ` Henrik Nordstrom
@ 2005-03-09 12:39                                     ` jamal
  2005-03-09 13:39                                       ` Zdenek Radouch
  2005-03-09 22:34                                       ` Henrik Nordstrom
  0 siblings, 2 replies; 52+ messages in thread
From: jamal @ 2005-03-09 12:39 UTC (permalink / raw)
  To: Henrik Nordstrom
  Cc: Martin Mares, Zdenek Radouch, Steve Iribarne, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

Zdenek - This includes a response to your email as well.

On Wed, 2005-03-09 at 04:09, Henrik Nordstrom wrote:
> On Tue, 8 Mar 2005, jamal wrote:
> 
> > Henrik, so what is the difference between this and using any random
> > block of addresses?;-> If the packets never leave the box i can use
> > IBM's block of addresses if i wanted - no need to sweat this far (with
> > hacking the kernel).
> 
> Not if you want to maintain sane routing tables within the box and still 
> be able for IBM to connect the box to their network. Some components of 
> the box will need to sit both in the external and internal environments.
> 

For the record i have built or helped build many many such boxes... 

I am afraid this 127.x panacea is beginning to sound like the tale of
some insane emperor who was naked while the people around him sucked up to
him, telling him how fine his clothes looked. I am having a very hard
time seeing the rationale - in fact it's driving me nuts, so please bear
with me.

Let's list the options, assuming there are two sets of addresses, those
for inside the chassis and those for outside:

1) Addresses for intra-chassis communication.
The addresses used by the blades are relevant only within the chassis and the
packets never leave the box. The blades are interconnected via some
L2/VLAN/bridge within the chassis.

Conclusion:
If these packets never leave the box - no ARP will ever see them and no
dynamic routing protocol will ever advertise them - there is no IP
address collision. You can use _whatever_ address you want: private,
public, IBM's, Intel's, etc. Do we agree on this? In other words, the hack
is not needed here.

2) The addresses for chassis-outside world communication. You have one or
more dedicated gateways to connect the outside of the chassis to the
inside.
There are many tricks you could use to somehow get the packets to/from
the internal blades: NAT, forwarding, having aliases inside the chassis which
get forwarded, etc. Let's not discuss how the packets finally
make it outside; rather, just assume these packets make it outside the
chassis, then let's explore using either 127.x or RFC 1918 addresses.

a) Using private addresses implies the possibility of an address conflict
within the customer's network. To quote Zdenek:
You couldn't walk in the NOC and tell them: "You can't use the 10.x
net to manage your equipment - my box is already using that net".
Conclusion:
You walk into the NOC and say "can I use the 10.0.0.x/22 subnet?" and they say
"no, that's going to collide, use 10.0.0.0/28".
Summary: You may need to go to your box and reconfigure its external-facing
addresses.

a') Using 127.x addresses. You -> NOC: "can I use the 127.0.0.x/22 subnet?"
They say either "sorry, our routers can't route 127.x" or "no, Zdenek
was here before you, that's going to collide, use 127.0.0.0/28".

Same conclusion as 2a).
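
The NOC exchange in 2a) boils down to a prefix-overlap check. A minimal
sketch in Python, assuming a made-up list of subnets already in use at the
site (SITE_NETS and collisions are hypothetical names, and 10.0.0.0/22
stands in for the "10.0.0.x/22" above):

```python
import ipaddress

# Hypothetical subnets the NOC already has in use.
SITE_NETS = ["10.0.1.0/24", "192.168.0.0/16"]


def collisions(candidate: str, in_use=SITE_NETS):
    """Return the in-use subnets that overlap the candidate subnet."""
    cand = ipaddress.ip_network(candidate)
    return [n for n in in_use if cand.overlaps(ipaddress.ip_network(n))]


if __name__ == "__main__":
    print(collisions("10.0.0.0/22"))  # ['10.0.1.0/24']
    print(collisions("10.0.0.0/28"))  # []
```

The same check applies unchanged to a 127.x candidate, which is the point
of comparing 2a) and 2a').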

Do you see the problem? I don't see the difference between 2a) and 2a').
I also don't see the reason you need 127.x for 1), since you could have
used any address for the intra-chassis network (I have seen people use many
different addresses).

So tell me what i am missing!

cheers,
jamal

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09 12:39                                     ` jamal
@ 2005-03-09 13:39                                       ` Zdenek Radouch
  2005-03-09 14:18                                         ` jamal
  2005-03-09 22:34                                       ` Henrik Nordstrom
  1 sibling, 1 reply; 52+ messages in thread
From: Zdenek Radouch @ 2005-03-09 13:39 UTC (permalink / raw)
  To: hadi, Henrik Nordstrom
  Cc: Martin Mares, Eran Mann, Thomas Graf, Andi Kleen, netdev,
	linux-net

At 07:39 AM 3/9/05 -0500, jamal wrote:
>...
>Lets list the options and assume there are two sets of addresses those
>for inside the chassis and those for outside:
>
>1) Addresses for intra-chassis communication.
>The addresses used by the blades are intra-chassis relevant only and the
>packets never leave the box. The blades are interconnected via some
>L2/VLAN/bridge within the chassis. 
>
>Conclusion:
>If these packets never leave the box - no ARP will ever see them and no
>dynamic routing protocol will ever advertise them - therefore no IP
>address collision. You can use _whatever_ address you want, private
>public, IBMs, intels etc. Do we agree on this? In other words hack not
>needed here.

No, I do not agree - you really need to re-read my last post carefully,
making sure you understand what I am saying.  Other people have
illustrated the problem as well.

Imagine a simple gateway, connecting two parts of your company - the east
interface connects to a corporate net with a default gateway, the west net
is the software dept. net.  Now imagine that you give your internal line card
in this simple gateway a "_whatever_" address, say 18.7.22.69.
Your gateway now has a route 18.7.22.69/32 -> dev linecard
Now please tell me what happens when a guy on the west net tries
to check his MIT evening class schedule.
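
The effect can be sketched with a toy longest-prefix-match lookup in
Python. This assumes a plain forwarding table with no policy routing. The
lookup helper, the ROUTES table and the device names "east" and "linecard"
are illustrative assumptions.

```python
import ipaddress


def lookup(routes, dst):
    """Longest-prefix match: return the device of the most specific route."""
    ip = ipaddress.ip_address(dst)
    matches = [(net, dev) for net, dev in routes if ip in net]
    if not matches:
        return None
    # The route with the longest prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical table for the gateway in the example above.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "east"),          # default route
    (ipaddress.ip_network("18.7.22.69/32"), "linecard"),  # internal host route
]

if __name__ == "__main__":
    print(lookup(ROUTES, "18.7.22.69"))  # linecard
    print(lookup(ROUTES, "18.7.22.70"))  # east
```

With this table, traffic from the west net to 18.7.22.69 is handed to the
line card instead of being forwarded out the east interface, so the real
host at that address becomes unreachable through the gateway.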

>a) using private addresses implies possibility of conflict of addresses
>within customer's  network. To quote Zdenek: 
>You couldn't walk in the NOC and tell them: "You can't use the 10.x
>net to manage your equipment - my box is already using that net".
>Conclusion:
>You walk into the NOC and say "can i use 10.0.0.x/22 subnet" they say "no
>thats going to collide use 10.0.0.0/28"

In the real world, where you pay for addresses and for people's time, no one
will give you *their* address for *your* interconnect. Not a public address,
and not an RFC 1918 address.  Your interconnect is your problem;
they are neither interested in nor paid to deal with your design issues.

>a') Using 127.x addresses. You -> NOC "can i use 127.0.0.x/22 subnet"

I know I can use it.  I own it as per RFC 3330.

>So tell me what i am missing!

Experience of having built a router.  Sorry to be so blunt.

-Zdenek

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09 13:39                                       ` Zdenek Radouch
@ 2005-03-09 14:18                                         ` jamal
  2005-03-09 16:46                                           ` Jason Lunz
  2005-03-09 17:52                                           ` Matt Mackall
  0 siblings, 2 replies; 52+ messages in thread
From: jamal @ 2005-03-09 14:18 UTC (permalink / raw)
  To: Zdenek Radouch
  Cc: Henrik Nordstrom, Martin Mares, Eran Mann, Thomas Graf,
	Andi Kleen, netdev, linux-net

On Wed, 2005-03-09 at 08:39, Zdenek Radouch wrote:
> At 07:39 AM 3/9/05 -0500, jamal wrote:

[..]
> Imagine a simple gateway, connecting two parts of your company
>  - the east
> interface connects to a corporate net with a default gateway, the west net
> is the software dept. net.  Now imagine that you give your internal line card
> in this simple gateway a "_whatever_" address, say 18.7.22.69.
> Your gateway now has a route 18.7.22.69/32 -> dev linecard
> Now please tell me what happens when a guy on the west net tries
> to check his MIT evening class schedule.

Are we still talking about the same problem? The line cards' addresses and
interconnect interfaces are "internal". They are never advertised or seen
outside of the chassis. So if you choose 18.7.22.69/32 to use internally,
you make sure it is never advertised to the outside world as belonging
to you. If you have to advertise it or actually know it is used, then
you must deal with the conflict.
Of course, there are "externally" visible addresses which are seen
outside the chassis. How you end up connecting the internal
line cards to each other is your problem - let's say there is more than one
way, and in fact you may never even need to use IP.

> >a) using private addresses implies possibility of conflict of addresses
> >within customer's  network. To quote Zdenek: 
> >You couldn't walk in the NOC and tell them: "You can't use the 10.x
> >net to manage your equipment - my box is already using that net".
> >Conclusion:
> >You walk into the NOC and say "can i use 10.0.0.x/22 subnet" they say "no
> >thats going to collide use 10.0.0.0/28"
> 
> In real world, where you pay for addresses and for people's time, no one
> will give you *their* address for *your* interconnect. Not a public address,
> and not a RFC1918 address.  Your interconnect is your problem,
> they are neither interested nor paid to deal with your design issues.
> 

I dont think i was saying anything different. 

> 
> >a') Using 127.x addresses. You -> NOC "can i use 127.0.0.x/22 subnet"
> 
> I know I can use it.  I own it as per RFC 3330.
> 

RFC 3330 does not give you any rights to use it the way you are.

To quote RFC 3330:
--
A datagram sent by a higher level protocol to an
   address anywhere within this block should loop back inside the host.
---
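
As a small illustration of the sentence quoted above (a sketch, not part of the RFC or of anyone's product): Python's standard ipaddress module treats every address in 127.0.0.0/8 as loopback, not just 127.0.0.1. The sample addresses are only examples taken from this thread.

```python
import ipaddress

# The whole /8 is the loopback block, not only 127.0.0.1.
LOOPBACK_BLOCK = ipaddress.ip_network("127.0.0.0/8")

def is_in_loopback_block(addr: str) -> bool:
    """Return True if addr falls anywhere inside 127.0.0.0/8."""
    return ipaddress.ip_address(addr) in LOOPBACK_BLOCK

for sample in ("127.0.0.1", "127.100.0.5", "10.100.0.1", "18.7.22.69"):
    # Print our own check next to the library's is_loopback flag.
    print(sample, is_in_loopback_block(sample),
          ipaddress.ip_address(sample).is_loopback)
```

A stack that follows this text literally will loop such packets back inside the host, which is why a subnet of 127/8 cannot simply be assigned to an Ethernet interface without changing the stack.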

> >So tell me what i am missing!
> 
> Experience of having built a router.  Sorry to be so blunt.
> 

You'd be the first person to ever accuse me of that.  Let's say
I have never needed this hack; the key is to be able to clearly
demarcate what are _internal and external interfaces as well as what are
internal and external IP addresses_. Yes, you can do it with Linux.
What you seem to be counting on is being the only person ever
using the hack. In other words, survival via obscurity.
If the router upstream from you used the same hack, you would end up in
trouble.

You seem to be getting angry; it may be time to end the discussion.

cheers,
jamal

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: Do you know the TCP stack? (127.x.x.x routing)
@ 2005-03-09 15:01 Steve Iribarne
  2005-03-09 16:00 ` jamal
  2005-03-10  6:48 ` Catalin(ux aka Dino) BOIE
  0 siblings, 2 replies; 52+ messages in thread
From: Steve Iribarne @ 2005-03-09 15:01 UTC (permalink / raw)
  To: hadi, Henrik Nordstrom
  Cc: Martin Mares, Zdenek Radouch, Eran Mann, Thomas Graf, Andi Kleen,
	netdev, linux-net

First off, apologies for all the cc's on this.  I hate doing it, but
I will only do it for this post!

-> 1) Addresses for intra-chasis communication.
-> The addresses used by the blades are intrachasis relevant only and the
-> packets never leave the box. The blades are interconnected via some
-> L2/VLAN/bridge within the chasis.
-> 

Big assumption here.  The VLAN/Bridge/Router that I have in my chassis
is hooked up to a switch.  The switch will NOT send the packets on my
mgmt VLAN out over the network.  

(see below for more details on this, in the "what am I missing" section)

-> Conclusion:
-> If these packets never leave the box - no ARP will ever see them and no
-> dynamic routing protocol will ever advertise them - therefore no IP
-> address collision. You can use _whatever_ address you want, private
-> public, IBMs, intels etc. Do we agree on this? In other words hack not
-> needed here.

Wrong.  Packets need to leave each blade.  You cannot treat the blades
as a private entity.  You must ARP to find out the other blades' MAC
addresses.

-> 
-> 2) The addresses for chasis-outside world communication. You have one or
-> more dedicated gateways to connect between the outside of the chasis to
-> inside.
-> There are many tricks you could use to somehow get the packets to/from
-> the internal blades: NAT, forward, have aliases inside the chasis which
-> get forwarded etc. Lets not discuss about how the the packets finaly
-> make it outside, rather just assume these packets make it outside the
-> chasis then lets explore using either 127.x or RFC1918 addresses.
-> 
-> a) using private addresses implies possibility of conflict of addresses
-> within customer's  network. To quote Zdenek:
-> You couldn't walk in the NOC and tell them: "You can't use the 10.x
-> net to manage your equipment - my box is already using that net".
-> Conclusion:
-> You walk into the NOC and say "can i use 10.0.0.x/22 subnet" they say "no
-> thats going to collide use 10.0.0.0/28"
-> Summary: You may need to go to your box and reconfigure its external
-> looking addresses.
-> 

I _used_ to do exactly what you stated above.  When RFC 1918 first came
out I used the 10 net.  

1st bug:  Customer had the same 10.100.xx.xx/24 net that I had and my
inter-system communication wouldn't work, because all my routes got
screwed up.  (i.e the SNMP sub-agents couldn't talk with the master). 

1st response to bug:  Well can you use another network address range?
Customer response:  Hell no.

Solution to bug1:  Easy, let the user configure the mgmt network ip
address.
Customer's answer to bug1 solution:  Get the hell out of here; you don't
do out-of-band mgmt.  Do you know what a security risk this is for me?
Blah blah blah....  Even though all inter-chassis communication was done
securely, I couldn't convince them. I had a customer boot me out of his
office and boot our company out **because** of my design.  Not a good
feeling.

-> a') Using 127.x addresses. You -> NOC "can i use 127.0.0.x/22 subnet"
-> they say either "sorry, our routers cant route 127.x" or "no Zdenek
-> was here before you, thats going to collide use 127.0.0.0/28"
-> 

This is __EXACTLY__ the behavior we want.  I want routers to drop those
packets.  My inter-chassis communication better NOT go through a router.

-> So tell me what i am missing!
-> 

Experience.  You are missing a big key factor.  The routing part of what
you are saying is sound.  The big thing you are not getting is how the
"applications", telnet, snmp, ssh, Linux-HA, etc.. will interact with
your system.  You do NOT want to rewrite those applications to have some
knowledge of your system.  They want to connect to an IP address and
that better work (off-the-shelf).  

Therefore,

As a kernel programmer, it's easier for me to make sure the 127.xx net
works and sends the 127.xx packets to the proper network.

In conclusion:

It seems that Zdenek and I have been down this road _many_ times.  I
have shipped over 10 different routers/chassis systems.  I speak from
experience and experience alone.  I don't claim to be the smartest
person in the world, but I know what works.

This post started with a simple question of "can I do this".  The answer
I believe has been posted a long time ago.  I am not about to change the
way that I do my inter-chassis communications until the IEEE or RFC
community give me an address change for inter-chassis communication.
(which I believe is coming down the road).

And would you please subscribe to the list so I don't have to cc the
world every time?

Thanks.

-stv

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09 15:01 Steve Iribarne
@ 2005-03-09 16:00 ` jamal
  2005-03-10  6:48 ` Catalin(ux aka Dino) BOIE
  1 sibling, 0 replies; 52+ messages in thread
From: jamal @ 2005-03-09 16:00 UTC (permalink / raw)
  To: Steve Iribarne
  Cc: Henrik Nordstrom, Martin Mares, Zdenek Radouch, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

On Wed, 2005-03-09 at 10:01, Steve Iribarne wrote:
> First off, apologies for the all the cc's on this.  I hate doing it, but
> I will only do it for this post!
> 

I am not on linux-net - if you insist that i join just so i can see your
post then you are being unreasonable. I am not on Linux kernel either. 

There are other reasons why multiple CCs are useful. Sometimes the list never
echoes back the response - case in point: my post this morning that was
responded to by Zdenek has not been echoed up to this point on netdev - it may
show up sometime tonight.

> -> 1) Addresses for intra-chasis communication.
> -> The addresses used by the blades are intrachasis relevant only and the
> -> packets never leave the box. The blades are interconnected via some
> -> L2/VLAN/bridge within the chasis.
> -> 
> 
> Big assumption here.  The VLAN/Bridge/Router that I have in my chassis
> is hooked up to a switch.  The switch will NOT send the packets on my
> mgmt VLAN out over the network.  
> 
> (see below for more details on this.. in the "what am I missing" section)
> 

Your blades --> VLANX/SubnetX 
     --> [some L3 switch] 
             -->VLANY/SubnetY 
                    -->outside

The blades' discovery etc. happens within the broadcast domain of VLANX. 
To go across from VLANX<->VLANY you may need either to L3 forward, NAT,
tunnel etc. If you do pure L3 forwarding then your blades' addresses are
accessible outside. 
In other words all this is a config choice.
You may have more than one VLAN for management etc. within your blades
but that's beside the point.

> 
> -> Conclusion:
> -> If these packets never leave the box - no ARP will ever see them and no
> -> dynamic routing protocol will ever advertise them - therefore no IP
> -> address collision. You can use _whatever_ address you want, private
> -> public, IBMs, intels etc. Do we agree on this? In other words hack not
> -> needed here.
> 
> Wrong.  Packets need to leave each blade.  You cannot treat the blades
> as a private entity.  You must ARP to find out the other blades MAC
> address.
> 

Read what i wrote again and cross reference with the diagram. ARP is
only L2 switched. It would be wise to configure the blade IP addresses
to be within the same subnet - in which case the only route you need on
your blades is a link scope one and perhaps a default GW pointing to
your L3 device.

> -> thats going to collide use 10.0.0.0/28"
> -> Summary: You may need to go to your box and reconfigure its external
> -> looking addresses.
> -> 
> 
> I _use_ to do exactly what you stated above.  When RFC 1918 first came
> out I used the 10 net.  

[..]

> Solution to bug1:  Easy, let the user configure the mgmt network ip
> address.
> Customers answer to bug1 solution:  Get the hell out of here; you don't
> do out-of-band mgmt.  Do you know what a security risk this is for me?
> Blah blah blah....  Even though all inter-chassis communication was done
> securely, I couldn't convince them. I had a customer boot me out of his
> office and boot our company out **because** of my design.  Not a good
> feeling.

A customer should be able to say, "here's an address you can use for
management". The rest of it is your problem really. There are no bugs,
but there are config issues.

> 
> -> a') Using 127.x addresses. You -> NOC "can i use 127.0.0.x/22 subnet"
> -> they say either "sorry, our routers cant route 127.x" or "no Zdenek
> -> was here before you, thats going to collide use 127.0.0.0/28"
> -> 
> 
> This is __EXACTLY__ the behavior we want.  I want routers to drop those
> packets.  My inter-chassis communication better NOT go through a router.
> 

That traffic does not go through a router at all (other than the one
in your chassis which may be used to do L3). Let me draw that diagram
again:

  Your blades --> VLANX/SubnetX 
     --> [some L3 switch] 
             -->VLANY/SubnetY 
                    -->outside

i.e. the only way it would go out is if you allowed it at the L3 switch
or NAT device etc.

So let me quote you above:

---
I _use_ to do exactly what you stated above.  When RFC 1918 first came
out I used the 10 net.  
---

It's just a matter of time before you say "oh, that's what i do now for
127.x". This is the point i have been trying to make all along.

> -> So tell me what i am missing!
> -> 
> 
> Experience.  

I think you are making a very big assumption ;-> Please don't go down this
path unless you wish to end this thread.

Btw, i do believe what you and Zdenek are trying to solve are _very_
different problems. He is trying to build a distributed router of some
form; i.e. his blades are in fact line cards where traffic comes in.
You on the other hand seem to have the blades doing compute (i.e. they
are not router line cards). 

The point is this: Whatever you folks are doing, probably inherited from
some other projects, more than likely using some other OS, is not
necessary in Linux. I respect your desire to use those addresses if it
makes you comfortable - I just vehemently disagree that it is needed.
So i hope you don't show up with the patch and ask for its inclusion.

cheers,
jamal

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09 14:18                                         ` jamal
@ 2005-03-09 16:46                                           ` Jason Lunz
  2005-03-10 10:10                                             ` Henrik Nordstrom
  2005-03-09 17:52                                           ` Matt Mackall
  1 sibling, 1 reply; 52+ messages in thread
From: Jason Lunz @ 2005-03-09 16:46 UTC (permalink / raw)
  To: netdev; +Cc: linux-net

hadi@cyberus.ca said:
> Are we still talking about the same problem? The linecards addresses
> and interconnect interfaces are "internal". They are never
> advertised/seen outside of the chasis. So if you choose 18.7.22.69/32
> to use internally you make sure it is never advertised to the outside
> world as belonging to you. If you have to advertise it or actually
> know it is used, then you must deal with the conflict.

I think you're both in agreement, however violently you try not to be.
The question, though, is: *How* do you configure the nodes within the
chassis such that the internal IPs (whatever they are) _stay_ internal,
and any non-127/8 addressing can be used for the external interfaces?

I've done something similar, for example, using policy routing and the
arp sysctls. Suppose you have a machine with 2 interfaces, and you want
IP routing to happen on each of the two interfaces as independently as
possible. My solution involves using the "iif" modifier in your routing
rules ("ip rule" rules) to send packets to two completely different
routing tables, and making sure arp doesn't bleed across the two
interfaces. I don't know whether policy routing gives enough control to
do this in a general fashion; i did it only for very specific types of
traffic. But I suspect you could come up with something workable.
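
A rough sketch of what such a setup could look like, written as a Python helper that only builds the command strings and does not run them. The interface names, table numbers and gateway addresses are hypothetical, and arp_filter is an assumption about which "arp sysctls" are meant (the message does not say); check what your kernel version actually provides before relying on it.

```python
def per_interface_commands(ifaces):
    """Build (but do not execute) ip/sysctl commands that give each
    interface its own routing table.

    ifaces: list of dicts with the hypothetical keys 'dev', 'table'
    and 'gateway'.
    """
    cmds = []
    for iface in ifaces:
        dev, table, gw = iface["dev"], iface["table"], iface["gateway"]
        # Packets arriving on this interface consult only this table.
        cmds.append(f"ip rule add iif {dev} table {table}")
        # Each table carries its own default route.
        cmds.append(f"ip route add default via {gw} dev {dev} table {table}")
        # Answer ARP only on the interface the kernel would route through,
        # so ARP does not bleed across the two interfaces.
        cmds.append(f"sysctl -w net.ipv4.conf.{dev}.arp_filter=1")
    return cmds


if __name__ == "__main__":
    example = [
        {"dev": "eth0", "table": 100, "gateway": "192.0.2.1"},
        {"dev": "eth1", "table": 200, "gateway": "198.51.100.1"},
    ]
    for line in per_interface_commands(example):
        print(line)
```

Note that an "iif" rule only matches packets received on that interface; locally generated traffic would need its own rule, which is part of why this may not generalize.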

Jason

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: Do you know the TCP stack? (127.x.x.x routing)
@ 2005-03-09 17:33 Steve Iribarne
  2005-03-09 19:40 ` jamal
  0 siblings, 1 reply; 52+ messages in thread
From: Steve Iribarne @ 2005-03-09 17:33 UTC (permalink / raw)
  To: hadi
  Cc: Henrik Nordstrom, Martin Mares, Zdenek Radouch, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

-> Your blades --> VLANX/SubnetX
->      --> [some L3 switch]

umm.. I have a L2 switch... not L3 switch.

-> Read what i wrote again and cross reference with the diagram. ARP is
-> only L2 switched. It would be wise to configure the blade IP addresses
-> to be within the same subnet - in which case the only route you need on
-> your blades is a link scope one and perhaps a default GW pointing to
-> your L3 device.
-> 

L2 device.

> 
-> Its just a matter of time before you say "oh, thats what i do now for
-> 127.x". This is the point i have been trying to make all along.
-> 
-> > -> So tell me what i am missing!
-> > ->
-> >
-> > Experience.
-> 
-> I think you are making some very big assumption ;-> Please dont go this
-> path unless you wish to end this thread.
-> 

Ok.  I won't.

-> Btw, i do believe what you and Zdenek are trying to solve are _very_
-> different problems. He is trying to build a distributed router of some
-> form; i.e his blades are infact line-cards where traffic comes in.
-> You on the other hand seem to have the blades doing computes (i.e they
-> are not router line cards).

Nope.  I'm in a very similar situation.  I don't have "line-cards"
per-se.  Not like xDSL or the type of line cards I've worked on in the
past, but my boards **DO** have both mgmt traffic and network traffic
coming into them.  
I have signaling traffic, bearer traffic and network mgmt traffic.

Very much the same.  I have basically 5 VLANS setup.  4 of which get
tagged at the switch so that when the packets come inside the chassis, I
know how to handle them.  1 of the VLANS is for mgmt communication.
Like I said, it works great.  The 127.xx net I will NEVER need to talk
to outside of my chassis and when I do chassis to chassis redundancy I
use a different scheme.

So I will never run into the "gee, someone else is using the 127.xx net"
because as long as my applications know how to get to the 127.xx net the
traffic will be sent to the proper ports and get tagged with the proper
vlan ID.

I wish my setup was easy enough to just draw a quick picture.  But it
aint, sorry.

-> 
-> The point is this: Whatever you folks are doing, probably inherited from
-> some other projects more than likely using some other OS is not
-> necessary in Linux. I respect your desire to use those addresses if it
-> makes you comfortable - I just vehemently disagree it is needed.

Nope.  Wrote the project from scratch using Embedded Linux.

Again, if you can show me a way of doing this, I'm all ears, but so far,
you haven't shown me any other way around it.  Believe me.  I've tried
and tried to find another solution to this problem.  

And I can't emphasize enough that telling a customer, "Pick a network
address range that I can use" is NOT, I have to repeat, NOT a solution.
They will NEVER NEVER NEVER go for it.  Maybe your customers will but
mine won't.  My customers are RBOCs and the like.  Anyone dealing with
them knows what I am talking about.

-> So i hope you dont show up with the patch and ask for its inclusion.
-> 

Nope.  Never will.  I don't think the big boys (Alan Cox, Linus and the
like) would ever let it happen because there is no RFC on it.  There
will be someday.  I do know that there are ideas floating around right
now that will soon become the beginnings of the RFC.  

-> cheers,
-> jamal

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09 14:18                                         ` jamal
  2005-03-09 16:46                                           ` Jason Lunz
@ 2005-03-09 17:52                                           ` Matt Mackall
  2005-03-10  6:57                                             ` Catalin(ux aka Dino) BOIE
  1 sibling, 1 reply; 52+ messages in thread
From: Matt Mackall @ 2005-03-09 17:52 UTC (permalink / raw)
  To: jamal
  Cc: Zdenek Radouch, Henrik Nordstrom, Martin Mares, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

On Wed, Mar 09, 2005 at 09:18:10AM -0500, jamal wrote:
> On Wed, 2005-03-09 at 08:39, Zdenek Radouch wrote:
> > At 07:39 AM 3/9/05 -0500, jamal wrote:
> 
> [..]
> > Imagine a simple gateway, connecting two parts of your company
> >  - the east
> > interface connects to a corporate net with a default gateway, the west net
> > is the software dept. net.  Now imagine that you give your internal line card
> > in this simple gateway a "_whatever_" address, say 18.7.22.69.
> > Your gateway now has a route 18.7.22.69/32 -> dev linecard
> > Now please tell me what happens when a guy on the west net tries
> > to check his MIT evening class schedule.
> 
> Are we still talking about the same problem? The linecards addresses and
> interconnect interfaces are "internal". They are never advertised/seen
> outside of the chasis. So if you choose 18.7.22.69/32 to use internally
> you make sure it is never advertised to the outside world as belonging
> to you. If you have to advertise it or actually know it is used, then
> you must deal with the conflict.

Jamal, he's building a router. A router must be transparent to _all_
addresses that might be seen outside the "box". Reconfiguring such
internal details per installation is not acceptable. It would not be
ok if 18.7.22.69 mysteriously disappeared when the customer hammered
random addresses through it, even if said address was 'owned' by the
vendor. The customer might be testing their own equipment for net
deployment!

The only addresses he might not legitimately see on the wire are the
loopback ones. The routers I worked on at Cisco that had internal
networks did exactly this, by the way.
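
To make the 18.7.22.69 example concrete, here is a minimal longest-prefix-match sketch. It is only an illustration of how a /32 host route shadows the default route, not how any real router implements its FIB; the device names "east" and "linecard" are made up.

```python
import ipaddress

def lookup(routes, dst):
    """Return the device of the longest-prefix route matching dst, or None."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, dev in routes:
        net = ipaddress.ip_network(prefix)
        # Keep the matching route with the longest prefix seen so far.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, dev)
    return best[1] if best else None

# Hypothetical gateway table from the example in this thread.
routes = [
    ("0.0.0.0/0", "east"),          # default route toward the corporate net
    ("18.7.22.69/32", "linecard"),  # internal line card given a "whatever" address
]

print(lookup(routes, "18.7.22.69"))  # linecard
print(lookup(routes, "18.7.22.70"))  # east
```

Traffic from the west net to 18.7.22.69 is handed to the internal line card instead of being forwarded east, so that one address silently disappears for the customer.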

> If the router upstream from you used the same hack you end up being in
> trouble.

Uh, why? The 127 packets never leave the "box".

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09 17:33 Steve Iribarne
@ 2005-03-09 19:40 ` jamal
  0 siblings, 0 replies; 52+ messages in thread
From: jamal @ 2005-03-09 19:40 UTC (permalink / raw)
  To: Steve Iribarne
  Cc: Henrik Nordstrom, Martin Mares, Zdenek Radouch, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

On Wed, 2005-03-09 at 12:33, Steve Iribarne wrote:
> -> Your blades --> VLANX/SubnetX
> ->      --> [some L3 switch]
> 
> umm.. I have a L2 switch... not L3 switch.
> 

Lets go over this slowly so we can hopefully resolve why we dont see eye
to eye. I am not sure why i am spending all this energy on this.

Lets get the diagrams better:

1) your case:
    Your blades <--> VLANX/SubnetX
      <--> [some L2 switch]
           <--> VLANY/SubnetY <--> outside world

You probably have redundancy etc in some ATCA||2.16 setup with links going to 
two internal switches - but lets also ignore that - just assume the simple 
switch for now for sake of clarity. You may also have many VLANs in/out  like you 
said "signaling traffic, bearer traffic  and network mgmt traffic", but the 
two internal vs external interfaces  i showed above should suffice to indicate 
the general picture. Agreed?

To summarize, for you to get to/from the outside world to your blades you go 
via an L2 switch with a "few" interfaces to the outside world.
In your case the "internal" interface is the VLANX port(s) facing the switch.
The "external" interface is the port(s) on VLANY facing the outside.

2) Note this is slightly different from Zdenek, which is:

  Outside <->one or more interfaces <->  [LinecardX] <-->[switch/fabric]
  Outside <->one or more interfaces <->  [LinecardY] <-->[switch/fabric]
  .
  .

In other words each line card  has many interfaces that come into the box. 
It is not unusual to find 12-48 interface line cards.  The "switch" aka "fabric"  
connects these line cards (and perhaps some  control plane blade(s)). Typically such 
a switch will not run IP but rather some other internal thing like CSIX or SPI-x etc.

In both setups, if you do run IP internally, it does make sense not to "leak" internal 
traffic to the outside world with such addresses. 
In both cases you both try hard (and i am sure succeed) to not leak those packets 
out - In your case it's a simple separation of broadcast domains. The only way you can
get from internal to external is if in fact you have L2 connectivity between the two
(since you said you don't have L3 switching in your chassis). 

By making the 127.x routable in linux of all places - which is where i started 
disagreeing, you introduce some challenges with hope that 127.x obscurity is
going to help.

To avoid confusion, and to avoid having Zdenek respond when i am talking to you or vice versa,
let's treat the two as separate issues:

1) In your case i saw no reason for you to use 127.x - you could have achieved the 
same with 10.x. Your internal packets will never leak out. You say you will have collisions
with customer; but then if i understood correctly you said these internal packets never 
leave the box.  So my conclusion was you didn't need the hack.

2)Zdenek's case 

Just avoiding the leak is not good enough if the 127.x is routable and someone else
is using it and he has to route such packets. In such a case, even if Zdenek hides 
the internal network, at some point he will have to route a packet coming into 
linecardX, port A to linecardY, port B. 
And for this reason he can't totally avoid collision. This is why i called it survival 
via obscurity.

Note: I am not questioning his technique but i would never use it
myself. Lets say we can achieve the same goal in a different way.

> Again, if you can show me a way of doing this, I'm all ears, but so far,
> you haven't shown me any other way around it.  Believe me.  I've tried
> and tried to find another solution to this problem.  
> 

Lets talk about this when we are clear what the problem is. Fix up the
diagram above if it is wrong, then we can talk.

cheers,
jamal

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: Do you know the TCP stack? (127.x.x.x routing)
@ 2005-03-09 21:57 Steve Iribarne
  2005-03-10  0:11 ` jamal
  0 siblings, 1 reply; 52+ messages in thread
From: Steve Iribarne @ 2005-03-09 21:57 UTC (permalink / raw)
  To: hadi
  Cc: Henrik Nordstrom, Martin Mares, Zdenek Radouch, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

It's not the routing of the packet that gets screwed up.  It's the
applications that my "intra" communication uses.  

I do this...

I have a redundant system.  So two Ethernet switches that go to either a
switch/hubbed/routed network.  Not controlled by me, but by my
customers.

So you have the same three networks wired into both ends of my chassis.

------ net 1 --------|                   | ---- net 1 ----
                     |                   | 
------- net 2 -------|  Chassis 21 slots | ----- net 2 ----
                     |                   |
------- net 3 -------|                   | ---- net 3 ----

all three of those "outside" nets get to me by either a bridge, router
or hub.

My 19 internal boards need to talk to each other ALONG with talking to
the outside world.

Boards in slot 1/21 are the switches.
so boards 2-20 are my linux blades that talk to each other.

The switch is configured to have the VLANS.  Management traffic I tag on
a VLAN.  So when my host controller or any of the other linux blades
need to do host communication, they talk to ip address 127.100.xx.xx
which is associated with a VLAN tagged interface.

Traffic being sent to the outside world is tagged as it comes in from
the outside world (so I know where it came from), and sent to the proper
board.  
L2 switching stuff.

Traffic that I send back out to the outside world is tagged when it
leaves one of my blades so the switch knows which port to send it back
out on.  (net 1, net 2 or net 3)
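
For readers less familiar with the tagging being described, here is a small sketch of what an 802.1Q tag looks like on the wire. The VLAN IDs are hypothetical (the actual IDs used in this chassis are not given); the layout itself, a 0x8100 TPID followed by a 16-bit TCI holding priority, DEI and a 12-bit VLAN ID, is the standard one.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q-tagged frame

def vlan_tag(vid: int, pcp: int = 0, dei: int = 0) -> bytes:
    """Build the 4-byte 802.1Q tag: TPID, then PCP(3) | DEI(1) | VID(12)."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= pcp <= 7 or dei not in (0, 1):
        raise ValueError("PCP is 3 bits, DEI is 1 bit")
    tci = (pcp << 13) | (dei << 12) | vid
    return struct.pack("!HH", TPID_8021Q, tci)

def parse_vid(tag: bytes) -> int:
    """Extract the VLAN ID from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci & 0x0FFF

# Hypothetical IDs: one management VLAN plus one VLAN per outside net.
MGMT_VLAN, NET1_VLAN = 100, 101
print(vlan_tag(MGMT_VLAN).hex())
print(parse_vid(vlan_tag(NET1_VLAN, pcp=5)))
```

The switch only needs the VLAN ID to decide which port a frame belongs to, which is how the internal management traffic stays separate from net 1, net 2 and net 3.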

Ok.. that being said...

The _only_ way I can have normal applications (e.g. ping, telnet, nfs)
work and _guarantee_ no intra communication problems is if I use the
127.xx net.  

I'm not sure what you are not getting.  I'm not talking about basic
routing.  I'm talking about getting applications not to collide.

Let me give you an example.

Say I were to use the 10.100.xx.xx network.  I have an snmp
master-agent/sub-agent configuration.  So I have a host controller with
the 10.100.0.1 address and my subagents are 10.100.0.<slotnum>.

Everything works great, everyone is happy, until someone from the
outside world (say net 2) tries to telnet to me with a host address of
say 10.100.0.73.  Well, my host controller will route that packet onto
my private network.  So when I go to respond to the telnet request I
will tag it for my internal network because that is what the FIB routing
tells it to do.
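
The collision in this example can be checked in a couple of lines. This is only a sketch: the prefix lengths (/24 for the 10.100.0.x internal net, /16 for a 127.100.x.x internal net) are assumptions for illustration, since the thread does not state them here.

```python
import ipaddress

def collides(internal_net: str, outside_host: str) -> bool:
    """True if an outside host's address falls inside the internal subnet,
    i.e. replies to it would be routed onto the internal network."""
    return ipaddress.ip_address(outside_host) in ipaddress.ip_network(internal_net)

# Internal net on 10.100.0.x: the telnet client at 10.100.0.73 collides.
print(collides("10.100.0.0/24", "10.100.0.73"))   # True

# Internal net inside 127/8: the same client no longer collides.
print(collides("127.100.0.0/16", "10.100.0.73"))  # False
```

With the internal net inside 127/8, no address an outside host can legitimately use will match it, which is the property being relied on here.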

There it is.  I'm not going to spend any more time on this, and neither
should you.  Like I said, I've been doing this for a darn long time, and
I have yet to see anyone who can make this problem just work,
other than the way I did it (I along with many others).

have a happy day.

-stv

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09 12:39                                     ` jamal
  2005-03-09 13:39                                       ` Zdenek Radouch
@ 2005-03-09 22:34                                       ` Henrik Nordstrom
  2005-03-10  1:47                                         ` Jamie Lokier
  1 sibling, 1 reply; 52+ messages in thread
From: Henrik Nordstrom @ 2005-03-09 22:34 UTC (permalink / raw)
  To: jamal
  Cc: Martin Mares, Zdenek Radouch, Steve Iribarne, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

On Wed, 9 Mar 2005, jamal wrote:

> I am afraid this 127.x panacea is beginning to sound like the tale of
> some insane emperor who was naked but people around him sucking up to
> him telling him how fine his clothes looked. I am having a very hard
> time seeing the rationale - in fact it's driving me nuts, so please bear
> with me.

yes, if it wasn't for routing conflicts when a node is in both the 
external and the chassis network.

If Linux could manage different IP stacks per interface this would not be 
a problem, but as it is today the same IP stack is used for all interfaces 
making dual homing (not routing) a bit troublesome when the same addresses 
may be in both networks..

> within customer's  network. To quote Zdenek:
> You couldn't walk in the NOC and tell them: "You can't use the 10.x
> net to manage your equipment - my box is already using that net".
> Conclusion:
> You walk into the NOC and say "can i use 10.0.0.x/22 subnet" they say "no
> thats going to collide use 10.0.0.0/28"
> Summary: You may need to go to your box and reconfigure its external looking
> addresses.

Yes.

But also its internal ones, in order to maintain any sanity in the nodes 
connected to both worlds.

> a') Using 127.x addresses. You -> NOC "can i use 127.0.0.x/22 subnet"
> they say either "sorry, our routers cant route 127.x" or "no Zdenek
> was here before you, thats going to collide use 127.0.0.0/28"

This has never been a topic. The use of the 127.X addresses is purely for 
intra-chassis communication, never visible outside of the chassis.

By using the 127.X addresses for intra-chassis communication you are 
guaranteed that there are never conflicts with addresses used on the LAN, 
and the routing tables of the chassis nodes which need to speak to both 
worlds can be kept sane without requiring configuration of a 
chassis-internal network that is not even visible to the administrator.

The available choices are in reality (not exclusive, pick any number)

a) Ask IANA for an address block for intra-chassis communication and hope 
the local LAN is not abusing these addresses.

b) use the IPv4 link-local address block already registered, with the 
condition that this is not used on the local LAN to avoid collisions.

c) Use 127.X. Ensures that no matter what the local LAN looks like 
behaviour of the externally connected nodes will be what is expected (can 
handle any valid IP, except for 127.X which they are not expected to 
handle)

d) Use a configurable address block.

e) Reimplement the IP stack in Linux to have separate IP stacks per 
interface, and modify all applications to not only specify the destination 
IP but also which interface to use when talking to either the internal 
network or external world.

f) Not use IPv4 for the intra-chassis communication.
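
The collision risk behind options (a) through (d) can be checked mechanically. What follows is only a minimal sketch in Python: the LAN prefixes and candidate blocks are hypothetical values picked for illustration, not taken from any real deployment. It reports which candidate chassis blocks overlap a network the box may also have to reach externally. That 127.X comes out clean rests on the assumption stated in (c), namely that externally connected nodes are never expected to handle 127.X.

```python
import ipaddress

def colliding_blocks(candidates, lan_networks):
    """Return the candidate chassis blocks that overlap any LAN network."""
    lans = [ipaddress.ip_network(n) for n in lan_networks]
    collisions = []
    for cand in candidates:
        block = ipaddress.ip_network(cand)
        if any(block.overlaps(lan) for lan in lans):
            collisions.append(cand)
    return collisions

# Hypothetical customer LAN prefixes (illustrative only).
lan = ["10.0.0.0/8", "169.254.0.0/16", "192.168.0.0/24"]

# Hypothetical candidates for the chassis-internal network:
# a private block, the IPv4 link-local block (option b), and 127.X (option c).
candidates = ["10.100.0.0/24", "169.254.0.0/16", "127.1.0.0/16"]

print(colliding_blocks(candidates, lan))
```

With these made-up inputs only the 127.X candidate survives, which is the argument for (c); a site that really does route 127.X would break that assumption.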

Regards
Henrik

* Re: Do you know the TCP stack? (127.x.x.x routing)
@ 2005-03-09 23:51 Boian Bonev
  2005-03-10  0:23 ` Jason Lunz
  0 siblings, 1 reply; 52+ messages in thread
From: Boian Bonev @ 2005-03-09 23:51 UTC (permalink / raw)
  To: Jason Lunz, netdev; +Cc: linux-net

> The question, though, is: *How* do you configure the nodes within the
> chassis such that the internal IPs (whatever they are) _stay_ internal,
> and any non-127/8 addressing can be used for the external interfaces?
> 
> I've done something similar, for example, using policy routing and the
> arp sysctls. Suppose you have a machine with 2 interfaces, and you want
> IP routing to happen on each of the two interfaces as independently as
> possible. My solution involves using the "iif" modifier in your routing
> rules ("ip rule" rules) to send packets to two completely different
> routing tables, and making sure arp doesn't bleed across the two
> interfaces. I don't know whether policy routing gives enough control to
> do this in a general fashion; i did it only for very specific types of
> traffic. But I suspect you could come up with something workable.

you can do that but you omit the interface addresses - suppose ext net is 10.20.10.1/24,
internal is 10.10.10.1/24, no matter what routing policies and rules you put, both interface
ips will be visible from both interfaces. now imagine you have another external net 10.30.10.1/24
and customer wants to route e.g. 10.10.0.0/16 from 10.20.10.1/24 via 10.30.10.5...
at least traffic for host 10.10.10.1 will not be routed but will arrive locally at your blade host

btw. i have seen recently on iptables' patch-o-matic some module that could conditionally route
traffic for local addresses to another host. anyway the whole thing is doable with any kind of
addresses but just imagine what a nightmare startup ruleset you will have on each box; then
modify your custom rules to conform to that hell of a ruleset and... imho it will be much easier to
create a custom transport over ethernet (in case your ext network could share addresses from
that protocol also) and forget about ipv4 for internal implementation. thus you'd have at least
better security between worlds ;-)

b.

* RE: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09 21:57 Steve Iribarne
@ 2005-03-10  0:11 ` jamal
  0 siblings, 0 replies; 52+ messages in thread
From: jamal @ 2005-03-10  0:11 UTC (permalink / raw)
  To: Steve Iribarne
  Cc: Henrik Nordstrom, Martin Mares, Zdenek Radouch, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

On Wed, 2005-03-09 at 16:57, Steve Iribarne wrote:

> There it is.  I'm not going to spend anymore time on this, and neither
> should you.  Like I said, I've been doing this for a darn long time, and
> I have, as yet, to see anyone who can make this problem just work.
> Other than the way I did it.  (I along with many others)
> 

Lets end this thread - we are clearly never gonna get to a consensus.

Whoever asked the question on how to expose loopback got their answer
and you say you are not pushing the patch - so we are all one big happy
family.

cheers,
jamal

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09 23:51 Do you know the TCP stack? (127.x.x.x routing) Boian Bonev
@ 2005-03-10  0:23 ` Jason Lunz
  0 siblings, 0 replies; 52+ messages in thread
From: Jason Lunz @ 2005-03-10  0:23 UTC (permalink / raw)
  To: Boian Bonev; +Cc: netdev, linux-net

On Thu, Mar 10, 2005 at  1:51AM +0200, Boian Bonev wrote:
> you can do that but you omit the interface addresses - suppose ext net
> is 10.20.10.1/24, internal is 10.10.10.1/24, no matter what routing
> policies and rules you put, both interface ips will be visible from
> both interfaces.

What do you mean by "visible"? If you're referring to arp, the arp
sysctls are probably adequate, and there's arpfilter if not.

> now imagine you have another external net 10.30.10.1/24 and customer
> wants to route e.g. 10.10.0.0/16 from 10.20.10.1/24 via 10.30.10.5...
> at least host 10.10.10.1 will not route but arrive locally to your
> blade host

Not if you take 10.10.10.1 out of the "local" routing table, and policy
route that traffic only through tables that don't consider 10.10.10.1
local. I'm not saying it's trivial, but if you set your rules up right,
you can make some packets be routed by *completely different routing
tables* than others.
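
To make the idea concrete, here is a toy model in Python of how rule order and per-interface tables change the outcome. It is not the kernel's implementation. The interface names (eth_ext, eth_int), the table names, and the route entries are hypothetical, built from the addresses in the example above.

```python
import ipaddress

def lookup(table, dst):
    """Longest-prefix match in one routing table; None if nothing matches."""
    dst = ipaddress.ip_address(dst)
    best = None
    for prefix, action in table:
        net = ipaddress.ip_network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, action)
    return best[1] if best else None

def route(rules, tables, iif, dst):
    """Walk the rules in order; the first table that yields a route wins."""
    for rule_iif, table_name in rules:
        if rule_iif is None or rule_iif == iif:
            action = lookup(tables[table_name], dst)
            if action is not None:
                return action
    return "unreachable"

# Hypothetical tables for the blade host.
tables = {
    "local": [("10.10.10.1/32", "deliver locally")],
    "main": [("10.10.0.0/16", "forward via 10.30.10.5")],
    "ext": [("10.10.0.0/16", "forward via 10.30.10.5")],
    "internal_local": [("10.10.10.1/32", "deliver locally")],
}

# Default setup: every packet consults "local" first, so 10.10.10.1 is ours.
default_rules = [(None, "local"), (None, "main")]

# Split setup: 10.10.10.1 is only considered local for the internal interface.
split_rules = [("eth_ext", "ext"), ("eth_int", "internal_local"), (None, "main")]

print(route(default_rules, tables, "eth_ext", "10.10.10.1"))
print(route(split_rules, tables, "eth_ext", "10.10.10.1"))
print(route(split_rules, tables, "eth_int", "10.10.10.1"))
```

In this model the same destination is delivered locally under the default rules but forwarded when it arrives on the external interface under the split rules. Locally generated traffic and arp are left out of the sketch entirely.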

Jason

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09 22:34                                       ` Henrik Nordstrom
@ 2005-03-10  1:47                                         ` Jamie Lokier
  0 siblings, 0 replies; 52+ messages in thread
From: Jamie Lokier @ 2005-03-10  1:47 UTC (permalink / raw)
  To: Henrik Nordstrom
  Cc: jamal, Martin Mares, Zdenek Radouch, Steve Iribarne, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

Henrik Nordstrom wrote:
> If Linux could manage different IP stacks per interface this would not be 
> a problem, but as it is today the same IP stack is used for all interfaces 
> making dual homing (not routing) a bit troublesome when the same addresses 
> may be in both networks..

Indeed, I have exactly the same problem with a device that must
simultaneously connect to:

     - the local customer-site ethernet
     - the local customer-site 802.11 wireless

and auto-configure both interfaces using DHCP to connect to hosts on
the internet as best as possible through all available interfaces.
There is absolutely no guarantee that I won't see a network or even
address conflict on the two interfaces, as they may be _separate_
networks each behind a NAT to the outside world over ADSL.

In fact, it's quite likely that DHCP for each interface will provide a
192.168.0.0/24 address, as that seems to be the typical setup of both
kinds of ADSL NAT router...

Any suggestion of asking customer-site to specially configure their
network rather defeats the point, which is a device which
automatically tries available connections, using DHCP, and routes its
traffic over whichever one works best at any time.

-- Jamie

* RE: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09 15:01 Steve Iribarne
  2005-03-09 16:00 ` jamal
@ 2005-03-10  6:48 ` Catalin(ux aka Dino) BOIE
  1 sibling, 0 replies; 52+ messages in thread
From: Catalin(ux aka Dino) BOIE @ 2005-03-10  6:48 UTC (permalink / raw)
  To: Steve Iribarne
  Cc: hadi, Henrik Nordstrom, Martin Mares, Zdenek Radouch, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

> 1st bug:  Customer had the same 10.100.xx.xx/24 net that I had and my
> inter-system communication wouldn't work, because all my routes got
> screwed up.  (i.e the SNMP sub-agents couldn't talk with the master).
>
> 1st response to bug:  Well can you use another network address range?
> Customer response:  Hell no.
>
> Solution to bug1:  Easy, let the user configure the mgmt network ip
> address.
> Customers answer to bug1 solution:  Get the hell out of here; you don't
> do out-of-band mgmt.  Do you know what a security risk this is for me?
> Blah blah blah....  Even though all inter-chassis communication was done
> securely, I couldn't convince them. I had a customer boot me out of his
> office and boot our company out **because** of my design.  Not a good
> feeling.

You say that a client will not allow you to use net 10.
OK, but, the same client would not allow you to use 127/8 because they use 
it!
What I'm saying is that 10.0.0.0/8 and 127.0.0.0/8 are the same. The 
customer can use them.
You assume that the client will not use 127/8. Why? This is wrong.
You can use it, the client can use it.

---
Catalin(ux aka Dino) BOIE
catab at deuroconsult.ro
http://kernel.umbrella.ro/

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09 17:52                                           ` Matt Mackall
@ 2005-03-10  6:57                                             ` Catalin(ux aka Dino) BOIE
  0 siblings, 0 replies; 52+ messages in thread
From: Catalin(ux aka Dino) BOIE @ 2005-03-10  6:57 UTC (permalink / raw)
  To: Matt Mackall
  Cc: jamal, Zdenek Radouch, Henrik Nordstrom, Martin Mares, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

>
> Uh, why? The 127 packets never leave the "box".

So you are allowed to use 127/8 but your client cannot break this rule. 
Why?

---
Catalin(ux aka Dino) BOIE
catab at deuroconsult.ro
http://kernel.umbrella.ro/

* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-09 16:46                                           ` Jason Lunz
@ 2005-03-10 10:10                                             ` Henrik Nordstrom
  0 siblings, 0 replies; 52+ messages in thread
From: Henrik Nordstrom @ 2005-03-10 10:10 UTC (permalink / raw)
  To: Jason Lunz; +Cc: linux-net, netdev

On Wed, 9 Mar 2005, Jason Lunz wrote:

> interfaces. I don't know whether policy routing gives enough control to
> do this in a general fashion;

Not easily. The problem is to determine which path packets you send should 
take if the destination is on both sides.

You can do some magic with connection tracking and CONNMARK and magic 
routing tables in front of the local table and disabling the martian 
checks, but not everything can be solved in this manner.

Regards
Henrik

* RE: Do you know the TCP stack? (127.x.x.x routing)
@ 2005-03-10 14:35 Steve Iribarne
  2005-03-10 14:49 ` Dmitry Torokhov
  0 siblings, 1 reply; 52+ messages in thread
From: Steve Iribarne @ 2005-03-10 14:35 UTC (permalink / raw)
  To: Catalin(ux aka Dino) BOIE
  Cc: hadi, Henrik Nordstrom, Martin Mares, Zdenek Radouch, Eran Mann,
	Thomas Graf, Andi Kleen, netdev, linux-net

-> You say that a client will not allow you to use net 10.
-> OK, but, the same client would not allow you to use 127/8 because
-> they use it!
-> What I'm saying is that 10.0.0.0/8 and 127.0.0.0/8 are the same. The
-> customer can use them.
-> You assume that the client will not use 127/8. Why? This is wrong.
-> You can use it, the client can use it.
-> 

No.  If the client uses 127/8 on a linux box, it is just a loopback and
will never go out on the wire and the applications (e.g. telnet, ftp,
ping, whatever) will just loop back.


* Re: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-10 14:35 Steve Iribarne
@ 2005-03-10 14:49 ` Dmitry Torokhov
  0 siblings, 0 replies; 52+ messages in thread
From: Dmitry Torokhov @ 2005-03-10 14:49 UTC (permalink / raw)
  To: Steve Iribarne
  Cc: Catalin(ux aka Dino) BOIE, hadi, Henrik Nordstrom, Martin Mares,
	Zdenek Radouch, Eran Mann, Thomas Graf, Andi Kleen, netdev,
	linux-net

On Thu, 10 Mar 2005 06:35:43 -0800, Steve Iribarne
<steve.iribarne@dilithiumnetworks.com> wrote:
> -> You say that a client will not allow you to use net 10.
> -> OK, but, the same client would not allow you to use 127/8 because
> -> they use it!
> -> What I'm saying is that 10.0.0.0/8 and 127.0.0.0/8 are the same. The
> -> customer can use them.
> -> You assume that the client will not use 127/8. Why? This is wrong.
> -> You can use it, the client can use it.
> ->
> 
> No.  If the client uses 127/8 on a linux box, it is just a loopback and
> will never go out on the wire and the applications (e.g. telnet, ftp,
> ping, whatever) will just loop back.
> 

This assumes that the client did not apply the same hack you did.

-- 
Dmitry

* RE: Do you know the TCP stack? (127.x.x.x routing)
@ 2005-03-10 15:04 Steve Iribarne
  2005-03-10 15:25 ` Catalin(ux aka Dino) BOIE
  0 siblings, 1 reply; 52+ messages in thread
From: Steve Iribarne @ 2005-03-10 15:04 UTC (permalink / raw)
  To: dtor_core
  Cc: Catalin(ux aka Dino) BOIE, hadi, Henrik Nordstrom, Martin Mares,
	Zdenek Radouch, Eran Mann, Thomas Graf, Andi Kleen, netdev,
	linux-net

-> This assumes that the client did not apply the same hack you did.
-> 

They wouldn't do it on the same machine or system. I control my internal
system.  It's an embedded system which means I have complete control.
If they applied it somewhere else on the network, I wouldn't receive any
of those packets: the router would drop them, my patch drops 127.xxx
packets that are not received on the proper VLAN, and my system doesn't
accept tagged packets from the outside world.
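
The receive-side checks described above can be modeled roughly as follows. This is a sketch in Python of the stated behavior only, not the actual kernel patch. The VLAN ID, the function name accept, and its parameters are all hypothetical.

```python
import ipaddress

LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")
INTERNAL_VLAN = 4000  # hypothetical VLAN ID reserved for the chassis network

def accept(src, dst, vlan, from_external_port):
    """Return True if a received packet should be accepted.

    vlan is None for untagged frames.
    """
    # Tagged frames are never accepted from the outside world.
    if from_external_port and vlan is not None:
        return False
    uses_127 = any(ipaddress.ip_address(a) in LOOPBACK_NET for a in (src, dst))
    # 127.x traffic is only valid when it arrives on the internal VLAN.
    if uses_127 and vlan != INTERNAL_VLAN:
        return False
    return True

# Internal chassis traffic on the proper VLAN is accepted.
print(accept("127.1.0.2", "127.1.0.3", INTERNAL_VLAN, False))
# Untagged 127.x traffic arriving from outside is dropped.
print(accept("127.1.0.2", "127.1.0.3", None, True))
```

Under these assumptions a 127.x packet from the customer network is rejected whether it arrives untagged or with a forged tag, while ordinary external traffic is unaffected.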

I don't think I'm the only one who has done this.  As a matter of fact,
I KNOW I'm not.  When I worked on BSD years ago (10+), I worked for a
company who did the same sort of hack.

-stv

* RE: Do you know the TCP stack? (127.x.x.x routing)
  2005-03-10 15:04 Steve Iribarne
@ 2005-03-10 15:25 ` Catalin(ux aka Dino) BOIE
  0 siblings, 0 replies; 52+ messages in thread
From: Catalin(ux aka Dino) BOIE @ 2005-03-10 15:25 UTC (permalink / raw)
  To: Steve Iribarne
  Cc: dtor_core, hadi, Henrik Nordstrom, Martin Mares, Zdenek Radouch,
	Eran Mann, Thomas Graf, Andi Kleen, netdev, linux-net

> -> This assumes that the client did not apply the same hack you did.

:) Exactly!

> They wouldn't do it on the same machine or system. I control my internal
> system.  It's an embedded system which means I have complete control.
> If they applied it somewhere else on the network, I wouldn't receive any
> of those packets because the router would drop it and my patch drops the
> 127.xxx rcv'd packets if not received on the proper VLAN and my system
> doesn't accept tagged packets from the outside world.
>
> I don't think I'm the only one who has done this.  As a matter of fact,
> I KNOW I'm not.  When I worked on BSD years ago (10+), I worked for a
> company who did the same sort of hack.

I think that the best way is to choose a 10.0.0.0/24|16|8 net and let the 
client configure another net if it already uses the same net somewhere.

I agree that 127 is a lot less used than 10, but you never know.

You will probably choose 127 anyway, but I suggest giving the user the 
option to choose another range if he wants.

---
Catalin(ux aka Dino) BOIE
catab at deuroconsult.ro
http://kernel.umbrella.ro/

end of thread, other threads:[~2005-03-10 15:25 UTC | newest]

Thread overview: 52+ messages
2005-03-09 23:51 Do you know the TCP stack? (127.x.x.x routing) Boian Bonev
2005-03-10  0:23 ` Jason Lunz
  -- strict thread matches above, loose matches on Subject: below --
2005-03-10 15:04 Steve Iribarne
2005-03-10 15:25 ` Catalin(ux aka Dino) BOIE
2005-03-10 14:35 Steve Iribarne
2005-03-10 14:49 ` Dmitry Torokhov
2005-03-09 21:57 Steve Iribarne
2005-03-10  0:11 ` jamal
2005-03-09 17:33 Steve Iribarne
2005-03-09 19:40 ` jamal
2005-03-09 15:01 Steve Iribarne
2005-03-09 16:00 ` jamal
2005-03-10  6:48 ` Catalin(ux aka Dino) BOIE
2005-03-08 15:07 Steve Iribarne
2005-03-06  2:20 Zdenek Radouch
2005-03-06  9:56 ` Martin Mares
2005-03-06 17:01   ` Zdenek Radouch
2005-03-06 17:12     ` alex
2005-03-06 17:31     ` Thomas Graf
2005-03-06 19:48       ` Zdenek Radouch
2005-03-06 20:19         ` alex
2005-03-06 20:19         ` Andi Kleen
2005-03-06 20:45           ` Thomas Graf
2005-03-06 21:30             ` Andi Kleen
2005-03-06 21:50               ` Thomas Graf
2005-03-06 21:50             ` Zdenek Radouch
2005-03-07  7:01               ` Sumit Pandya
2005-03-07  8:05               ` Eran Mann
2005-03-07 12:14                 ` jamal
2005-03-07 23:50                 ` jamal
2005-03-08  3:15                   ` Zdenek Radouch
2005-03-08 13:34                     ` jamal
2005-03-08 13:51                       ` Martin Mares
2005-03-08 13:58                         ` jamal
2005-03-08 14:03                           ` Martin Mares
2005-03-08 14:17                             ` jamal
2005-03-08 14:20                               ` Martin Mares
2005-03-08 18:40                               ` Henrik Nordstrom
2005-03-08 21:17                                 ` jamal
2005-03-09  9:09                                   ` Henrik Nordstrom
2005-03-09 12:39                                     ` jamal
2005-03-09 13:39                                       ` Zdenek Radouch
2005-03-09 14:18                                         ` jamal
2005-03-09 16:46                                           ` Jason Lunz
2005-03-10 10:10                                             ` Henrik Nordstrom
2005-03-09 17:52                                           ` Matt Mackall
2005-03-10  6:57                                             ` Catalin(ux aka Dino) BOIE
2005-03-09 22:34                                       ` Henrik Nordstrom
2005-03-10  1:47                                         ` Jamie Lokier
2005-03-08 18:34                       ` Henrik Nordstrom
2005-03-09  5:33                       ` Zdenek Radouch
2005-03-08 14:02                     ` Thomas Graf
