public inbox for linux-kernel@vger.kernel.org
* Strange Problems with ARP and Linux
@ 2006-01-17 12:20 Richard Mueller
  2006-01-17 13:28 ` Mathieu Chouquet-Stringer
  0 siblings, 1 reply; 2+ messages in thread
From: Richard Mueller @ 2006-01-17 12:20 UTC
  To: linux-kernel

Hi... (the same was posted to linux-net yesterday)

I experienced some strange behaviour with Linux and the ARP protocol.

1.) Kernel-Version: 2.6.11.7 plus grsec-patches

2.) Setup:

         +--------+
         | Router |
         +---+----+
             |
             |
      +------+-------+
      |              |
      |              | Transitnet for
      |              | Cluster/Router
 +----+-----+  +-----+------+
 | Primary  |  | Secondary  |
 +----+-----+  +-----+------+
      |              |
      |              | LAN
      +--------------+

Router:    C2600 router from ISP

Primary:   First (active) Linux router
Secondary: Second (standby) Linux router

Primary and Secondary are configured as a cluster
with the heartbeat package.

The cluster shares an IP-Alias in the transit net and
many IPs in the LAN segments. Each IP-Alias is bound
to exactly one node at any time.

The following IPs and MACs are used for this example:

transit-net:
Router:    10.0.0.1/24  | 00:10:F3:09:10:70
Primary:   10.0.0.10/24 | 00:10:F3:09:11:71
Secondary: 10.0.0.11/24 | 00:10:F3:09:12:72
IP-Alias:  10.0.0.20/24 | depends on where it is bound

lan:
Primary:   10.1.0.10/24 | 00:10:F3:10:11:71
Secondary: 10.1.0.11/24 | 00:10:F3:10:12:72
IP-Alias:  10.1.0.20/24 | depends on where it is bound

3.) The Problem

At first everything works fine. If I fail the primary node,
the secondary takes over. The ARP entries change to the MAC
of the secondary, and everything is fine.

The trouble starts when you ping/ssh/whatever the shared IP-Alias
in the LAN from the networks behind the C2600:

I.  The C2600 is able to deliver the IP packet to the node because
    it has a valid ARP entry.

II. The Linux machine (secondary) does not have any ARP entries
    (because it was inactive for a while), so it has to initiate
    ARP before it can deliver the reply IP packet.

Then IT HAPPENS:

The Linux Box asks in the transit net:

0.000000 10.1.0.20 -> Broadcast    ARP Who has 10.0.0.1?  Tell 10.1.0.20

Why does Linux make ARP-requests with SRC-IPs from a different subnet?
This can't be the expected behaviour... :(
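
The request above is consistent with how the arp_announce sysctl described in ip-sysctl.txt picks the sender address: level 0 reuses the source of the IP packet that triggered the resolution, level 1 reuses it only if it shares a subnet with the target on the outgoing interface, and level 2 always picks an address from the outgoing interface. A minimal C sketch of that selection follows. It is a simplified userspace model, not the kernel's code; `struct if_addr`, `arp_source` and the helper names are hypothetical, and the packet source is assumed to be a local address.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model of one address on an interface (host byte order). */
struct if_addr {
    uint32_t addr;
    int prefix_len;
};

/* Build a host-order IPv4 address from four octets. */
uint32_t ip4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

uint32_t prefix_mask(int prefix_len)
{
    return prefix_len <= 0 ? 0u : (uint32_t)(0xFFFFFFFFu << (32 - prefix_len));
}

/* Does ip fall inside the subnet of this interface address? */
int in_subnet(const struct if_addr *ifa, uint32_t ip)
{
    uint32_t mask = prefix_mask(ifa->prefix_len);
    return (ifa->addr & mask) == (ip & mask);
}

/* Prefer an interface address whose subnet contains the target. */
uint32_t select_onlink_addr(const struct if_addr *addrs, size_t n,
                            uint32_t target)
{
    for (size_t i = 0; i < n; i++)
        if (in_subnet(&addrs[i], target))
            return addrs[i].addr;
    return n > 0 ? addrs[0].addr : 0;
}

/*
 * Choose the sender IP of an ARP request.
 * pkt_saddr: source of the IP packet waiting for resolution (assumed local).
 * dev_addrs: addresses configured on the outgoing interface.
 */
uint32_t arp_source(int arp_announce, uint32_t pkt_saddr,
                    const struct if_addr *dev_addrs, size_t n,
                    uint32_t target)
{
    uint32_t saddr = 0;

    switch (arp_announce) {
    default:
    case 0: /* announce any local IP, even one from another subnet */
        saddr = pkt_saddr;
        break;
    case 1: /* keep the packet source only if it is on the target's subnet */
        for (size_t i = 0; i < n; i++) {
            if (in_subnet(&dev_addrs[i], target) &&
                in_subnet(&dev_addrs[i], pkt_saddr)) {
                saddr = pkt_saddr;
                break;
            }
        }
        break;
    case 2: /* ignore the packet source, always select from the interface */
        break;
    }

    if (saddr == 0)
        saddr = select_onlink_addr(dev_addrs, n, target);
    return saddr;
}
```

With the secondary's transit interface holding 10.0.0.11/24 and 10.0.0.20/24, a reply packet sourced from 10.1.0.20 and a target of 10.0.0.1, this model yields 10.1.0.20 at level 0 and 10.0.0.11 at levels 1 and 2.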

BTW:
The C2600 is "smart" enough to put an entry
"10.1.0.20 -> 00:10:F3:09:12:72"
in its ARP cache, based on this single ARP broadcast
from 10.1.0.20, and after a failback to the primary nobody can reach
10.1.0.20... :-)
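
The cache entry described above is what the packet-reception algorithm in RFC 826 produces: a receiver that is the target of a request stores the sender's IP/MAC pair, and the algorithm itself does not check whether the sender IP belongs to the local subnet. The following C sketch illustrates that rule only. It is not the C2600's actual implementation, and `struct arp_cache`, `arp_receive` and `arp_lookup` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARP_CACHE_SIZE 8
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

struct arp_entry {
    int used;
    uint32_t ip;
    uint8_t mac[6];
};

struct arp_cache {
    struct arp_entry entries[ARP_CACHE_SIZE];
};

/* Return the cached MAC for ip, or NULL if there is no entry. */
const uint8_t *arp_lookup(const struct arp_cache *cache, uint32_t ip)
{
    for (int i = 0; i < ARP_CACHE_SIZE; i++)
        if (cache->entries[i].used && cache->entries[i].ip == ip)
            return cache->entries[i].mac;
    return NULL;
}

/*
 * Process the sender/target fields of a received ARP packet.
 * Returns 1 if the sender mapping was stored or refreshed, 0 otherwise.
 * Note: no check that sender_ip is on the same subnet as my_ip.
 */
int arp_receive(struct arp_cache *cache, uint32_t my_ip,
                uint32_t sender_ip, const uint8_t sender_mac[6],
                uint32_t target_ip)
{
    int merged = 0;

    /* Step 1: refresh an existing entry for the sender, if any. */
    for (int i = 0; i < ARP_CACHE_SIZE; i++) {
        if (cache->entries[i].used && cache->entries[i].ip == sender_ip) {
            memcpy(cache->entries[i].mac, sender_mac, 6);
            merged = 1;
        }
    }

    /* Step 2: packets for other targets teach us nothing new. */
    if (target_ip != my_ip)
        return merged;

    /* Step 3: we are the target, so learn the sender if still unknown. */
    if (!merged) {
        for (int i = 0; i < ARP_CACHE_SIZE; i++) {
            if (!cache->entries[i].used) {
                cache->entries[i].used = 1;
                cache->entries[i].ip = sender_ip;
                memcpy(cache->entries[i].mac, sender_mac, 6);
                return 1;
            }
        }
        return 0; /* cache full */
    }
    return merged;
}
```

Feeding it the request from the capture (sender 10.1.0.20 with the secondary's transit MAC, target 10.0.0.1) on a cache owned by 10.0.0.1 creates exactly the entry quoted above.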


4.) Solution: Dirty Userspace Fix
  Ping the C2600 from the primary/secondary continuously.
  A ping group in heartbeat does the same.
  This can't be the real solution... ;-)

5.) Solution: Dirty Kernel-Patch
  With my skillful hands I wrote a dirty hack:
<patch>
--- arp.c       Fri Jan 13 16:44:06 2006
+++ arp.c.new   Fri Jan 13 16:43:52 2006
@@ -342,9 +342,9 @@
   switch (IN_DEV_ARP_ANNOUNCE(in_dev)) {
   default:
   case 0:              /* By default announce any local IP */
-       if (skb && inet_addr_type(skb->nh.iph->saddr) == RTN_LOCAL)
+       /* if (skb && inet_addr_type(skb->nh.iph->saddr) == RTN_LOCAL)
         saddr = skb->nh.iph->saddr;
-       break;
+       break; */
   case 1:              /* Restrict announcements of saddr in same subnet */
        if (!skb)
         break;
</patch>

6.) Solution: Clean Kernel-Patch
  Can anybody improve the patch above into a clean one so that it finds
  its way into the vanilla kernel?


bye
richard

-- 
Richard Müller
Geschäftsführer Technik

team(ix) GmbH
Powering Enterprise Linux Networks
Südwestpark 35
90449 Nürnberg

fon:   +49 (911) 30999- 0
fax:   +49 (911) 30999-99
mail:  rm@teamix.de
web:   http://www.teamix.de
vcf:   http://www.teamix.de/vcf/rm.vcf
gpg:   296C 0BAF 8FC8 DCE2 99BD
       5777 FA73 ECDC F9F1 8FF7



* Re: Strange Problems with ARP and Linux
  2006-01-17 12:20 Strange Problems with ARP and Linux Richard Mueller
@ 2006-01-17 13:28 ` Mathieu Chouquet-Stringer
  0 siblings, 0 replies; 2+ messages in thread
From: Mathieu Chouquet-Stringer @ 2006-01-17 13:28 UTC
  To: Richard Mueller; +Cc: linux-kernel

mueller@teamix.net (Richard Mueller) writes:
> Hi... (the same was posted to linux-net yesterday)
> [..]

It's the expected behavior. Look for arp_ignore and arp_announce in
linux/Documentation/networking/ip-sysctl.txt
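
For illustration, the two sysctls can be set by writing to their files under /proc/sys/net/ipv4/conf/. The C sketch below does that. The values are an assumption and were not suggested in this thread: arp_announce=2 makes the kernel always pick a sender address from the outgoing interface, and arp_ignore=1 makes it reply only for addresses configured on the incoming interface. `write_sysctl` and `apply_arp_settings` are hypothetical helper names.

```c
#include <stdio.h>

/* Write value to a sysctl file; returns 0 on success, -1 on failure. */
int write_sysctl(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (f == NULL)
        return -1;
    ok = (fputs(value, f) != EOF);
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Apply the assumed settings below conf_dir, which would normally be
 * "/proc/sys/net/ipv4/conf/all" or ".../conf/<interface>" (needs root).
 */
int apply_arp_settings(const char *conf_dir)
{
    char path[256];
    int n;

    n = snprintf(path, sizeof path, "%s/arp_announce", conf_dir);
    if (n < 0 || (size_t)n >= sizeof path || write_sysctl(path, "2\n") != 0)
        return -1;

    n = snprintf(path, sizeof path, "%s/arp_ignore", conf_dir);
    if (n < 0 || (size_t)n >= sizeof path || write_sysctl(path, "1\n") != 0)
        return -1;

    return 0;
}
```

According to ip-sysctl.txt the kernel uses the maximum of the conf/all and conf/{interface} values, so writing to conf/all affects every interface.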

-- 
Mathieu Chouquet-Stringer
    "Le disparu, si l'on vénère sa mémoire, est plus présent et
                 plus puissant que le vivant".
           -- Antoine de Saint-Exupéry, Citadelle --
