Incorrect ARP behavior when multiple/none IPv4 address assigned to interface

* Incorrect ARP behavior when multiple/none IPv4 address assigned to interface
From: Sergey Popovich @ 2012-10-23 11:28 UTC
  To: netdev

Hello!

We have the following setup:
------------------------

  PC1              |             |
   ip: 10.0.1.2/24 |             | Linux Router (3.7-rc2)
   gw: 10.0.1.1    |--------eth0-| Lo0: 10.10.10.10/32
                                 | Lo255: 10.0.1.1/24
  PC2              |--------eth1-|        10.0.2.1/24
   ip: 10.0.1.3/24 |             | eth[0-2]: no ip address
   gw: 10.0.1.1    |             | ip route 10.0.1.2/32 dev eth0 src 10.0.1.1
                               +-| ip route 10.0.1.3/32 dev eth1 src 10.0.1.1
                               | | ip route 10.0.2.2/32 dev eth2 src 10.0.2.1
  PC3              |-----eth2--+
   ip: 10.0.2.2/24 |
   gw: 10.0.2.1    |


The problem: ARP requests are sent with an incorrect source address
(10.10.10.10 instead of 10.0.1.1):

# tcpdump -vv -ieth0 -s1500 -nnpe 'arp'
13:28:57.395181 08:00:27:3b:63:ae > 0a:00:27:00:00:00, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.1.2 tell 10.10.10.10, length 28
13:28:58.395257 08:00:27:3b:63:ae > 0a:00:27:00:00:00, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.1.2 tell 10.10.10.10, length 28
13:28:59.395207 08:00:27:3b:63:ae > 0a:00:27:00:00:00, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.1.2 tell 10.10.10.10, length 28
13:29:01.393739 08:00:27:3b:63:ae > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.1.2 tell 10.0.1.1, length 28
13:29:01.393862 0a:00:27:00:00:00 > 08:00:27:3b:63:ae, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Reply 10.0.1.2 is-at 0a:00:27:00:00:00, length 46

Detailed information about how this network topology (and others that
trigger the same ARP behavior) is used in the real world can be found at
https://bugzilla.kernel.org/show_bug.cgi?id=49311

Sorry for early bug report.

-- 
SP5474-RIPE
Sergey Popovich

* Re: Incorrect ARP behavior when multiple/none IPv4 address assigned to interface
From: Julian Anastasov @ 2012-10-24  0:15 UTC
  To: Sergey Popovich; +Cc: netdev


	Hello,

On Tue, 23 Oct 2012, Sergey Popovich wrote:

> Hello!
> 
> We have following setup:
> ------------------------
> 
>  PC1              |             |
>   ip: 10.0.1.2/24 |             | Linux Router (3.7-rc2)
>   gw: 10.0.1.1    |--------eth0-| Lo0: 10.10.10.10/32
>                                 | Lo255: 10.0.1.1/24
>  PC2              |--------eth1-|        10.0.2.1/24
>   ip: 10.0.1.3/24 |             | eth[0-2]: no ip address
>   gw: 10.0.1.1    |             | ip route 10.0.1.2/32 dev eth0 src 10.0.1.1
>                               +-| ip route 10.0.1.3/32 dev eth1 src 10.0.1.1
>                               | | ip route 10.0.2.2/32 dev eth2 src 10.0.2.1
>  PC3              |-----eth2--+
>   ip: 10.0.2.2/24 |
>   gw: 10.0.2.1    |
> 
> 
> Problem with ARP Requests sent with incorrect source address (10.10.10.10
> instead of 10.0.1.1):
> 
> # tcpdump -vv -ieth0 -s1500 -nnpe 'arp'
> 13:28:57.395181 08:00:27:3b:63:ae > 0a:00:27:00:00:00, ethertype ARP (0x0806),
> length 42: Ethernet (len 6),.
> IPv4 (len 4), Request who-has 10.0.1.2 tell 10.10.10.10, length 28

	What kind of packet triggers the ARP request here?
Maybe this IP packet already has saddr=10.10.10.10?
arp_solicit() when eth0/arp_announce=0 (default) just
ensures that this saddr is local. Or it is a forwarding
case and inet_select_addr is used? Also, any reason to put
addresses on loopback and not on eth0?

> 13:28:58.395257 08:00:27:3b:63:ae > 0a:00:27:00:00:00, ethertype ARP (0x0806),
> length 42: Ethernet (len 6),.
> IPv4 (len 4), Request who-has 10.0.1.2 tell 10.10.10.10, length 28
> 13:28:59.395207 08:00:27:3b:63:ae > 0a:00:27:00:00:00, ethertype ARP (0x0806),
> length 42: Ethernet (len 6),.
> IPv4 (len 4), Request who-has 10.0.1.2 tell 10.10.10.10, length 28
> 13:29:01.393739 08:00:27:3b:63:ae > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806),
> length 42: Ethernet (len 6),.
> IPv4 (len 4), Request who-has 10.0.1.2 tell 10.0.1.1, length 28
> 13:29:01.393862 0a:00:27:00:00:00 > 08:00:27:3b:63:ae, ethertype ARP (0x0806),
> length 60: Ethernet (len 6),.
> IPv4 (len 4), Reply 10.0.1.2 is-at 0a:00:27:00:00:00, length 46
> 
> Detailed information about this (and other, that triggers same case with ARP)
> network topology usage in real world
> can be found at https://bugzilla.kernel.org/show_bug.cgi?id=49311

	Your case 2 can also be solved with proper ordering of
the primaries, e.g. first add the /32 primaries, then /31, ... /25, /24.
You can also use decreasing scope for the addresses if global
scope is not needed for them; that can help with the ordering.
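
As a rough illustration of the ordering suggested above (most specific
prefix first), here is a minimal Python sketch. The helper name
order_primaries and the sample addresses are assumptions made for the
example; it only sorts address strings and does not touch any interface.

```python
import ipaddress

def order_primaries(addrs):
    """Return interface addresses sorted with the longest prefix first.

    Hypothetical helper: it only orders strings such as "10.0.1.1/24";
    adding them to a device in this order is left to the administrator.
    """
    ifaces = [ipaddress.ip_interface(a) for a in addrs]
    # list.sort() is stable, so addresses with the same prefix length
    # keep their original relative order.
    ifaces.sort(key=lambda i: i.network.prefixlen, reverse=True)
    return [str(i) for i in ifaces]

if __name__ == "__main__":
    for addr in order_primaries(["10.0.1.1/24", "10.10.10.10/32", "10.0.2.1/25"]):
        print(addr)
```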

	For the proposed patch: providing iph->saddr to
inet_select_addr() in icmp_send() looks better than before.
Still, inet_select_addr() is the wrong function to use
from icmp_send(): there is a risk of exposing scope link
addresses.

	The other part of the patch, in inet_select_addr(), looks
correct to me but comes at some cost for the arp_solicit()
and icmp_send() cases: a slowdown that others may not
like.

	About fib_info_update_nh_saddr: same fib_info can
be used for different subnets, so we can not check the
destination. But routes to directly connected hosts
usually come with prefsrc (proto kernel), so it is not a
problem.

> Sorry for early bug report.
> 
> -- 
> SP5474-RIPE
> Sergey Popovich

Regards

--
Julian Anastasov <ja@ssi.bg>

* Re: Incorrect ARP behavior when multiple/none IPv4 address assigned to interface
From: Sergey Popovich @ 2012-10-24  7:18 UTC
  To: netdev

Julian Anastasov wrote:

>> # tcpdump -vv -ieth0 -s1500 -nnpe 'arp'
>> 13:28:57.395181 08:00:27:3b:63:ae>  0a:00:27:00:00:00, ethertype ARP (0x0806),
>> length 42: Ethernet (len 6),.
>> IPv4 (len 4), Request who-has 10.0.1.2 tell 10.10.10.10, length 28
>
> 	What kind of packet triggers ARP request here?
> May be this IP packet already has saddr=10.10.10.10 ?
> arp_solicit() when eth0/arp_announce=0 (default) just
> ensures that this saddr is local. Or it is a forwarding
> case and inet_select_addr is used? Also, any reason to put
> addresses on loopback and not on eth0?

1. Sorry, the bug report does not make clear how this is reproduced:
   1.1. on PC1 run ping 10.0.1.1
   1.2. on Linux Router start arp-probe-bug.bsh as root user.
So this kind of packet is generated after the link-layer addresses are
resolved and while ICMP Echo Request/Reply is in progress (with no
packet loss).

2. No, I do not think that saddr=10.10.10.10 in the probe.
arp_solicit() calls inet_select_addr() only if saddr=0.0.0.0.
The call to arp_solicit() is made from neigh_probe() in
net/core/neighbour.c, which is static and is called from
neigh_timer_handler() when the entry ages out and goes to the PROBE phase.

3. The reason for putting the address on a loopback (not the actual
system loopback, but a Linux dummy interface or any other network
interface) is described in more detail in my bug report, but in short:

Suppose we have the 10.0.1.0/24 subnet and many customers, each with a
single IP address (common for Internet providers), on different
broadcast domains (VLANs).

With the traditional scheme each customer connection consumes 4(!) IPv4
addresses:

10.0.1.0/30 - Customer 1 (ip: 10.0.1.2,  gw: 10.0.1.1, mask: /30)
10.0.1.4/30 - Customer 2 (ip: 10.0.1.6,  gw: 10.0.1.5, mask: /30)
10.0.1.8/30 - Customer 3 (ip: 10.0.1.10, gw: 10.0.1.9, mask: /30)
...
10.0.1.252/30 - Customer 64 (ip: 10.0.1.254, gw: 10.0.1.253, mask: /30)

As can be seen, on each connection we waste at least 2 IP addresses:
   one for the subnet address (all zeros in the host part)
   one for the subnet broadcast (all ones in the host part)

It is more efficient to use the entire /24 subnet and assign each
customer one IP address from it with a /24 mask.

10.0.1.1/24 - Loopback (dummy) on Linux Router. This is the gateway
               address for the customers.
10.0.1.2/24 - Customer 1
10.0.1.3/24 - Customer 2
10.0.1.4/24 - Customer 3
...
10.0.1.254/24 - Customer 253

Some network equipment vendors call this scheme "ip unnumbered". It has
worked in Linux for years (and is therefore used, with proper NICs, by
many small, medium and even large ISPs to aggregate broadband customers).
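
The address arithmetic above can be checked with a short Python sketch.
The function names are made up for this example, and it assumes exactly
one address of the shared subnet is reserved for the router.

```python
import ipaddress

def customers_with_slash30(block):
    """Customers served when every customer gets a dedicated /30."""
    net = ipaddress.ip_network(block)
    return len(list(net.subnets(new_prefix=30)))

def customers_unnumbered(block):
    """Customers served when the whole block is shared and one host
    address is kept for the router (the gateway)."""
    net = ipaddress.ip_network(block)
    usable = net.num_addresses - 2   # minus subnet and broadcast address
    return usable - 1                # minus the gateway address

if __name__ == "__main__":
    print("per-customer /30:", customers_with_slash30("10.0.1.0/24"))
    print("shared /24:      ", customers_unnumbered("10.0.1.0/24"))
```

For 10.0.1.0/24 this gives 64 customers with /30 subnets and 253 with
the shared /24, matching the two lists above.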

> 	Your case 2 can be also solved with proper ordering of
> the primaries, eg. first add /32 primaries, then /31, ... /25, /24.
> You can also use decreasing scope for the addresses if global
> scope is not needed for them, it can help for the ordering.
>
Yes, you are right, but this is not obvious at configuration
time :-).

> 	For the proposed patch: providing iph->saddr to
> inet_select_addr() in icmp_send() looks better than before.
> Still, inet_select_addr() is incorrect function to use
> from icmp_send(), there is the risk to expose scope link
> addresses.
>
Correct, but in my view relying on the icmp_errors_use_inbound_ifaddr
sysctl is not entirely right:
   ICMP might be generated for an address that is not directly reachable
   (more than one hop away from the interface on which the packet
    arrived, i.e. reachable via a gateway on that interface), so it is
   better to rely on the prefsrc of the route associated with this
   interface when sending ICMP; ip_route_output_slow() already does
   this and works correctly.

Using iph->saddr is only a countermeasure, used when this sysctl is
activated (default: no).

> 	The other part from patch in inet_select_addr() looks
> correct to me but comes with some price for the arp_solicit()
> and icmp_send() cases, a slowdown that may not be liked by
> others.
Yes, we are looking at all interfaces in the system and all primary
addresses on them.
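
To make the fallback under discussion concrete, here is a rough Python
model. It assumes inet_select_addr() first tries the primary addresses
on the given device and, if none is usable, walks all devices and takes
the first non-link-scope address. The function select_addr, the ROUTER
table and the device order (Lo0 before Lo255) are assumptions made for
illustration; this is not the kernel code.

```python
import ipaddress

# Numeric rtnetlink scope values (a smaller number means a wider scope).
RT_SCOPE_UNIVERSE, RT_SCOPE_LINK, RT_SCOPE_HOST = 0, 253, 254

def select_addr(devices, dev, dst, scope):
    """Simplified model of source address selection, NOT the kernel code.

    devices: ordered mapping  name -> list of ("addr/prefix", scope)
    """
    dst_ip = ipaddress.ip_address(dst)
    fallback = None
    # Step 1: primaries on the outgoing device, preferring dst's subnet.
    for cidr, ifa_scope in devices.get(dev, []):
        ifa = ipaddress.ip_interface(cidr)
        if ifa_scope > scope:
            continue
        if dst_ip in ifa.network:
            return str(ifa.ip)
        if fallback is None:
            fallback = str(ifa.ip)
    if fallback:
        return fallback
    # Step 2: nothing usable on dev, so take the first suitable address
    # found on any device, in device order.
    for addrs in devices.values():
        for cidr, ifa_scope in addrs:
            if ifa_scope != RT_SCOPE_LINK and ifa_scope <= scope:
                return str(ipaddress.ip_interface(cidr).ip)
    return None

# Assumed model of the lab router: eth0 carries no address.
ROUTER = {
    "lo":    [("127.0.0.1/8", RT_SCOPE_HOST)],
    "Lo0":   [("10.10.10.10/32", RT_SCOPE_UNIVERSE)],
    "Lo255": [("10.0.1.1/24", RT_SCOPE_UNIVERSE),
              ("10.0.2.1/24", RT_SCOPE_UNIVERSE)],
    "eth0":  [],
}

if __name__ == "__main__":
    print(select_addr(ROUTER, "eth0", "10.0.1.2", RT_SCOPE_LINK))
```

In this model the probe for 10.0.1.2 on eth0 gets 10.10.10.10, the
first global address found, which is the address seen in the capture.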

>
> 	About fib_info_update_nh_saddr: same fib_info can
> be used for different subnets, so we can not check the
> destination. But routes to directly connected hosts
> usually come with prefsrc (proto kernel), so it is not a
> problem.
Yes, I pointed to this to get it clarified. Thank you.

-- 
SP5474-RIPE
Sergey Popovich

* Re: Incorrect ARP behavior when multiple/none IPv4 address assigned to interface
From: Julian Anastasov @ 2012-10-24  9:06 UTC
  To: Sergey Popovich; +Cc: netdev

	Hello,

On Wed, 24 Oct 2012, Sergey Popovich wrote:

> Julian Anastasov wrote:
> 
> > > # tcpdump -vv -ieth0 -s1500 -nnpe 'arp'
> > > 13:28:57.395181 08:00:27:3b:63:ae>  0a:00:27:00:00:00, ethertype ARP
> > > (0x0806),
> > > length 42: Ethernet (len 6),.
> > > IPv4 (len 4), Request who-has 10.0.1.2 tell 10.10.10.10, length 28
> > 
> > 	What kind of packet triggers ARP request here?
> > May be this IP packet already has saddr=10.10.10.10 ?
> > arp_solicit() when eth0/arp_announce=0 (default) just
> > ensures that this saddr is local. Or it is a forwarding
> > case and inet_select_addr is used? Also, any reason to put
> > addresses on loopback and not on eth0?
> 
> 1. Sorry, from bug report is not clear to understand how this is reproduced:
>   1.1. on PC1 run ping 10.0.1.1
>   1.2. on Linux Router start arp-probe-bug.bsh as root user.
> So this kind of packet generated after link layer addresses resolved &
> ICMP Echo Request/Reply in progress (with no packet loss).
> 
> 2. No I do not think that saddr=10.10.10.10 in probe.
> arp_solicit calls inet_select_addr() only if saddr=0.0.0.0.
> Call to arp_solicit() made from net/core/neighbour.c neigh_probe()
> which is static and called from neigh_timer_handler() when entry ages
> out and goes to PROBE phase.

	Indeed, it often happens that skb=NULL when the
arp_queue is empty...

> 3. Reason to put addr on loopback (not actually system loopback, but
> linux dummy interface or any other network interface) described in more
> details in my bug report, but for short:
> 
> Suppose we have 10.0.1.0/24 subnet.
> and many customers with single ip address (common for Internet
> Providers) on different broadcast domains (VLANs).
> 
> Each customer connected using 4(!) IPv4 address using traditional schema:
> 
> 10.0.1.0/30 - Customer 1 (ip: 10.0.1.2,  gw: 10.0.1.1, mask: /30)
> 10.0.1.4/30 - Customer 2 (ip: 10.0.1.6,  gw: 10.0.1.5, mask: /30)
> 10.0.1.8/30 - Customer 3 (ip: 10.0.1.10, gw: 10.0.1.9, mask: /30)
> ...
> 10.0.1.252/30 - Customer 64 (ip: 10.0.1.254, gw: 10.0.1.253, mask: /30)
> 
> As can be seen on each connection we waste at least 2 IP address:
>   one for subnet address (all zeros in host part)
>   one for subnet broadcast (all ones in host part)
> 
> More efficiently to use entire subnet with /24 mask and assign to each
> customer one ip from subnet with mask /24.
> 
> 10.0.1.1/24 - Loopback (dummy) on Linux Router. This is gateway address
>               to customers.
> 10.0.1.2/24 - Customer 1
> 10.0.1.3/24 - Customer 2
> 10.0.1.4/24 - Customer 3
> ...
> 10.0.1.254/24 - Customer 253
> 
> This schema called by some network equipment vendors as "ip unnumbered"
> and works in Linux for years (and thus used with proper NICs by many
> small/medium (and even large) ISPs to aggregate broadband customers).

	But I still don't understand what prevents this
"ip unnumbered" address assignment scheme from working on
eth0/1/2 instead of using many dummy interfaces; they are
only a place to put addresses, it seems.

	The dummy module is usually used as a blackhole for
traffic or to hide addresses from other interfaces with
some sysctl interface flags.

	For example, can it work this way?

eth0: addr 10.0.1.1/24
ip route 10.0.1.2/32 dev eth0 src 10.0.1.1

eth1: addr 10.0.1.1/24
ip route 10.0.1.3/32 dev eth1 src 10.0.1.1

eth2: addr 10.0.2.1/24
ip route 10.0.2.2/32 dev eth2 src 10.0.2.1

	This way we have a subnet on every device and
we can prefer a local IP from that subnet in inet_select_addr.
Maybe arp_ignore=1/2 and arp_announce=1/2 can help here
to put the needed restrictions in place, i.e. we should not
expose addresses from other devices. It should not cause
problems for proxy_arp because we have more specific /32 routes.
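
With many customer-facing devices, a layout like the one above could be
generated rather than typed by hand. A minimal Python sketch, assuming
the usual iproute2 "ip addr add" and "ip route add" syntax; the helper
unnumbered_commands is hypothetical and only builds command strings,
it does not execute anything.

```python
def unnumbered_commands(dev, gateway, prefixlen, customers):
    """Build iproute2 command lines for one customer-facing device.

    Hypothetical helper: it only returns strings, nothing is executed.
    """
    # The gateway address goes on the device itself, with the subnet mask.
    cmds = [f"ip addr add {gateway}/{prefixlen} dev {dev}"]
    # One host route per customer reachable through this device.
    for host in customers:
        cmds.append(f"ip route add {host}/32 dev {dev} src {gateway}")
    return cmds

if __name__ == "__main__":
    layout = [
        ("eth0", "10.0.1.1", 24, ["10.0.1.2"]),
        ("eth1", "10.0.1.1", 24, ["10.0.1.3"]),
        ("eth2", "10.0.2.1", 24, ["10.0.2.2"]),
    ]
    for dev, gw, plen, hosts in layout:
        print("\n".join(unnumbered_commands(dev, gw, plen, hosts)))
```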

Regards

--
Julian Anastasov <ja@ssi.bg>

* Re: Incorrect ARP behavior when multiple/none IPv4 address assigned to interface
From: Sergey Popovich @ 2012-10-24 10:37 UTC
  To: netdev

Julian Anastasov wrote:

> 	dummy module is usually used as blackhole for
> traffic or to hide addresses from other interfaces with
> some sysctl interface flags.
>
> 	For example, can it work in this way?:
>
> eth0: addr 10.0.1.1/24
> ip route 10.0.1.2/32 dev eth0 src 10.0.1.1
>
> eth1: addr 10.0.1.1/24
> ip route 10.0.1.3/32 dev eth1 src 10.0.1.1
>
> eth2: addr 10.0.2.1/24
> ip route 10.0.2.2/32 dev eth2 src 10.0.2.1
>
> 	By this way we have subnet on every device and
> we can prefer local IP from such subnet in inet_select_addr.
> May be arp_ignore=1/2 and arp_announce=1/2 can help here
> to put the needed restrictions, i.e. we should not expose
> addresses from other devices. It should not cause problem
> for proxy_arp because we have more specific /32 routes.
>

Yes, I just applied the proposed configuration to the lab setup.

Everything works as expected with no extra arp_ignore/arp_announce
configuration, even if I add a second primary address 192.168.1.1/24 to
eth2 and introduce PC4 in the same broadcast domain as PC3 (eth2).

Well, the configuration with 3000 subinterfaces looks worse, but it
works with no extra patches/configuration.

Thank you for your help.

-- 
SP5474-RIPE
Sergey Popovich
