From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matt Ayres
Subject: Re: ARP cache problems / slow connect times in routed mode - Bug #596 opened
Date: Sat, 01 Apr 2006 12:12:13 -0500
Message-ID: <442EB46D.9040407@tektonic.net>
References: <442D95E6.3030602@tektonic.net> <442EA4D1.1010600@tektonic.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Keir Fraser
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Keir Fraser wrote:
>
> On 1 Apr 2006, at 17:05, Matt Ayres wrote:
>
>>> A user of mine has debugged this issue for me. It seems a Xen guest
>>> in routed mode wants to arp cache any host it connects to with the
>>> MAC address FE:FF:FF:FF:FF:FF. The user also identified long
>>> connection times due to this. While a remote host is in the arp cache
>>> connection times are fast (30ms or so), when it is not it can be well
>>> over 1000ms. They have provided me the tcpdump output that proves
>>> this. They also proved it is due to the ARP cache by statically
>>> adding a remote host to the ARP cache and noting that connection
>>> times are very low.
>>> Full debugging information is attached to the bug.
>>> Bug URL: http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=596
>>
>> I have assigned this bug to myself and marked it as INVALID. It
>> appears to be specific to CentOS / Fedora and my specific setup.
>
> We'll be interested to learn the full details if you manage to work out
> what's going on. :-)
>

I know exactly what went wrong. I chose to use 169.254.1.1 as the IP to
assign to my vif interfaces. Inside the guest, a static route for
169.254.1.1/24 is added via eth0, followed by a default route via the
gateway 169.254.1.1.
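
To make the guest routes described above concrete, here is a minimal
sketch of how longest-prefix route selection would be expected to treat
them. It is only a model, not the kernel's actual lookup, and it does not
by itself explain the ARP behaviour. The helper names pick_route and
arp_target are hypothetical, and 192.0.2.10 is a documentation address
standing in for any remote host.

```python
import ipaddress

# Guest routing table as described in this mail (a model, not real kernel state).
# "via" is None for an on-link route, otherwise the gateway address.
GUEST_ROUTES = [
    # 169.254.1.1/24 has host bits set; strict=False normalises it to 169.254.1.0/24.
    {"net": ipaddress.ip_network("169.254.1.1/24", strict=False), "via": None, "dev": "eth0"},
    {"net": ipaddress.ip_network("0.0.0.0/0"), "via": "169.254.1.1", "dev": "eth0"},
]


def pick_route(routes, destination):
    """Return the most specific (longest-prefix) route containing destination, or None."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in r["net"]]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["net"].prefixlen)


def arp_target(route, destination):
    """Address that would need ARP resolution: the gateway if set, else the destination itself."""
    return route["via"] or destination


if __name__ == "__main__":
    for dst in ("169.254.1.1", "192.0.2.10"):
        route = pick_route(GUEST_ROUTES, dst)
        print(dst, "matches", route["net"], "and needs ARP for", arp_target(route, dst))
```

Under this model a remote host matches only the default route, so the
only neighbour entry it should need is the gateway, 169.254.1.1.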

I chose this address because various proxy ARP howtos use it and it is in
the reserved "link local" space, which made sense. CentOS (RHEL) / Fedora
add a static route for 169.254.0.0/16 for DHCP purposes. I see no reason
for it: no other distribution requires it, and removing it does not break
DHCP.

Anyhow, it appears that having the more specific /24 route was causing
all remote IPs to be cached in the ARP table as local. Removing my /24
static route fixes everything and leaves only 169.254.1.1 in the ARP
cache.

Perhaps the community can enlighten me: who is in the wrong here, Red Hat
or me? We support many other distributions (Gentoo, Debian, Ubuntu,
Mandriva/Mandrake, Slackware) and none of them add the link local network
as a static route.

The other oddity: why does having the /24 statically routed along with
the /16 cause any IP on the internet to be added to the ARP cache? That
is the part that confuses me most. I fixed it, but I'm far from
completely understanding it.

Thank you,
Matt Ayres