* [PATCH net] bridge: Add port flap detection
@ 2014-05-04 21:29 Jon Maxwell
  2014-05-04 22:53 ` Stephen Hemminger
  2014-05-05 12:48 ` Sergei Shtylyov
  0 siblings, 2 replies; 5+ messages in thread
From: Jon Maxwell @ 2014-05-04 21:29 UTC (permalink / raw)
  To: netdev
  Cc: stephen, davem, makita.toshiaki, vyasevic, bridge, linux-kernel,
	pirko, jmaxwell

There have been a number of incidents recently where customers running KVM have reported that VM hosts on different hypervisors are unreachable. Based on pcap traces, we found that the bridge was broadcasting the ARP request out onto the network. However, some NICs have an inbuilt switch which on occasion was broadcasting the VM's ARP request back through the physical NIC on the hypervisor. This resulted in the bridge flapping ports and incorrectly learning that the VM's MAC address was external. As a result, the ARP reply was directed back onto the external network and the VM never updated its ARP cache. This patch will detect port flapping and log a message so that this condition can be detected earlier.

Signed-off-by: Jon Maxwell <jmaxwell@redhat.com>
---
 net/bridge/br_fdb.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 9203d5a..c08607b 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -507,6 +507,13 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 					source->dev->name);
 		} else {
 			/* fastpath: update of existing entry */
+			if (source->port_no != fdb->dst->port_no &&
+				net_ratelimit())
+				br_warn(br, "Port flapping detected source entry dev = %s mac = %pM, port_no = %d\n existing entry dev = %s mac = %pM, port_no = %d\n",
+					source->dev->name,
+					addr, source->port_no,
+					fdb->dst->dev->name, addr,
+					fdb->dst->port_no);
 			fdb->dst = source;
 			fdb->updated = jiffies;
 			if (unlikely(added_by_user))
-- 
1.8.3.1
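
The learning behaviour described in the changelog can be illustrated with a small userspace sketch. This is not the bridge code: the `toy_fdb` types, the table size, and the function names are all hypothetical, and the real forwarding database is a hash table with locking and ageing. The sketch only shows how a source MAC seen first on a VM-facing port and then on the physical port causes the entry to move.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_FDB_SIZE 16	/* hypothetical table size for this sketch */

struct toy_fdb_entry {
	uint8_t mac[6];
	int port_no;
	int in_use;
};

struct toy_fdb {
	struct toy_fdb_entry entries[TOY_FDB_SIZE];
	unsigned int moves;	/* how many times an entry changed port */
};

/*
 * Learn that 'mac' was seen as a source address on 'port_no'.
 * Returns 1 if an existing entry moved to a different port,
 * 0 if the entry was new or unchanged, -1 if the table is full.
 */
int toy_fdb_learn(struct toy_fdb *fdb, const uint8_t mac[6], int port_no)
{
	struct toy_fdb_entry *free_slot = NULL;
	int i;

	assert(fdb != NULL && mac != NULL);

	for (i = 0; i < TOY_FDB_SIZE; i++) {
		struct toy_fdb_entry *e = &fdb->entries[i];

		if (!e->in_use) {
			if (!free_slot)
				free_slot = e;
			continue;
		}
		if (memcmp(e->mac, mac, 6) == 0) {
			if (e->port_no == port_no)
				return 0;	/* same port, nothing moved */
			e->port_no = port_no;	/* the entry follows the frame */
			fdb->moves++;
			return 1;
		}
	}
	if (!free_slot)
		return -1;
	memcpy(free_slot->mac, mac, 6);
	free_slot->port_no = port_no;
	free_slot->in_use = 1;
	return 0;
}

/* Return the learned port for 'mac', or -1 if unknown (frame is flooded). */
int toy_fdb_lookup(const struct toy_fdb *fdb, const uint8_t mac[6])
{
	int i;

	assert(fdb != NULL && mac != NULL);

	for (i = 0; i < TOY_FDB_SIZE; i++) {
		const struct toy_fdb_entry *e = &fdb->entries[i];

		if (e->in_use && memcmp(e->mac, mac, 6) == 0)
			return e->port_no;
	}
	return -1;
}
```

With a zeroed `struct toy_fdb`, learning the VM's MAC on the VM-facing port and then on the physical port (the reflected ARP request) leaves the lookup pointing at the physical port, which is why the ARP reply in the report never reached the VM.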


* Re: [PATCH net] bridge: Add port flap detection
  2014-05-04 21:29 [PATCH net] bridge: Add port flap detection Jon Maxwell
@ 2014-05-04 22:53 ` Stephen Hemminger
  2014-05-05  4:06   ` Alexei Starovoitov
  2014-05-05 12:48 ` Sergei Shtylyov
  1 sibling, 1 reply; 5+ messages in thread
From: Stephen Hemminger @ 2014-05-04 22:53 UTC (permalink / raw)
  To: Jon Maxwell
  Cc: netdev, davem, makita.toshiaki, vyasevic, bridge, linux-kernel,
	pirko, jmaxwell

On Mon,  5 May 2014 07:29:34 +1000
Jon Maxwell <jmaxwell37@gmail.com> wrote:

> There have been a number of incidents recently where customers running KVM have reported that VM hosts on different hypervisors are unreachable. Based on pcap traces, we found that the bridge was broadcasting the ARP request out onto the network. However, some NICs have an inbuilt switch which on occasion was broadcasting the VM's ARP request back through the physical NIC on the hypervisor. This resulted in the bridge flapping ports and incorrectly learning that the VM's MAC address was external. As a result, the ARP reply was directed back onto the external network and the VM never updated its ARP cache. This patch will detect port flapping and log a message so that this condition can be detected earlier.
> 
> Signed-off-by: Jon Maxwell <jmaxwell@redhat.com>
> ---
>  net/bridge/br_fdb.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index 9203d5a..c08607b 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -507,6 +507,13 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
>  					source->dev->name);
>  		} else {
>  			/* fastpath: update of existing entry */
> +			if (source->port_no != fdb->dst->port_no &&
> +				net_ratelimit())
> +				br_warn(br, "Port flapping detected source entry dev = %s mac = %pM, port_no = %d\n existing entry dev = %s mac = %pM, port_no = %d\n",
> +					source->dev->name,
> +					addr, source->port_no,
> +					fdb->dst->dev->name, addr,
> +					fdb->dst->port_no);
>  			fdb->dst = source;
>  			fdb->updated = jiffies;
>  			if (unlikely(added_by_user))

Ok, but please shorten the message to a single line without excess wordage.
Plus, flapping to me means the link going up and down. Maybe use the same
message as BSD?


* Re: [PATCH net] bridge: Add port flap detection
  2014-05-04 22:53 ` Stephen Hemminger
@ 2014-05-05  4:06   ` Alexei Starovoitov
  2014-05-05  6:22     ` Jon Maxwell
  0 siblings, 1 reply; 5+ messages in thread
From: Alexei Starovoitov @ 2014-05-05  4:06 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Jon Maxwell, netdev@vger.kernel.org, David S. Miller,
	makita.toshiaki, vyasevic, bridge, linux-kernel@vger.kernel.org,
	pirko, jmaxwell

On Sun, May 4, 2014 at 3:53 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Mon,  5 May 2014 07:29:34 +1000
> Jon Maxwell <jmaxwell37@gmail.com> wrote:
>
>> There have been a number of incidents recently where customers running KVM have reported that VM hosts on different hypervisors are unreachable. Based on pcap traces, we found that the bridge was broadcasting the ARP request out onto the network. However, some NICs have an inbuilt switch which on occasion was broadcasting the VM's ARP request back through the physical NIC on the hypervisor. This resulted in the bridge flapping ports and incorrectly learning that the VM's MAC address was external. As a result, the ARP reply was directed back onto the external network and the VM never updated its ARP cache. This patch will detect port flapping and log a message so that this condition can be detected earlier.
>>
>> Signed-off-by: Jon Maxwell <jmaxwell@redhat.com>
>> ---
>>  net/bridge/br_fdb.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
>> index 9203d5a..c08607b 100644
>> --- a/net/bridge/br_fdb.c
>> +++ b/net/bridge/br_fdb.c
>> @@ -507,6 +507,13 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
>>                                       source->dev->name);
>>               } else {
>>                       /* fastpath: update of existing entry */
>> +                     if (source->port_no != fdb->dst->port_no &&
>> +                             net_ratelimit())
>> +                             br_warn(br, "Port flapping detected source entry dev = %s mac = %pM, port_no = %d\n existing entry dev = %s mac = %pM, port_no = %d\n",
>> +                                     source->dev->name,
>> +                                     addr, source->port_no,
>> +                                     fdb->dst->dev->name, addr,
>> +                                     fdb->dst->port_no);
>>                       fdb->dst = source;
>>                       fdb->updated = jiffies;
>>                       if (unlikely(added_by_user))
>
> Ok, but please shorten the message to a single line without excess wordage.
> Plus, flapping to me means the link going up and down. Maybe use the same
> message as BSD?

Isn't this a normal MAC move? Any message will be confusing.
VMs can spoof their source MACs and trigger this warning.
I don't think it's worth adding it just to debug the learning on the
external interface.


* Re: [PATCH net] bridge: Add port flap detection
  2014-05-05  4:06   ` Alexei Starovoitov
@ 2014-05-05  6:22     ` Jon Maxwell
  0 siblings, 0 replies; 5+ messages in thread
From: Jon Maxwell @ 2014-05-05  6:22 UTC (permalink / raw)
  To: Alexei Starovoitov, Stephen Hemminger
  Cc: Jon Maxwell, netdev, David S. Miller, makita toshiaki, vyasevic,
	bridge, linux-kernel, Jiri Pirko

I acknowledge that if a VM is moved or reconfigured with the MAC address of another machine, it may generate one message. However, in the cases we encountered, the MAC address was constantly toggling between the internal and external network as the ARP request was erroneously reflected back. We actually provided a kernel with this patch to troubleshoot the issue. I propose that we change the patch to print the following message instead. That way, a single message will be informational, but multiple messages will indicate that something is wrong.

 br_warn(br, "FDB port changed: source dev = %s mac = %pM, port_no = %d, existing dev = %s mac = %pM, port_no = %d\n",
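
The distinction drawn above, between one move being informational and repeated moves indicating a problem, could be sketched as a counter over a time window. This is only an illustration and not part of the proposed patch: `struct move_tracker`, `move_tracker_record()`, and both constants are hypothetical names and arbitrary values, and the timestamp is a plain tick count rather than jiffies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOVE_WINDOW_TICKS 100	/* hypothetical window length, in ticks */
#define MOVE_THRESHOLD 3	/* hypothetical: moves per window that look wrong */

struct move_tracker {
	uint64_t window_start;		/* tick at which the current window began */
	unsigned int moves_in_window;	/* port changes seen in this window */
};

/*
 * Record one port change for a MAC address at time 'now' (monotonic ticks).
 * Returns 1 once the number of changes in the current window reaches
 * MOVE_THRESHOLD, which suggests toggling rather than a single move.
 * Returns 0 otherwise.
 */
int move_tracker_record(struct move_tracker *t, uint64_t now)
{
	assert(t != NULL);

	/* Start a new window on first use or when the old one has expired. */
	if (t->moves_in_window == 0 ||
	    now - t->window_start >= MOVE_WINDOW_TICKS) {
		t->window_start = now;
		t->moves_in_window = 0;
	}

	t->moves_in_window++;
	return t->moves_in_window >= MOVE_THRESHOLD;
}
```

A tracker that sees moves spaced further apart than the window never reports toggling, which matches the case of a VM that was migrated or reconfigured once.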


* Re: [PATCH net] bridge: Add port flap detection
  2014-05-04 21:29 [PATCH net] bridge: Add port flap detection Jon Maxwell
  2014-05-04 22:53 ` Stephen Hemminger
@ 2014-05-05 12:48 ` Sergei Shtylyov
  1 sibling, 0 replies; 5+ messages in thread
From: Sergei Shtylyov @ 2014-05-05 12:48 UTC (permalink / raw)
  To: Jon Maxwell, netdev
  Cc: stephen, davem, makita.toshiaki, vyasevic, bridge, linux-kernel,
	pirko, jmaxwell

Hello.

On 05/05/2014 01:29 AM, Jon Maxwell wrote:

> There have been a number of incidents recently where customers running KVM have reported that VM hosts on different hypervisors are unreachable. Based on pcap traces, we found that the bridge was broadcasting the ARP request out onto the network. However, some NICs have an inbuilt switch which on occasion was broadcasting the VM's ARP request back through the physical NIC on the hypervisor. This resulted in the bridge flapping ports and incorrectly learning that the VM's MAC address was external. As a result, the ARP reply was directed back onto the external network and the VM never updated its ARP cache. This patch will detect port flapping and log a message so that this condition can be detected earlier.

    Please wrap your changelog at 80 columns or less.

> Signed-off-by: Jon Maxwell <jmaxwell@redhat.com>

    You probably want the same Red Hat email in the patch authorship record, so
you need to insert the line below before the changelog:

From: Jon Maxwell <jmaxwell@redhat.com>

> ---
>   net/bridge/br_fdb.c | 7 +++++++
>   1 file changed, 7 insertions(+)
>
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index 9203d5a..c08607b 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -507,6 +507,13 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
>   					source->dev->name);
>   		} else {
>   			/* fastpath: update of existing entry */
> +			if (source->port_no != fdb->dst->port_no &&
> +				net_ratelimit())
> +				br_warn(br, "Port flapping detected source entry dev = %s mac = %pM, port_no = %d\n existing entry dev = %s mac = %pM, port_no = %d\n",
> +					source->dev->name,
> +					addr, source->port_no,
> +					fdb->dst->dev->name, addr,

    Why print the same MAC address twice?

> +					fdb->dst->port_no);
>   			fdb->dst = source;
>   			fdb->updated = jiffies;
>   			if (unlikely(added_by_user))

WBR, Sergei
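
One way the review comments in this thread (a single line, the MAC address printed once) might be combined is sketched below in userspace C. This is not the wording anyone in the thread agreed on: `format_fdb_move()` and the message layout are hypothetical, and since the kernel's `%pM` specifier is not available in userspace, the MAC is formatted by hand with `%02x`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a one-line "entry moved" message into 'buf'.
 * The MAC address appears once, followed by the old and new device/port.
 * Returns the value of snprintf(): the length the full message needs,
 * which is >= 'len' if the output was truncated.
 */
int format_fdb_move(char *buf, size_t len, const uint8_t mac[6],
		    const char *old_dev, int old_port,
		    const char *new_dev, int new_port)
{
	assert(buf != NULL && mac != NULL);
	assert(old_dev != NULL && new_dev != NULL);

	return snprintf(buf, len,
			"FDB port changed: mac = %02x:%02x:%02x:%02x:%02x:%02x, "
			"%s (port %d) -> %s (port %d)",
			mac[0], mac[1], mac[2], mac[3], mac[4], mac[5],
			old_dev, old_port, new_dev, new_port);
}
```

The device names passed in (for example a VM tap device and the physical NIC) are up to the caller; the function only arranges them on one line.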
