From: Ido Schimmel <idosch@nvidia.com>
To: David Bauer <mail@david-bauer.net>
Cc: Amit Cohen <amcohen@nvidia.com>, netdev@vger.kernel.org
Subject: Re: VxLAN learning creating broadcast FDB entry
Date: Wed, 21 Feb 2024 16:41:28 +0200
Message-ID: <ZdYLmDx7Od6-ouBq@shredder>
In-Reply-To: <15ee0cc7-9252-466b-8ce7-5225d605dde8@david-bauer.net>

Hi,

On Tue, Feb 20, 2024 at 07:33:54PM +0100, David Bauer wrote:
> Hi,
> 
> we are using a VxLAN overlay network to encapsulate batman-adv Layer 2
> routing. This distance-vector protocol relies on originator messages
> broadcast to all adjacent nodes at a fixed interval.
> 
> Over the course of the last couple weeks, I've discovered the nodes of this
> network to lose connection to all adjacent nodes except for one, which
> retained connectivity to all the others.
> 
> So there is a Node A which is connected to nodes [B,C,D], but [B,C,D] have
> no connection to each other, despite being in the same Layer 2 network that
> carries the Layer 2 domain encapsulated in VxLAN.
> 
> After some digging, I've found out the VxLAN forwarding database on nodes
> [B,C,D] contains an entry for the broadcast address of Node A while Node A
> does not contain this entry:
> 
> $ bridge fdb show dev vx_mesh_other | grep dst
> 00:00:00:00:00:00 dst ff02::15c via eth0 self permanent
> 72:de:3c:0b:30:5c dst fe80::70de:3cff:fe0b:305c via eth0 self
> 66:e8:61:a3:e9:ec dst fe80::64e8:61ff:fea3:e9ec via eth0 self
> ff:ff:ff:ff:ff:ff dst fe80::dc12:d5ff:fe33:e194 via eth0 self
> fa:64:ce:3e:7b:24 dst fe80::f864:ceff:fe3e:7b24 via eth0 self
> [...]
> 
> I've looked into the VxLAN code and discovered the snooping code creates FDB
> entries regardless of whether the source address read is a multicast address.
> 
> When reading the specification in RFC7348, chapter 4 suggests
> 
> > Here, the association of VM's MAC to VTEP's IP address
> > is discovered via source-address learning.  Multicast
> > is used for carrying unknown destination, broadcast,
> > and multicast frames.
> 
> I understand this to mean that multicast addresses should not be learned.
> However, when a VxLAN frame whose encapsulated source address is the
> broadcast address is sent to a Linux machine, the kernel creates an FDB
> entry mapping the broadcast address to the IPv6 source address of the
> outer VxLAN packet.
> 
> This subsequently breaks broadcast operation within the VxLAN with all
> broadcast traffic being directed to a single node. So a node within the
> overlay network can break said network this way.
> 
> Is this behavior of the Linux kernel intended and in accordance with the
> specification or shall we avoid learning group Ethernet addresses in the
> FDB?
> 
> I've applied a patch which avoids learning such addresses in vxlan_snoop and
> it mitigates this behavior for me. Shall I send such a patch upstream?
> [0][1]
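
To make the failure mode quoted above concrete, here is a minimal userspace C
sketch of naive source-address learning. It is not the kernel's code: struct
toy_fdb, toy_fdb_learn(), toy_fdb_lookup() and TOY_FLOOD are hypothetical
names, and a remote VTEP is reduced to an integer index. The assumption is
that a destination without an FDB entry is flooded (the role the all-zeros
entry plays in the output above), while a destination with an entry is sent
to that single remote.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN    6
#define TOY_FDB_MAX 8
#define TOY_FLOOD   (-1)  /* no entry found: fall back to flooding */

/* Hypothetical toy FDB entry: inner MAC -> index of a remote VTEP. */
struct toy_fdb_entry {
	uint8_t mac[ETH_ALEN];
	int remote;
};

struct toy_fdb {
	struct toy_fdb_entry entries[TOY_FDB_MAX];
	size_t count;
};

/* Naive learning: record (src_mac -> remote) without checking src_mac. */
static void toy_fdb_learn(struct toy_fdb *fdb, const uint8_t *src_mac,
			  int remote)
{
	size_t i;

	for (i = 0; i < fdb->count; i++) {
		if (memcmp(fdb->entries[i].mac, src_mac, ETH_ALEN) == 0) {
			fdb->entries[i].remote = remote; /* entry moved */
			return;
		}
	}

	assert(fdb->count < TOY_FDB_MAX); /* the toy table never grows */
	memcpy(fdb->entries[fdb->count].mac, src_mac, ETH_ALEN);
	fdb->entries[fdb->count].remote = remote;
	fdb->count++;
}

/* Return the remote for dst_mac, or TOY_FLOOD if nothing was learned. */
static int toy_fdb_lookup(const struct toy_fdb *fdb, const uint8_t *dst_mac)
{
	size_t i;

	for (i = 0; i < fdb->count; i++) {
		if (memcmp(fdb->entries[i].mac, dst_mac, ETH_ALEN) == 0)
			return fdb->entries[i].remote;
	}
	return TOY_FLOOD;
}
```

In this model, once a frame with inner source ff:ff:ff:ff:ff:ff has been
learned from remote 2, toy_fdb_lookup() returns 2 for every later broadcast
instead of TOY_FLOOD, which is the situation the ff:ff:ff:ff:ff:ff entry in
the quoted output describes.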

It's not clear to me why the VXLAN driver does not drop packets with an
invalid Ethernet source address, as the bridge driver does. See the
second check in br_handle_frame(). I would suggest this instead:

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 386cbe4d3327..936c47743318 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -1612,7 +1612,8 @@ static bool vxlan_set_mac(struct vxlan_dev *vxlan,
        skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
 
        /* Ignore packet loops (and multicast echo) */
-       if (ether_addr_equal(eth_hdr(skb)->h_source, vxlan->dev->dev_addr))
+       if (ether_addr_equal(eth_hdr(skb)->h_source, vxlan->dev->dev_addr) ||
+           !is_valid_ether_addr(eth_hdr(skb)->h_source))
                return false;
 
        /* Get address from the outer IP header */
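
The added condition relies on is_valid_ether_addr(), which should reject any
multicast address (broadcast included, since it has the group bit set) as
well as the all-zeros address. A rough userspace equivalent, using
hypothetical sketch_* names and no kernel headers, might look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6

/* Group (multicast) addresses have the lowest bit of the first octet set. */
static bool sketch_is_multicast(const uint8_t *addr)
{
	return (addr[0] & 0x01) != 0;
}

/* True only for 00:00:00:00:00:00. */
static bool sketch_is_zero(const uint8_t *addr)
{
	size_t i;

	for (i = 0; i < ETH_ALEN; i++) {
		if (addr[i] != 0)
			return false;
	}
	return true;
}

/*
 * Assumed semantics of a "valid" source address: neither multicast
 * (which covers ff:ff:ff:ff:ff:ff) nor all zeros.
 */
static bool sketch_is_valid_source(const uint8_t *addr)
{
	assert(addr != NULL);
	return !sketch_is_multicast(addr) && !sketch_is_zero(addr);
}
```

With a check of this kind in the receive path, a frame whose inner source is
ff:ff:ff:ff:ff:ff would be dropped before it reaches the learning code.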

But I also think it's worth trying to understand which application is
generating these packets on the other end as it might expose an even
bigger issue. You can try running something like this (ugly) bpftrace
script:

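#!/usr/bin/env bpftrace

// Count transmitters of frames whose inner source MAC is broadcast.
k:vxlan_xmit
{
        $skb = (struct sk_buff *) arg0;
        // On entry to vxlan_xmit, skb->data points at the inner Ethernet header
        $eth = (struct ethhdr *) $skb->data;

        if ($eth->h_source[0] == 0xff && $eth->h_source[1] == 0xff &&
            $eth->h_source[2] == 0xff && $eth->h_source[3] == 0xff &&
            $eth->h_source[4] == 0xff && $eth->h_source[5] == 0xff) {
                // Key by process name and kernel stack to identify the sender
                @[comm, kstack()] = count();
        }
}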
