Ethernet Bridge development
From: Nikolay Aleksandrov <razor@blackwall.org>
To: Ido Schimmel <idosch@nvidia.com>,
	netdev@vger.kernel.org, bridge@lists.linux-foundation.org
Cc: petrm@nvidia.com, mlxsw@nvidia.com, edumazet@google.com,
	roopa@nvidia.com, kuba@kernel.org, pabeni@redhat.com,
	davem@davemloft.net
Subject: Re: [Bridge] [PATCH net-next 09/11] vxlan: Add MDB data path support
Date: Tue, 14 Mar 2023 14:35:08 +0200
Message-ID: <e695257a-58e2-c676-95cd-77df5c2b68cf@blackwall.org>
In-Reply-To: <20230313145349.3557231-10-idosch@nvidia.com>

On 13/03/2023 16:53, Ido Schimmel wrote:
> Integrate MDB support into the Tx path of the VXLAN driver, allowing it
> to selectively forward IP multicast traffic according to the matched MDB
> entry.
> 
> If MDB entries are configured (i.e., 'VXLAN_F_MDB' is set) and the
> packet is an IP multicast packet, perform up to three different lookups
> according to the following priority:
> 
> 1. For an (S, G) entry, using {Source VNI, Source IP, Destination IP}.
> 2. For a (*, G) entry, using {Source VNI, Destination IP}.
> 3. For the catchall MDB entry (0.0.0.0 or ::), using the source VNI.
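
The lookup order listed above can be modelled in userspace roughly as
follows. This is only a sketch under assumptions: it is IPv4 only, keeps
addresses in host byte order, and uses a flat array where the driver uses
its own table. All names (struct mdb_key, mdb_lookup(), mdb_lookup_prio(),
is_link_local_mcast()) are hypothetical and are not the driver's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified MDB key. A zero 'src' means "any source"
 * and a zero 'group' marks the catchall entry.
 */
struct mdb_key {
	uint32_t vni;
	uint32_t src;
	uint32_t group;
};

struct mdb_entry {
	struct mdb_key key;
	const char *name;
};

/* Exact-match lookup over a flat table. */
static const struct mdb_entry *mdb_lookup(const struct mdb_entry *tbl,
					  size_t n, const struct mdb_key *key)
{
	assert(tbl || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].key.vni == key->vni &&
		    tbl[i].key.src == key->src &&
		    tbl[i].key.group == key->group)
			return &tbl[i];
	}
	return NULL;
}

/* 224.0.0.0/24, with the address in host byte order. */
static int is_link_local_mcast(uint32_t group)
{
	return (group & 0xffffff00u) == 0xe0000000u;
}

/* Try (S, G), then (*, G), then the catchall entry for the VNI. */
static const struct mdb_entry *mdb_lookup_prio(const struct mdb_entry *tbl,
					       size_t n, uint32_t vni,
					       uint32_t src, uint32_t group)
{
	struct mdb_key key = { .vni = vni, .src = src, .group = group };
	const struct mdb_entry *e;

	e = mdb_lookup(tbl, n, &key);		/* 1. (S, G) */
	if (e)
		return e;

	key.src = 0;
	e = mdb_lookup(tbl, n, &key);		/* 2. (*, G) */
	if (e)
		return e;

	/* Link-local groups never use the catchall MDB entry. */
	if (is_link_local_mcast(group))
		return NULL;

	key.group = 0;
	return mdb_lookup(tbl, n, &key);	/* 3. catchall */
}
```

A NULL result stands for "no MDB match", in which case the packet would be
forwarded according to the FDB.
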
> 
> The catchall MDB entry is similar to the catchall FDB entry
> (00:00:00:00:00:00) that is currently used to transmit BUM (broadcast,
> unknown unicast and multicast) traffic. However, unlike the catchall FDB
> entry, this entry is only used to transmit unregistered IP multicast
> traffic that is not link-local. Therefore, when configured, the catchall
> FDB entry will only transmit BULL (broadcast, unknown unicast,
> link-local multicast) traffic.
> 
> The catchall MDB entry is useful in deployments where inter-subnet
> multicast forwarding is used and not all the VTEPs in a tenant domain
> are members in all the broadcast domains. In such deployments it is
> advantageous to transmit BULL (broadcast, unknown unicast and link-local
> multicast) and unregistered IP multicast traffic on different tunnels.
> If the same tunnel was used, a VTEP only interested in IP multicast
> traffic would also pull all the BULL traffic and drop it as it is not a
> member in the originating broadcast domain [1].
> 
> If the packet did not match an MDB entry (or if the packet is not an IP
> multicast packet), return it to the Tx path, allowing it to be forwarded
> according to the FDB.
> 
> If the packet did match an MDB entry, forward it to the associated
> remote VTEPs. However, if the entry is a (*, G) entry and the associated
> remote is in INCLUDE mode, then skip over it as the source IP is not in
> its source list (otherwise the packet would have matched on an (S, G)
> entry). Similarly, if the associated remote is marked as BLOCKED (can
> only be set on (S, G) entries), then skip over it as well as the remote
> is in EXCLUDE mode and the source IP is in its source list.
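
The two skip rules in the paragraph above reduce to a small predicate.
A minimal sketch, assuming a simplified remote with just a filter mode and
a blocked flag; the enum, struct and function names are hypothetical and do
not come from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum filter_mode { MODE_INCLUDE, MODE_EXCLUDE };

/* Hypothetical, simplified view of a remote VTEP attached to an entry. */
struct remote {
	enum filter_mode mode;
	bool blocked;	/* only meaningful on (S, G) entries */
};

/* Return true if the packet must not be sent to this remote. */
static bool remote_should_skip(bool entry_is_star_g, const struct remote *r)
{
	assert(r != NULL);

	/* (*, G) match with an INCLUDE mode remote: the source is not in
	 * the remote's source list, otherwise an (S, G) entry would have
	 * matched first.
	 */
	if (entry_is_star_g && r->mode == MODE_INCLUDE)
		return true;

	/* BLOCKED remote: EXCLUDE mode and the source is in its list. */
	if (r->blocked)
		return true;

	return false;
}
```
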
> 
> [1] https://datatracker.ietf.org/doc/html/draft-ietf-bess-evpn-irb-mcast#section-2.6
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/vxlan/vxlan_core.c    |  15 ++++
>  drivers/net/vxlan/vxlan_mdb.c     | 114 ++++++++++++++++++++++++++++++
>  drivers/net/vxlan/vxlan_private.h |   6 ++
>  3 files changed, 135 insertions(+)
> 
[snip]
> diff --git a/drivers/net/vxlan/vxlan_mdb.c b/drivers/net/vxlan/vxlan_mdb.c
> index b32b1fb4a74a..ea63c5178718 100644
> --- a/drivers/net/vxlan/vxlan_mdb.c
> +++ b/drivers/net/vxlan/vxlan_mdb.c
> @@ -1298,6 +1298,120 @@ int vxlan_mdb_del(struct net_device *dev, struct nlattr *tb[],
>  	return err;
>  }
>  
> +struct vxlan_mdb_entry *vxlan_mdb_entry_skb_get(struct vxlan_dev *vxlan,
> +						struct sk_buff *skb,
> +						__be32 src_vni)
> +{
> +	struct vxlan_mdb_entry *mdb_entry;
> +	struct vxlan_mdb_entry_key group;
> +
> +	if (!is_multicast_ether_addr(eth_hdr(skb)->h_dest) ||
> +	    is_broadcast_ether_addr(eth_hdr(skb)->h_dest))
> +		return NULL;
> +
> +	/* When not in collect metadata mode, 'src_vni' is zero, but MDB
> +	 * entries are stored with the VNI of the VXLAN device.
> +	 */
> +	if (!(vxlan->cfg.flags & VXLAN_F_COLLECT_METADATA))
> +		src_vni = vxlan->default_dst.remote_vni;
> +
> +	memset(&group, 0, sizeof(group));
> +	group.vni = src_vni;
> +
> +	switch (ntohs(skb->protocol)) {

Drop the ntohs() and switch on skb->protocol directly, and..

> +	case ETH_P_IP:

htons(ETH_P_IP)

> +		if (!pskb_may_pull(skb, sizeof(struct iphdr)))
> +			return NULL;
> +		group.dst.sa.sa_family = AF_INET;
> +		group.dst.sin.sin_addr.s_addr = ip_hdr(skb)->daddr;
> +		group.src.sa.sa_family = AF_INET;
> +		group.src.sin.sin_addr.s_addr = ip_hdr(skb)->saddr;
> +		break;
> +#if IS_ENABLED(CONFIG_IPV6)
> +	case ETH_P_IPV6:

htons(ETH_P_IPV6)

> +		if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
> +			return NULL;
> +		group.dst.sa.sa_family = AF_INET6;
> +		group.dst.sin6.sin6_addr = ipv6_hdr(skb)->daddr;
> +		group.src.sa.sa_family = AF_INET6;
> +		group.src.sin6.sin6_addr = ipv6_hdr(skb)->saddr;
> +		break;
> +#endif
> +	default:
> +		return NULL;
> +	}
> +
> +	mdb_entry = vxlan_mdb_entry_lookup(vxlan, &group);
> +	if (mdb_entry)
> +		return mdb_entry;
> +
> +	memset(&group.src, 0, sizeof(group.src));
> +	mdb_entry = vxlan_mdb_entry_lookup(vxlan, &group);
> +	if (mdb_entry)
> +		return mdb_entry;
> +
> +	/* No (S, G) or (*, G) found. Look up the all-zeros entry, but only if
> +	 * the destination IP address is not link-local multicast since we want
> +	 * to transmit such traffic together with broadcast and unknown unicast
> +	 * traffic.
> +	 */
> +	switch (ntohs(skb->protocol)) {
> +	case ETH_P_IP:

ditto

> +		if (ipv4_is_local_multicast(group.dst.sin.sin_addr.s_addr))
> +			return NULL;
> +		group.dst.sin.sin_addr.s_addr = 0;
> +		break;
> +#if IS_ENABLED(CONFIG_IPV6)
> +	case ETH_P_IPV6:

ditto

> +		if (ipv6_addr_type(&group.dst.sin6.sin6_addr) &
> +		    IPV6_ADDR_LINKLOCAL)
> +			return NULL;
> +		memset(&group.dst.sin6.sin6_addr, 0,
> +		       sizeof(group.dst.sin6.sin6_addr));
> +		break;
> +#endif
> +	default:
> +		return NULL;
> +	}
> +
> +	return vxlan_mdb_entry_lookup(vxlan, &group);
> +}
> +
[snip]
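
To spell out the ntohs()/htons() comments above: skb->protocol is already
in network byte order, so byte-swapping the constants in the case labels
(done once at compile time) avoids converting the field for every packet.
In the kernel, htons() on a constant folds at compile time and can be used
in a case label. The userspace sketch below mimics that with a hypothetical
CONST_HTONS() macro, which assumes a GCC/Clang style __BYTE_ORDER__ and
falls back to little endian otherwise. The SK_ETH_P_* constants,
proto_from_wire() and classify() are illustration-only names as well.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SK_ETH_P_IP	0x0800
#define SK_ETH_P_IPV6	0x86DD

/* Compile-time host-to-network conversion for 16-bit constants. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define CONST_HTONS(x)	((uint16_t)(x))
#else
#define CONST_HTONS(x)	((uint16_t)((((x) & 0xffu) << 8) | (((x) >> 8) & 0xffu)))
#endif

enum l3_proto { L3_NONE, L3_IPV4, L3_IPV6 };

/* Load the EtherType exactly as it sits on the wire (network order). */
static uint16_t proto_from_wire(const uint8_t *wire)
{
	uint16_t proto;

	assert(wire != NULL);
	memcpy(&proto, wire, sizeof(proto));
	return proto;
}

/* 'protocol' is in network byte order and is never converted at run
 * time; only the case labels are swapped, by the compiler.
 */
static enum l3_proto classify(uint16_t protocol)
{
	switch (protocol) {
	case CONST_HTONS(SK_ETH_P_IP):
		return L3_IPV4;
	case CONST_HTONS(SK_ETH_P_IPV6):
		return L3_IPV6;
	default:
		return L3_NONE;
	}
}
```
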

