From: Jonathan Toppins <jtoppins@redhat.com>
To: Roopa Prabhu <roopa@cumulusnetworks.com>, netdev@vger.kernel.org
Cc: davem@davemloft.net, stephen@networkplumber.org,
nikolay@cumulusnetworks.com, tgraf@suug.ch,
hannes@stressinduktion.org, jbenc@redhat.com, pshelar@ovn.org,
dsa@cumulusnetworks.com, hadi@mojatatu.com
Subject: Re: [PATCH net-next 2/5] vxlan: support fdb and learning in COLLECT_METADATA mode
Date: Tue, 31 Jan 2017 18:37:20 -0500
Message-ID: <7e4844ec-df5e-6140-c2f7-281619616416@redhat.com>
In-Reply-To: <1485842235-32343-3-git-send-email-roopa@cumulusnetworks.com>

On 01/31/2017 12:57 AM, Roopa Prabhu wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> Vxlan COLLECT_METADATA mode today solves the per-vni netdev
> scalability problem in l3 networks. It expects all forwarding
> information to be present in dst_metadata. This patch series
> enhances collect metadata mode to include the case where only
> vni is present in dst_metadata, and the vxlan driver can then use
> the rest of the forwarding information database to make forwarding
> decisions. There is no change to default COLLECT_METADATA
> behaviour. These changes only apply to COLLECT_METADATA when
> used with the bridging use-case with a special dst_metadata
> tunnel info flag (eg: where vxlan device is part of a bridge).
> For all this to work, the vxlan driver will need to now support a
> single fdb table hashed by mac + vni. This series essentially makes
> this happen.
>
> use-case and workflow:
> vxlan collect metadata device participates in bridging vlan
> to vn-segments. Bridge driver above the vxlan device,
> sends the vni corresponding to the vlan in the dst_metadata.
> vxlan driver will lookup forwarding database with (mac + vni)
> for the required remote destination information to forward the
> packet.
>
> Changes introduced by this patch:
> - allow learning and forwarding database state in vxlan netdev in
> COLLECT_METADATA mode. Current behaviour is not changed
> by default. tunnel info flag IP_TUNNEL_INFO_BRIDGE is used
> to support the new bridge friendly mode.
> - A single fdb table hashed by (mac, vni) to allow fdb entries with
> multiple vnis in the same fdb table
> - rx path already has the vni
> - tx path expects a vni in the packet with dst_metadata
> - prior to this series, fdb remote_dsts carried remote vni and
> the vxlan device carrying the fdb table represented the
> source vni. With the vxlan device now representing multiple vnis,
> this patch adds a src vni attribute to the fdb entry. The remote
> vni already uses NDA_VNI attribute. This patch introduces
> NDA_SRC_VNI netlink attribute to represent the src vni in a multi
> vni fdb table.
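
To make the (mac, vni) keying described above concrete, here is a minimal
userspace sketch. It is not the driver code: struct fdb_entry, fdb_find()
and the flat array are hypothetical stand-ins for the kernel's
struct vxlan_fdb and its hash chains, and the remote is reduced to a
single IPv4 address for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for an fdb entry (not struct vxlan_fdb). */
struct fdb_entry {
	uint8_t  eth_addr[ETH_ALEN];
	uint32_t src_vni;	/* vni the mac was learnt on */
	uint32_t remote_ip;	/* remote destination, simplified to IPv4 */
};

/* Return the entry that matches both mac and src vni, or NULL if none. */
static const struct fdb_entry *
fdb_find(const struct fdb_entry *tbl, size_t n,
	 const uint8_t *mac, uint32_t vni)
{
	assert(mac != NULL);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].src_vni == vni &&
		    memcmp(tbl[i].eth_addr, mac, ETH_ALEN) == 0)
			return &tbl[i];
	}
	return NULL;
}
```

With this keying, the same mac can sit in one table once per src vni,
which matches the two src_vni entries on vxlan0 in the "after" output
of the commit message.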
>
> iproute2 example (patched and pruned iproute2 output to just show
> relevant fdb entries):
> example shows same host mac learnt on two vni's.
>
> before (netdev per vni):
> $bridge fdb show | grep "00:02:00:00:00:03"
> 00:02:00:00:00:03 dev vxlan1001 dst 12.0.0.8 self
> 00:02:00:00:00:03 dev vxlan1000 dst 12.0.0.8 self
>
> after this patch with collect metadata in bridged mode (single netdev):
> $bridge fdb show | grep "00:02:00:00:00:03"
> 00:02:00:00:00:03 dev vxlan0 src_vni 1001 dst 12.0.0.8 self
> 00:02:00:00:00:03 dev vxlan0 src_vni 1000 dst 12.0.0.8 self
>
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> ---
> drivers/net/vxlan.c | 211 +++++++++++++++++++++++++---------------
> include/uapi/linux/neighbour.h | 1 +
> 2 files changed, 136 insertions(+), 76 deletions(-)
>
> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index 19b1653..b80c405 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -57,6 +57,8 @@
>
> static const u8 all_zeros_mac[ETH_ALEN + 2];
>
> +static u32 fdb_salt __read_mostly;
> +
> static int vxlan_sock_add(struct vxlan_dev *vxlan);
>
> /* per-network namespace private data for this module */
> @@ -75,6 +77,7 @@ struct vxlan_fdb {
> struct list_head remotes;
> u8 eth_addr[ETH_ALEN];
> u16 state; /* see ndm_state */
> + __be32 vni;
> u8 flags; /* see ndm_flags */
> };
>
> @@ -302,6 +305,10 @@ static int vxlan_fdb_info(struct sk_buff *skb, struct vxlan_dev *vxlan,
> if (rdst->remote_vni != vxlan->default_dst.remote_vni &&
> nla_put_u32(skb, NDA_VNI, be32_to_cpu(rdst->remote_vni)))
> goto nla_put_failure;
> + if ((vxlan->flags & VXLAN_F_COLLECT_METADATA) && fdb->vni &&
> + nla_put_u32(skb, NDA_SRC_VNI,
> + be32_to_cpu(fdb->vni)))
> + goto nla_put_failure;
> if (rdst->remote_ifindex &&
> nla_put_u32(skb, NDA_IFINDEX, rdst->remote_ifindex))
> goto nla_put_failure;
> @@ -400,34 +407,51 @@ static u32 eth_hash(const unsigned char *addr)
> return hash_64(value, FDB_HASH_BITS);
> }
>
> +static u32 eth_vni_hash(const unsigned char *addr, __be32 vni)
> +{
> + /* use 1 byte of OUI and 3 bytes of NIC */
> + u32 key = get_unaligned((u32 *)(addr + 2));
> +
> + return jhash_2words(key, vni, fdb_salt) & (FDB_HASH_SIZE - 1);

I don't see where fdb_salt is ever set to anything. Why not just use a
constant zero here?

> +}
> +
> /* Hash chain to use given mac address */
> static inline struct hlist_head *vxlan_fdb_head(struct vxlan_dev *vxlan,
> - const u8 *mac)
> + const u8 *mac, __be32 vni)
> {
> - return &vxlan->fdb_head[eth_hash(mac)];
> + if (vxlan->flags & VXLAN_F_COLLECT_METADATA)
> + return &vxlan->fdb_head[eth_vni_hash(mac, vni)];
> + else
> + return &vxlan->fdb_head[eth_hash(mac)];
> }
>