All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Yoshiki Komachi <komachi.yoshiki@gmail.com>
Cc: brouer@redhat.com, "David S. Miller" <davem@davemloft.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	Jakub Kicinski <kuba@kernel.org>, Martin KaFai Lau <kafai@fb.com>,
	Song Liu <songliubraving@fb.com>, Yonghong Song <yhs@fb.com>,
	Andrii Nakryiko <andriin@fb.com>, KP Singh <kpsingh@chromium.org>,
	Roopa Prabhu <roopa@cumulusnetworks.com>,
	Nikolay Aleksandrov <nikolay@cumulusnetworks.com>,
	David Ahern <dsahern@kernel.org>,
	netdev@vger.kernel.org, bridge@lists.linux-foundation.org,
	bpf@vger.kernel.org
Subject: Re: [RFC PATCH bpf-next 3/3] samples/bpf: Add a simple bridge example accelerated with XDP
Date: Fri, 31 Jul 2020 16:15:19 +0200	[thread overview]
Message-ID: <20200731161519.5f413f82@carbon> (raw)
In-Reply-To: <1596170660-5582-4-git-send-email-komachi.yoshiki@gmail.com>


I really appreciate that you are working on adding this helper.
Some comments below.

On Fri, 31 Jul 2020 13:44:20 +0900
Yoshiki Komachi <komachi.yoshiki@gmail.com> wrote:

> diff --git a/samples/bpf/xdp_bridge_kern.c b/samples/bpf/xdp_bridge_kern.c
> new file mode 100644
> index 000000000000..00f802503199
> --- /dev/null
> +++ b/samples/bpf/xdp_bridge_kern.c
> @@ -0,0 +1,129 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2020 NTT Corp. All Rights Reserved.
> + *
[...]
> +
> +struct {
> +	__uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
> +	__uint(key_size, sizeof(int));
> +	__uint(value_size, sizeof(int));
> +	__uint(max_entries, 64);
> +} xdp_tx_ports SEC(".maps");
> +
> +static __always_inline int xdp_bridge_proto(struct xdp_md *ctx, u16 br_vlan_proto)
> +{
> +	void *data_end = (void *)(long)ctx->data_end;
> +	void *data = (void *)(long)ctx->data;
> +	struct bpf_fdb_lookup fdb_lookup_params;
> +	struct vlan_hdr *vlan_hdr = NULL;
> +	struct ethhdr *eth = data;
> +	u16 h_proto;
> +	u64 nh_off;
> +	int rc;
> +
> +	nh_off = sizeof(*eth);
> +	if (data + nh_off > data_end)
> +		return XDP_DROP;
> +
> +	__builtin_memset(&fdb_lookup_params, 0, sizeof(fdb_lookup_params));
> +
> +	h_proto = eth->h_proto;
> +
> +	if (unlikely(ntohs(h_proto) < ETH_P_802_3_MIN))
> +		return XDP_PASS;
> +
> +	/* Handle VLAN tagged packet */
> +	if (h_proto == br_vlan_proto) {
> +		vlan_hdr = (void *)eth + nh_off;
> +		nh_off += sizeof(*vlan_hdr);
> +		if ((void *)eth + nh_off > data_end)
> +			return XDP_PASS;
> +
> +		fdb_lookup_params.vlan_id = ntohs(vlan_hdr->h_vlan_TCI) &
> +					VLAN_VID_MASK;
> +	}
> +
> +	/* FIXME: Although Linux bridge provides us with vlan filtering (contains
> +	 * PVID) at ingress, the feature is currently unsupported in this XDP program.
> +	 *
> +	 * Two ideas to realize the vlan filtering are below:
> +	 *   1. usespace daemon monitors bridge vlan events and notifies XDP programs
                   ^^
Typo: usespace -> userspace

> +	 *      of them through BPF maps
> +	 *   2. introduce another bpf helper to retrieve bridge vlan information

The comment appears two times time this file.

> +	 *
> +	 *
> +	 * FIXME: After the vlan filtering, learning feature is required here, but
> +	 * it is currently unsupported as well. If another bpf helper for learning
> +	 * is accepted, the processing could be implemented in the future.
> +	 */
> +
> +	memcpy(&fdb_lookup_params.addr, eth->h_dest, ETH_ALEN);
> +
> +	/* Note: This program definitely takes ifindex of ingress interface as
> +	 * a bridge port. Linux networking devices can be stacked and physical
> +	 * interfaces are not necessarily slaves of bridges (e.g., bonding or
> +	 * vlan devices can be slaves of bridges), but stacked bridge ports are
> +	 * currently unsupported in this program. In such cases, XDP programs
> +	 * should be attached to a lower device in order to process packets with
> +	 * higher speed. Then, a new bpf helper to find upper devices will be
> +	 * required here in the future because they will be registered on FDB
> +	 * in the kernel.
> +	 */
> +	fdb_lookup_params.ifindex = ctx->ingress_ifindex;
> +
> +	rc = bpf_fdb_lookup(ctx, &fdb_lookup_params, sizeof(fdb_lookup_params), 0);
> +	if (rc != BPF_FDB_LKUP_RET_SUCCESS) {
> +		/* In cases of flooding, XDP_PASS will be returned here */
> +		return XDP_PASS;
> +	}
> +
> +	/* FIXME: Although Linux bridge provides us with vlan filtering (contains
> +	 * untagged policy) at egress as well, the feature is currently unsupported
> +	 * in this XDP program.
> +	 *
> +	 * Two ideas to realize the vlan filtering are below:
> +	 *   1. usespace daemon monitors bridge vlan events and notifies XDP programs
> +	 *      of them through BPF maps
> +	 *   2. introduce another bpf helper to retrieve bridge vlan information
> +	 */

(2nd time the comment appears)

> +

A comment about below bpf_redirect_map() would be good.  Explaining
that we depend on fallback behavior, to let normal bridge code handle
other cases (e.g. flood/broadcast). And also that if lookup fails,
XDP_PASS/fallback also happens.

> +	return bpf_redirect_map(&xdp_tx_ports, fdb_lookup_params.ifindex, XDP_PASS);
> +}
> +
> +SEC("xdp_bridge")
> +int xdp_bridge_prog(struct xdp_md *ctx)
> +{
> +	return xdp_bridge_proto(ctx, 0);
> +}
> +
> +SEC("xdp_8021q_bridge")
> +int xdp_8021q_bridge_prog(struct xdp_md *ctx)
> +{
> +	return xdp_bridge_proto(ctx, htons(ETH_P_8021Q));
> +}
> +
> +SEC("xdp_8021ad_bridge")
> +int xdp_8021ad_bridge_prog(struct xdp_md *ctx)
> +{
> +	return xdp_bridge_proto(ctx, htons(ETH_P_8021AD));
> +}
> +
> +char _license[] SEC("license") = "GPL";


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


WARNING: multiple messages have this Message-ID (diff)
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Yoshiki Komachi <komachi.yoshiki@gmail.com>
Cc: Song Liu <songliubraving@fb.com>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Nikolay Aleksandrov <nikolay@cumulusnetworks.com>,
	Yonghong Song <yhs@fb.com>,
	Roopa Prabhu <roopa@cumulusnetworks.com>,
	bridge@lists.linux-foundation.org,
	John Fastabend <john.fastabend@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, David Ahern <dsahern@kernel.org>,
	netdev@vger.kernel.org, brouer@redhat.com,
	KP Singh <kpsingh@chromium.org>, Jakub Kicinski <kuba@kernel.org>,
	bpf@vger.kernel.org, Andrii Nakryiko <andriin@fb.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: [Bridge] [RFC PATCH bpf-next 3/3] samples/bpf: Add a simple bridge example accelerated with XDP
Date: Fri, 31 Jul 2020 16:15:19 +0200	[thread overview]
Message-ID: <20200731161519.5f413f82@carbon> (raw)
In-Reply-To: <1596170660-5582-4-git-send-email-komachi.yoshiki@gmail.com>


I really appreciate that you are working on adding this helper.
Some comments below.

On Fri, 31 Jul 2020 13:44:20 +0900
Yoshiki Komachi <komachi.yoshiki@gmail.com> wrote:

> diff --git a/samples/bpf/xdp_bridge_kern.c b/samples/bpf/xdp_bridge_kern.c
> new file mode 100644
> index 000000000000..00f802503199
> --- /dev/null
> +++ b/samples/bpf/xdp_bridge_kern.c
> @@ -0,0 +1,129 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2020 NTT Corp. All Rights Reserved.
> + *
[...]
> +
> +struct {
> +	__uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
> +	__uint(key_size, sizeof(int));
> +	__uint(value_size, sizeof(int));
> +	__uint(max_entries, 64);
> +} xdp_tx_ports SEC(".maps");
> +
> +static __always_inline int xdp_bridge_proto(struct xdp_md *ctx, u16 br_vlan_proto)
> +{
> +	void *data_end = (void *)(long)ctx->data_end;
> +	void *data = (void *)(long)ctx->data;
> +	struct bpf_fdb_lookup fdb_lookup_params;
> +	struct vlan_hdr *vlan_hdr = NULL;
> +	struct ethhdr *eth = data;
> +	u16 h_proto;
> +	u64 nh_off;
> +	int rc;
> +
> +	nh_off = sizeof(*eth);
> +	if (data + nh_off > data_end)
> +		return XDP_DROP;
> +
> +	__builtin_memset(&fdb_lookup_params, 0, sizeof(fdb_lookup_params));
> +
> +	h_proto = eth->h_proto;
> +
> +	if (unlikely(ntohs(h_proto) < ETH_P_802_3_MIN))
> +		return XDP_PASS;
> +
> +	/* Handle VLAN tagged packet */
> +	if (h_proto == br_vlan_proto) {
> +		vlan_hdr = (void *)eth + nh_off;
> +		nh_off += sizeof(*vlan_hdr);
> +		if ((void *)eth + nh_off > data_end)
> +			return XDP_PASS;
> +
> +		fdb_lookup_params.vlan_id = ntohs(vlan_hdr->h_vlan_TCI) &
> +					VLAN_VID_MASK;
> +	}
> +
> +	/* FIXME: Although Linux bridge provides us with vlan filtering (contains
> +	 * PVID) at ingress, the feature is currently unsupported in this XDP program.
> +	 *
> +	 * Two ideas to realize the vlan filtering are below:
> +	 *   1. usespace daemon monitors bridge vlan events and notifies XDP programs
                   ^^
Typo: usespace -> userspace

> +	 *      of them through BPF maps
> +	 *   2. introduce another bpf helper to retrieve bridge vlan information

The comment appears two times time this file.

> +	 *
> +	 *
> +	 * FIXME: After the vlan filtering, learning feature is required here, but
> +	 * it is currently unsupported as well. If another bpf helper for learning
> +	 * is accepted, the processing could be implemented in the future.
> +	 */
> +
> +	memcpy(&fdb_lookup_params.addr, eth->h_dest, ETH_ALEN);
> +
> +	/* Note: This program definitely takes ifindex of ingress interface as
> +	 * a bridge port. Linux networking devices can be stacked and physical
> +	 * interfaces are not necessarily slaves of bridges (e.g., bonding or
> +	 * vlan devices can be slaves of bridges), but stacked bridge ports are
> +	 * currently unsupported in this program. In such cases, XDP programs
> +	 * should be attached to a lower device in order to process packets with
> +	 * higher speed. Then, a new bpf helper to find upper devices will be
> +	 * required here in the future because they will be registered on FDB
> +	 * in the kernel.
> +	 */
> +	fdb_lookup_params.ifindex = ctx->ingress_ifindex;
> +
> +	rc = bpf_fdb_lookup(ctx, &fdb_lookup_params, sizeof(fdb_lookup_params), 0);
> +	if (rc != BPF_FDB_LKUP_RET_SUCCESS) {
> +		/* In cases of flooding, XDP_PASS will be returned here */
> +		return XDP_PASS;
> +	}
> +
> +	/* FIXME: Although Linux bridge provides us with vlan filtering (contains
> +	 * untagged policy) at egress as well, the feature is currently unsupported
> +	 * in this XDP program.
> +	 *
> +	 * Two ideas to realize the vlan filtering are below:
> +	 *   1. usespace daemon monitors bridge vlan events and notifies XDP programs
> +	 *      of them through BPF maps
> +	 *   2. introduce another bpf helper to retrieve bridge vlan information
> +	 */

(2nd time the comment appears)

> +

A comment about below bpf_redirect_map() would be good.  Explaining
that we depend on fallback behavior, to let normal bridge code handle
other cases (e.g. flood/broadcast). And also that if lookup fails,
XDP_PASS/fallback also happens.

> +	return bpf_redirect_map(&xdp_tx_ports, fdb_lookup_params.ifindex, XDP_PASS);
> +}
> +
> +SEC("xdp_bridge")
> +int xdp_bridge_prog(struct xdp_md *ctx)
> +{
> +	return xdp_bridge_proto(ctx, 0);
> +}
> +
> +SEC("xdp_8021q_bridge")
> +int xdp_8021q_bridge_prog(struct xdp_md *ctx)
> +{
> +	return xdp_bridge_proto(ctx, htons(ETH_P_8021Q));
> +}
> +
> +SEC("xdp_8021ad_bridge")
> +int xdp_8021ad_bridge_prog(struct xdp_md *ctx)
> +{
> +	return xdp_bridge_proto(ctx, htons(ETH_P_8021AD));
> +}
> +
> +char _license[] SEC("license") = "GPL";


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


  reply	other threads:[~2020-07-31 14:15 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-31  4:44 [RFC PATCH bpf-next 0/3] Add a new bpf helper for FDB lookup Yoshiki Komachi
2020-07-31  4:44 ` [Bridge] " Yoshiki Komachi
2020-07-31  4:44 ` [RFC PATCH bpf-next 1/3] net/bridge: Add new function to access FDB from XDP programs Yoshiki Komachi
2020-07-31  4:44   ` [Bridge] " Yoshiki Komachi
2020-07-31  4:44 ` [RFC PATCH bpf-next 2/3] bpf: Add helper to do forwarding lookups in kernel FDB table Yoshiki Komachi
2020-07-31  4:44   ` [Bridge] " Yoshiki Komachi
2020-07-31 11:52   ` Maciej Fijalkowski
2020-07-31 11:52     ` [Bridge] " Maciej Fijalkowski
2020-08-04  8:44     ` Yoshiki Komachi
2020-08-04  8:44       ` [Bridge] " Yoshiki Komachi
2020-07-31 17:15   ` David Ahern
2020-07-31 17:15     ` [Bridge] " David Ahern
2020-08-04 11:27     ` Yoshiki Komachi
2020-08-04 11:27       ` [Bridge] " Yoshiki Komachi
2020-08-05 16:38       ` David Ahern
2020-08-05 16:38         ` [Bridge] " David Ahern
2020-08-07  8:06         ` Yoshiki Komachi
2020-08-07  8:06           ` [Bridge] " Yoshiki Komachi
2020-07-31 21:12   ` Daniel Borkmann
2020-07-31 21:12     ` [Bridge] " Daniel Borkmann
2020-08-05  4:45     ` Yoshiki Komachi
2020-08-05  4:45       ` [Bridge] " Yoshiki Komachi
2020-07-31  4:44 ` [RFC PATCH bpf-next 3/3] samples/bpf: Add a simple bridge example accelerated with XDP Yoshiki Komachi
2020-07-31  4:44   ` [Bridge] " Yoshiki Komachi
2020-07-31 14:15   ` Jesper Dangaard Brouer [this message]
2020-07-31 14:15     ` Jesper Dangaard Brouer
2020-08-04 10:08     ` Yoshiki Komachi
2020-08-04 10:08       ` [Bridge] " Yoshiki Komachi
2020-07-31 17:48   ` Andrii Nakryiko
2020-07-31 17:48     ` [Bridge] " Andrii Nakryiko
2020-08-04 10:35     ` Yoshiki Komachi
2020-08-04 10:35       ` [Bridge] " Yoshiki Komachi
2020-07-31 21:52 ` [RFC PATCH bpf-next 0/3] Add a new bpf helper for FDB lookup John Fastabend
2020-07-31 21:52   ` [Bridge] " John Fastabend
2020-08-05 10:26   ` Yoshiki Komachi
2020-08-05 10:26     ` [Bridge] " Yoshiki Komachi
2020-08-05 16:36     ` David Ahern
2020-08-05 16:36       ` [Bridge] " David Ahern
2020-08-07  8:30       ` Yoshiki Komachi
2020-08-07  8:30         ` [Bridge] " Yoshiki Komachi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200731161519.5f413f82@carbon \
    --to=brouer@redhat.com \
    --cc=andriin@fb.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=bridge@lists.linux-foundation.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=hawk@kernel.org \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=komachi.yoshiki@gmail.com \
    --cc=kpsingh@chromium.org \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=nikolay@cumulusnetworks.com \
    --cc=roopa@cumulusnetworks.com \
    --cc=songliubraving@fb.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.