* [RFC: add openvswitch actions using BPF 0/2]
@ 2015-02-04 22:48 Andy Zhou
[not found] ` <1423090122-19807-1-git-send-email-azhou-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 8+ messages in thread
From: Andy Zhou @ 2015-02-04 22:48 UTC (permalink / raw)
To: dev-yBygre7rU0SM8Zsap4Y0gw; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
Joe and I have been experimenting with BPF and its application for OVS.
This patch shows our attempt to implement ovs actions using eBPF.
The kernel changes are against the 'net-next'. The corresponding
user space changes will be post next.
This patch set implements an BPF action, that has the same interface
as current OVS output action. Instead of sending out a packet, it
only generates a kernel message.
This feature is neither complete nor useful as is. We are mostly
interested in comments on: The infrastructure changes to support
and running BPF functions, and suggestions on extensions beyond
those patches.
Andy Zhou (2):
BPF: add a new BPF program type BPF_PROG_TYPE_OPENVSWITCH
openvswitch: implements the BPF_PROG action in datapath
include/linux/bpf.h | 2 +-
include/uapi/linux/bpf.h | 1 +
include/uapi/linux/openvswitch.h | 29 ++++++++++++-
net/Makefile | 4 +-
net/openvswitch/Makefile | 2 +
net/openvswitch/actions.c | 30 +++++++++++++
net/openvswitch/bpf.c | 87 +++++++++++++++++++++++++++++++++++++
net/openvswitch/datapath.c | 6 ++-
net/openvswitch/flow_netlink.c | 92 +++++++++++++++++++++++++++++++++++++++-
net/openvswitch/flow_netlink.h | 8 ++++
10 files changed, 254 insertions(+), 7 deletions(-)
create mode 100644 net/openvswitch/bpf.c
--
1.9.1
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
^ permalink raw reply [flat|nested] 8+ messages in thread
* [RFC: add openvswitch actions using BPF 1/2] BPF: add a new BPF program type BPF_PROG_TYPE_OPENVSWITCH
[not found] ` <1423090122-19807-1-git-send-email-azhou-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
@ 2015-02-04 22:48 ` Andy Zhou
2015-02-05 14:48 ` [ovs-dev] " Thomas Graf
2015-02-04 22:48 ` [RFC: add openvswitch actions using BPF 2/2] openvswitch: implements the BPF_PROG action in datapath Andy Zhou
1 sibling, 1 reply; 8+ messages in thread
From: Andy Zhou @ 2015-02-04 22:48 UTC (permalink / raw)
To: dev-yBygre7rU0SM8Zsap4Y0gw; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
Add a new program type for openvswitch. Implements the BPF verifier
for the new type.
Signed-off-by: Andy Zhou <azhou@nicira.com>
---
include/linux/bpf.h | 2 +-
include/uapi/linux/bpf.h | 1 +
include/uapi/linux/openvswitch.h | 29 +++++++++++++-
net/Makefile | 4 +-
net/openvswitch/Makefile | 2 +
net/openvswitch/bpf.c | 87 ++++++++++++++++++++++++++++++++++++++++
6 files changed, 122 insertions(+), 3 deletions(-)
create mode 100644 net/openvswitch/bpf.c
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index bbfceb7..2e71cc2 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -99,7 +99,7 @@ enum bpf_access_type {
struct bpf_verifier_ops {
/* return eBPF function prototype for verification */
- const struct bpf_func_proto *(*get_func_proto)(enum bpf_func_id func_id);
+ const struct bpf_func_proto *(*get_func_proto)(int func_id);
/* return true if 'size' wide access at offset 'off' within bpf_context
* with 'type' (read or write) is allowed
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 45da7ec..a9a6b24 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -118,6 +118,7 @@ enum bpf_map_type {
enum bpf_prog_type {
BPF_PROG_TYPE_UNSPEC,
BPF_PROG_TYPE_SOCKET_FILTER,
+ BPF_PROG_TYPE_OPENVSWITCH,
};
/* flags for BPF_MAP_UPDATE_ELEM command */
diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index 7a8785a..929999c 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -1,6 +1,6 @@
/*
- * Copyright (c) 2007-2013 Nicira, Inc.
+ * Copyright (c) 2007-2015 Nicira, Inc.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of version 2 of the GNU General Public
@@ -569,6 +569,17 @@ struct ovs_action_push_vlan {
__be16 vlan_tci; /* 802.1Q TCI (VLAN ID and priority). */
};
+/**
+ * struct ovs_action_bpf_prog - %OVS_ACTION_ATTR_BPF_PROG action argument.
+ *
+ * XXX The argument size is fixed for now.
+ */
+struct ovs_action_bpf_prog {
+ __be32 prog_fd;
+ __be32 arg0;
+ __be32 arg1;
+};
+
/* Data path hash algorithm for computing Datapath hash.
*
* The algorithm type only specifies the fields in a flow
@@ -631,10 +642,26 @@ enum ovs_action_attr {
OVS_ACTION_ATTR_HASH, /* struct ovs_action_hash. */
OVS_ACTION_ATTR_PUSH_MPLS, /* struct ovs_action_push_mpls. */
OVS_ACTION_ATTR_POP_MPLS, /* __be16 ethertype. */
+ OVS_ACTION_ATTR_SET_MASKED, /* place holder */
+ OVS_ACTION_ATTR_BPF_PROG, /* strcut ovs_action_bpf_prog */
__OVS_ACTION_ATTR_MAX
};
#define OVS_ACTION_ATTR_MAX (__OVS_ACTION_ATTR_MAX - 1)
+/* integer value in 'imm' field of BPF_CALL instruction selects which OVS helper
+ * function eBPF program intends to call
+ */
+enum ovs_bpf_func_id {
+ OVS_BPF_FUNC_unspec,
+ OVS_BPF_FUNC_output, /* int ovs_bpf_output(ctxt) */
+ __OVS_BPF_FUNC_MAX_ID,
+};
+
+struct ovs_bpf_action_ctxt {
+ void *skb;
+ u32 arg0;
+ u32 arg1;
+};
#endif /* _LINUX_OPENVSWITCH_H */
diff --git a/net/Makefile b/net/Makefile
index 38704bd..7e92dea 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -67,7 +67,9 @@ obj-$(CONFIG_DNS_RESOLVER) += dns_resolver/
obj-$(CONFIG_CEPH_LIB) += ceph/
obj-$(CONFIG_BATMAN_ADV) += batman-adv/
obj-$(CONFIG_NFC) += nfc/
-obj-$(CONFIG_OPENVSWITCH) += openvswitch/
+ifneq ($(CONFIG_OPENVSWITCH),)
+obj-y += openvswitch/
+endif
obj-$(CONFIG_VSOCKETS) += vmw_vsock/
obj-$(CONFIG_NET_MPLS_GSO) += mpls/
obj-$(CONFIG_HSR) += hsr/
diff --git a/net/openvswitch/Makefile b/net/openvswitch/Makefile
index 91b9478..5a3a2b7 100644
--- a/net/openvswitch/Makefile
+++ b/net/openvswitch/Makefile
@@ -18,3 +18,5 @@ openvswitch-y := \
obj-$(CONFIG_OPENVSWITCH_GENEVE)+= vport-geneve.o
obj-$(CONFIG_OPENVSWITCH_VXLAN) += vport-vxlan.o
obj-$(CONFIG_OPENVSWITCH_GRE) += vport-gre.o
+
+obj-y += bpf.o
diff --git a/net/openvswitch/bpf.c b/net/openvswitch/bpf.c
new file mode 100644
index 0000000..8a33e93
--- /dev/null
+++ b/net/openvswitch/bpf.c
@@ -0,0 +1,87 @@
+/* Copyright (c) 2015 Nicira Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/err.h>
+#include <linux/bpf.h>
+#include <linux/openvswitch.h>
+#include <linux/skbuff.h>
+
+static u64 bpf_helper_output(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
+{
+ struct sk_buff *skb = (struct sk_buff *) (unsigned long) r1;
+ uint32_t port = (uint32_t) (unsigned long) r2;
+
+ printk("helper output %p to port %d\n", skb, port);
+ return 0;
+}
+
+struct bpf_func_proto bpf_helper_output_proto = {
+ .func = bpf_helper_output,
+ .gpl_only = true,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_ANYTHING, /* XXX from context */
+ .arg2_type = ARG_ANYTHING,
+ .arg3_type = ARG_ANYTHING,
+ .arg4_type = ARG_ANYTHING,
+};
+
+#define BPF_CONTEXT_ACCESS(CTXT, FIELD, RW) \
+ [offsetof(struct CTXT, FIELD)] = { \
+ FIELD_SIZEOF(struct CTXT, FIELD), \
+ RW \
+ }
+
+static const struct bpf_func_proto *ovs_func_proto(int func_id)
+{
+ switch (func_id) {
+ case OVS_BPF_FUNC_output:
+ return &bpf_helper_output_proto;
+ default:
+ return NULL;
+ }
+}
+
+static const struct bpf_context_access {
+ int size;
+ enum bpf_access_type type;
+} bpf_ctx_access[] = {
+ BPF_CONTEXT_ACCESS(ovs_bpf_action_ctxt, skb, BPF_READ),
+ BPF_CONTEXT_ACCESS(ovs_bpf_action_ctxt, arg0, BPF_READ),
+ BPF_CONTEXT_ACCESS(ovs_bpf_action_ctxt, arg1, BPF_READ)
+};
+
+static bool test_is_valid_access(int off, int size, enum bpf_access_type type)
+{
+ const struct bpf_context_access *access;
+
+ if (off < 0 || off >= ARRAY_SIZE(bpf_ctx_access))
+ return false;
+
+ access = &bpf_ctx_access[off];
+ if (access->size == size && (access->type & type))
+ return true;
+
+ return false;
+}
+
+static struct bpf_verifier_ops ovs_bpf_ops = {
+ .get_func_proto = ovs_func_proto,
+ .is_valid_access = test_is_valid_access,
+};
+
+static struct bpf_prog_type_list tl_prog = {
+ .ops = &ovs_bpf_ops,
+ .type = BPF_PROG_TYPE_OPENVSWITCH,
+};
+
+static int __init register_ovs_bpf_ops(void)
+{
+ bpf_register_prog_type(&tl_prog);
+ return 0;
+}
+late_initcall(register_ovs_bpf_ops);
--
1.9.1
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [RFC: add openvswitch actions using BPF 2/2] openvswitch: implements the BPF_PROG action in datapath
[not found] ` <1423090122-19807-1-git-send-email-azhou-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
2015-02-04 22:48 ` [RFC: add openvswitch actions using BPF 1/2] BPF: add a new BPF program type BPF_PROG_TYPE_OPENVSWITCH Andy Zhou
@ 2015-02-04 22:48 ` Andy Zhou
2015-02-05 15:07 ` [ovs-dev] " Thomas Graf
1 sibling, 1 reply; 8+ messages in thread
From: Andy Zhou @ 2015-02-04 22:48 UTC (permalink / raw)
To: dev-yBygre7rU0SM8Zsap4Y0gw; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
BPF_PROG action allows an action to be implemented in eBPF language and
downloaded by the userspace at runtime.
Signed-off-by: Andy Zhou <azhou@nicira.com>
---
net/openvswitch/actions.c | 30 ++++++++++++++
net/openvswitch/datapath.c | 6 ++-
net/openvswitch/flow_netlink.c | 92 +++++++++++++++++++++++++++++++++++++++++-
net/openvswitch/flow_netlink.h | 8 ++++
4 files changed, 132 insertions(+), 4 deletions(-)
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index b4cffe6..29e9171 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -38,8 +38,12 @@
#include "datapath.h"
#include "flow.h"
+#include "flow_netlink.h"
#include "vport.h"
+typedef int (*ovs_bpf_func_t)(const struct ovs_bpf_action_ctxt *,
+ const struct bpf_insn *);
+
static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
struct sw_flow_key *key,
const struct nlattr *attr, int len);
@@ -747,6 +751,28 @@ static int execute_recirc(struct datapath *dp, struct sk_buff *skb,
return 0;
}
+static int execute_bpf(struct sk_buff *skb, struct sw_flow_key *key,
+ const struct nlattr *a)
+{
+ struct ovs_action_bpf_runtime *rt;
+ struct bpf_prog *prog;
+ struct ovs_bpf_action_ctxt ctxt;
+ ovs_bpf_func_t ovs_bpf_func;
+ int err;
+
+ rt = nla_data(a);
+ prog = rt->prog;
+
+ /* Build the BPF program runtime context. */
+ ctxt.skb = (void *)skb;
+ ctxt.arg0 = rt->arg0;
+ ctxt.arg1 = rt->arg1;
+
+ ovs_bpf_func = (ovs_bpf_func_t)(prog->bpf_func);
+ err = ovs_bpf_func(&ctxt, prog->insnsi);
+ return err;
+}
+
/* Execute a list of actions against 'skb'. */
static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
struct sw_flow_key *key,
@@ -814,6 +840,10 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
}
break;
+ case OVS_ACTION_ATTR_BPF_PROG:
+ err = execute_bpf(skb, key, a);
+ break;
+
case OVS_ACTION_ATTR_SET:
err = execute_set_action(skb, key, nla_data(a));
break;
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index ae5e77c..810a0bf 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -699,6 +699,8 @@ static size_t ovs_flow_cmd_msg_size(const struct sw_flow_actions *acts,
len += nla_total_size(ovs_key_attr_size());
/* OVS_FLOW_ATTR_ACTIONS */
+ /* XXX this logic needs to be fixed to accommodate BPF_PROG action
+ * will expand the run time action size. */
if (should_fill_actions(ufid_flags))
len += nla_total_size(acts->actions_len);
@@ -1017,7 +1019,7 @@ err_unlock_ovs:
ovs_unlock();
kfree_skb(reply);
err_kfree_acts:
- kfree(acts);
+ free_flow_actions(acts);
err_kfree_flow:
ovs_flow_free(new_flow, false);
error:
@@ -1152,7 +1154,7 @@ err_unlock_ovs:
ovs_unlock();
kfree_skb(reply);
err_kfree_acts:
- kfree(acts);
+ free_flow_actions(acts);
error:
return error;
}
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index 8b9a612..466e85f 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -42,6 +42,7 @@
#include <linux/icmp.h>
#include <linux/icmpv6.h>
#include <linux/rculist.h>
+#include <linux/bpf.h>
#include <net/geneve.h>
#include <net/ip.h>
#include <net/ipv6.h>
@@ -1546,11 +1547,38 @@ static struct sw_flow_actions *nla_alloc_flow_actions(int size, bool log)
return sfa;
}
+void free_flow_actions(struct sw_flow_actions *sf_acts)
+{
+ const struct nlattr *a;
+ int rem;
+ struct ovs_action_bpf_runtime *rt;
+
+ nla_for_each_attr(a, sf_acts->actions, sf_acts->actions_len, rem) {
+ int type = nla_type(a);
+
+ switch (type) {
+ case OVS_ACTION_ATTR_BPF_PROG:
+ rt = nla_data(a);
+ bpf_prog_put(rt->prog);
+ break;
+ }
+ }
+
+ kfree(sf_acts);
+}
+
+static void free_flow_actions_rcu(struct rcu_head *head)
+{
+ struct sw_flow_actions *acts = container_of(head, struct sw_flow_actions, rcu);
+
+ free_flow_actions(acts);
+}
+
/* Schedules 'sf_acts' to be freed after the next RCU grace period.
* The caller must hold rcu_read_lock for this to be sensible. */
void ovs_nla_free_flow_actions(struct sw_flow_actions *sf_acts)
{
- kfree_rcu(sf_acts, rcu);
+ call_rcu(&sf_acts->rcu, free_flow_actions_rcu);
}
static struct nlattr *reserve_sfa_size(struct sw_flow_actions **sfa,
@@ -1695,6 +1723,37 @@ static int validate_and_copy_sample(const struct nlattr *attr,
return 0;
}
+static int validate_and_copy_bpf(const struct nlattr *attr,
+ struct sw_flow_actions **sfa,
+ bool log)
+{
+ const struct ovs_action_bpf_prog *act_bpf = nla_data(attr);
+ struct ovs_action_bpf_runtime rt;
+ u32 fd;
+ int err;
+
+ fd = ntohl(act_bpf->prog_fd);
+ rt.prog = bpf_prog_get(fd);
+ if (!rt.prog)
+ return -EINVAL;
+
+ if (rt.prog->aux->prog_type != BPF_PROG_TYPE_OPENVSWITCH) {
+ bpf_prog_put(rt.prog);
+ return -EINVAL;
+ }
+
+ rt.fd = fd;
+ rt.arg0 = ntohl(act_bpf->arg0);
+ rt.arg1 = ntohl(act_bpf->arg1);
+
+ /* Validation done. Rewrite the action with runtime datastructure. */
+ err = add_action(sfa, OVS_ACTION_ATTR_BPF_PROG, &rt, sizeof(rt), log);
+ if (err)
+ return err;
+
+ return 0;
+}
+
static int validate_tp_port(const struct sw_flow_key *flow_key,
__be16 eth_type)
{
@@ -1966,7 +2025,9 @@ static int __ovs_nla_copy_actions(const struct nlattr *attr,
[OVS_ACTION_ATTR_POP_VLAN] = 0,
[OVS_ACTION_ATTR_SET] = (u32)-1,
[OVS_ACTION_ATTR_SAMPLE] = (u32)-1,
- [OVS_ACTION_ATTR_HASH] = sizeof(struct ovs_action_hash)
+ [OVS_ACTION_ATTR_HASH] = sizeof(struct ovs_action_hash),
+ [OVS_ACTION_ATTR_BPF_PROG] =
+ sizeof(struct ovs_action_bpf_prog),
};
const struct ovs_action_push_vlan *vlan;
int type = nla_type(a);
@@ -2073,6 +2134,13 @@ static int __ovs_nla_copy_actions(const struct nlattr *attr,
skip_copy = true;
break;
+ case OVS_ACTION_ATTR_BPF_PROG:
+ err = validate_and_copy_bpf(a, sfa, log);
+ if (err)
+ return err;
+ skip_copy = true;
+ break;
+
default:
OVS_NLERR(log, "Unknown Action type %d", type);
return -EINVAL;
@@ -2177,6 +2245,21 @@ static int set_action_to_attr(const struct nlattr *a, struct sk_buff *skb)
return 0;
}
+static int bpf_action_to_attr(const struct nlattr *a, struct sk_buff *skb)
+{
+ const struct ovs_action_bpf_runtime *rt = nla_data(a);
+ struct ovs_action_bpf_prog prog;
+
+ prog.prog_fd = htonl(rt->fd);
+ prog.arg0 = htonl(rt->arg0);
+ prog.arg1 = htonl(rt->arg1);
+
+ if (nla_put(skb, OVS_ACTION_ATTR_BPF_PROG, sizeof(prog), &prog));
+ return -EMSGSIZE;
+
+ return 0;
+}
+
int ovs_nla_put_actions(const struct nlattr *attr, int len, struct sk_buff *skb)
{
const struct nlattr *a;
@@ -2197,6 +2280,11 @@ int ovs_nla_put_actions(const struct nlattr *attr, int len, struct sk_buff *skb)
if (err)
return err;
break;
+
+ case OVS_ACTION_ATTR_BPF_PROG:
+ err = bpf_action_to_attr(a, skb);
+ break;
+
default:
if (nla_put(skb, type, nla_len(a), nla_data(a)))
return -EMSGSIZE;
diff --git a/net/openvswitch/flow_netlink.h b/net/openvswitch/flow_netlink.h
index 5c3d75b..e986ddb 100644
--- a/net/openvswitch/flow_netlink.h
+++ b/net/openvswitch/flow_netlink.h
@@ -68,6 +68,14 @@ int ovs_nla_copy_actions(const struct nlattr *attr,
int ovs_nla_put_actions(const struct nlattr *attr,
int len, struct sk_buff *skb);
+void free_flow_actions(struct sw_flow_actions *);
void ovs_nla_free_flow_actions(struct sw_flow_actions *);
+struct ovs_action_bpf_runtime {
+ uint32_t fd;
+ uint32_t arg0;
+ uint32_t arg1;
+ struct bpf_prog *prog;
+};
+
#endif /* flow_netlink.h */
--
1.9.1
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [ovs-dev] [RFC: add openvswitch actions using BPF 1/2] BPF: add a new BPF program type BPF_PROG_TYPE_OPENVSWITCH
2015-02-04 22:48 ` [RFC: add openvswitch actions using BPF 1/2] BPF: add a new BPF program type BPF_PROG_TYPE_OPENVSWITCH Andy Zhou
@ 2015-02-05 14:48 ` Thomas Graf
2015-02-05 19:47 ` Andy Zhou
0 siblings, 1 reply; 8+ messages in thread
From: Thomas Graf @ 2015-02-05 14:48 UTC (permalink / raw)
To: Andy Zhou; +Cc: dev, netdev
On 02/04/15 at 02:48pm, Andy Zhou wrote:
> struct bpf_verifier_ops {
> /* return eBPF function prototype for verification */
> - const struct bpf_func_proto *(*get_func_proto)(enum bpf_func_id func_id);
> + const struct bpf_func_proto *(*get_func_proto)(int func_id);
This change should maybe go in a separate commit.
> +static const struct bpf_func_proto *ovs_func_proto(int func_id)
> +{
> + switch (func_id) {
> + case OVS_BPF_FUNC_output:
> + return &bpf_helper_output_proto;
> + default:
> + return NULL;
> + }
> +}
You'd still want to use the map helpers so it seems like we should
change the bpf verified to verify against both a global and type
specific list unless we want to add all the map helpers to
ovs_func_proto as well.
> +static bool test_is_valid_access(int off, int size, enum bpf_access_type type)
> +{
> + const struct bpf_context_access *access;
> +
> + if (off < 0 || off >= ARRAY_SIZE(bpf_ctx_access))
> + return false;
> +
> + access = &bpf_ctx_access[off];
> + if (access->size == size && (access->type & type))
> + return true;
> +
> + return false;
> +}
OK. I see why you kept ctxt simple at first ;-)
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [ovs-dev] [RFC: add openvswitch actions using BPF 2/2] openvswitch: implements the BPF_PROG action in datapath
2015-02-04 22:48 ` [RFC: add openvswitch actions using BPF 2/2] openvswitch: implements the BPF_PROG action in datapath Andy Zhou
@ 2015-02-05 15:07 ` Thomas Graf
2015-02-05 19:34 ` Andy Zhou
0 siblings, 1 reply; 8+ messages in thread
From: Thomas Graf @ 2015-02-05 15:07 UTC (permalink / raw)
To: Andy Zhou; +Cc: dev, netdev
On 02/04/15 at 02:48pm, Andy Zhou wrote:
> BPF_PROG action allows an action to be implemented in eBPF language and
> downloaded by the userspace at runtime.
>
> Signed-off-by: Andy Zhou <azhou@nicira.com>
Thanks a lot for putting this together Andy and Joe and everybody else
who was involved. This is much further than what I expected as a first
step.
One slight open from my side is the avoidance of versioning help of
helpers. We want to avoid v2, v3, ... helpers if the need should arise
to extend an existing helper.
I think it should be doable with BPF to have the verifier accept if
a helper is called with less args than expected, to initialize those
to 0. This would allow for helpers to support additional arguments.
I think this needs to be documented and expectations should be clear.
Other than that I'm very very happy with where this is going.
It seems very doable to also allow for a BPF prog to be registered
upon flow table miss.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [ovs-dev] [RFC: add openvswitch actions using BPF 2/2] openvswitch: implements the BPF_PROG action in datapath
2015-02-05 15:07 ` [ovs-dev] " Thomas Graf
@ 2015-02-05 19:34 ` Andy Zhou
0 siblings, 0 replies; 8+ messages in thread
From: Andy Zhou @ 2015-02-05 19:34 UTC (permalink / raw)
To: Thomas Graf; +Cc: dev@openvswitch.com, netdev@vger.kernel.org
On Thu, Feb 5, 2015 at 7:07 AM, Thomas Graf <tgraf@noironetworks.com> wrote:
> On 02/04/15 at 02:48pm, Andy Zhou wrote:
>> BPF_PROG action allows an action to be implemented in eBPF language and
>> downloaded by the userspace at runtime.
>>
>> Signed-off-by: Andy Zhou <azhou@nicira.com>
>
> Thanks a lot for putting this together Andy and Joe and everybody else
> who was involved. This is much further than what I expected as a first
> step.
>
> One slight open from my side is the avoidance of versioning help of
> helpers. We want to avoid v2, v3, ... helpers if the need should arise
> to extend an existing helper.
I share the concern. Addressing this upfront is a good idea. On the other hand
I am not sure if this is completely avoidable at a reasonable cost.
>
> I think it should be doable with BPF to have the verifier accept if
> a helper is called with less args than expected, to initialize those
> to 0. This would allow for helpers to support additional arguments.
I am not sure it is fundamentally better than V2, v3... On the other hand,
I agree this looks technically possible.
>
> I think this needs to be documented and expectations should be clear.
> Other than that I'm very very happy with where this is going.
Agreed,
> It seems very doable to also allow for a BPF prog to be registered
> upon flow table miss.
Yes, this is possible, but at the cost of flow set up rate. It may
still be practical
with some optimization, such as caching action lists so we don't
regenerate or download
the same program.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [ovs-dev] [RFC: add openvswitch actions using BPF 1/2] BPF: add a new BPF program type BPF_PROG_TYPE_OPENVSWITCH
2015-02-05 14:48 ` [ovs-dev] " Thomas Graf
@ 2015-02-05 19:47 ` Andy Zhou
0 siblings, 0 replies; 8+ messages in thread
From: Andy Zhou @ 2015-02-05 19:47 UTC (permalink / raw)
To: Thomas Graf; +Cc: dev@openvswitch.com, netdev@vger.kernel.org
On Thu, Feb 5, 2015 at 6:48 AM, Thomas Graf <tgraf@noironetworks.com> wrote:
> On 02/04/15 at 02:48pm, Andy Zhou wrote:
>> struct bpf_verifier_ops {
>> /* return eBPF function prototype for verification */
>> - const struct bpf_func_proto *(*get_func_proto)(enum bpf_func_id func_id);
>> + const struct bpf_func_proto *(*get_func_proto)(int func_id);
>
> This change should maybe go in a separate commit.
Agreed. Ideally, each program type should have its own func_id space. Current
registration does not allow this. Should this be added?
In the interest of reusing common helper functions, should we also consider
partition the func_id space into common ones and program type specific ones?
For example, the prink like function can be specified the common func_id space.
>
>> +static const struct bpf_func_proto *ovs_func_proto(int func_id)
>> +{
>> + switch (func_id) {
>> + case OVS_BPF_FUNC_output:
>> + return &bpf_helper_output_proto;
>> + default:
>> + return NULL;
>> + }
>> +}
>
> You'd still want to use the map helpers so it seems like we should
> change the bpf verified to verify against both a global and type
> specific list unless we want to add all the map helpers to
> ovs_func_proto as well.
It is not clear what OVS can benefit from the map helpers. Map provides
a fixed sized array. OVS data structure, such as flow, are more
dynamic and non-contiguous in memory.
Is extending BPF to allow for passing those dynamic objects a reasonable
direction?
>
>> +static bool test_is_valid_access(int off, int size, enum bpf_access_type type)
>> +{
>> + const struct bpf_context_access *access;
>> +
>> + if (off < 0 || off >= ARRAY_SIZE(bpf_ctx_access))
>> + return false;
>> +
>> + access = &bpf_ctx_access[off];
>> + if (access->size == size && (access->type & type))
>> + return true;
>> +
>> + return false;
>> +}
>
> OK. I see why you kept ctxt simple at first ;-)
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [ovs-dev] [RFC: add openvswitch actions using BPF 1/2] BPF: add a new BPF program type BPF_PROG_TYPE_OPENVSWITCH
@ 2015-02-05 20:30 Alexei Starovoitov
0 siblings, 0 replies; 8+ messages in thread
From: Alexei Starovoitov @ 2015-02-05 20:30 UTC (permalink / raw)
To: Andy Zhou; +Cc: Thomas Graf, dev@openvswitch.com, netdev@vger.kernel.org
On Thu, Feb 5, 2015 at 11:47 AM, Andy Zhou <azhou@nicira.com> wrote:
> On Thu, Feb 5, 2015 at 6:48 AM, Thomas Graf <tgraf@noironetworks.com> wrote:
>> On 02/04/15 at 02:48pm, Andy Zhou wrote:
>>> struct bpf_verifier_ops {
>>> /* return eBPF function prototype for verification */
>>> - const struct bpf_func_proto *(*get_func_proto)(enum bpf_func_id func_id);
>>> + const struct bpf_func_proto *(*get_func_proto)(int func_id);
>>
>> This change should maybe go in a separate commit.
> Agreed. Ideally, each program type should have its own func_id space. Current
> registration does not allow this. Should this be added?
>
> In the interest of reusing common helper functions, should we also consider
> partition the func_id space into common ones and program type specific ones?
> For example, the prink like function can be specified the common func_id space.
I don't like per-program_type func_ids, since overlapping ids is
a debugging headache and it encourages hiding ids instead of
clearly adding them to uapi's enum bpf_func_id.
per-program type ids also make it impossible to share common helper ids
which is already the case between socket and tracing prog types.
Having one place and one enum for all ids helps to keep
exposed user helpers under control.
>>> +static const struct bpf_func_proto *ovs_func_proto(int func_id)
>>> +{
>>> + switch (func_id) {
>>> + case OVS_BPF_FUNC_output:
>>> + return &bpf_helper_output_proto;
>>> + default:
>>> + return NULL;
>>> + }
>>> +}
>>
>> You'd still want to use the map helpers so it seems like we should
>> change the bpf verified to verify against both a global and type
>> specific list unless we want to add all the map helpers to
>> ovs_func_proto as well.
>
> It is not clear what OVS can benefit from the map helpers. Map provides
> a fixed sized array. OVS data structure, such as flow, are more
> dynamic and non-contiguous in memory.
that's a misunderstanding.
maps can be different types including rhashtable type.
Actually I was waiting for rhashtable to stabilize
before adding it as new map type ;)
There is 'max_entries' limit that must be there for safety reasons,
but it doesn't mean that all entries are always allocated at
init time and never change.
contiguous vs non-contiguous is also internal detail of map
implementation.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2015-02-05 20:30 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-02-04 22:48 [RFC: add openvswitch actions using BPF 0/2] Andy Zhou
[not found] ` <1423090122-19807-1-git-send-email-azhou-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
2015-02-04 22:48 ` [RFC: add openvswitch actions using BPF 1/2] BPF: add a new BPF program type BPF_PROG_TYPE_OPENVSWITCH Andy Zhou
2015-02-05 14:48 ` [ovs-dev] " Thomas Graf
2015-02-05 19:47 ` Andy Zhou
2015-02-04 22:48 ` [RFC: add openvswitch actions using BPF 2/2] openvswitch: implements the BPF_PROG action in datapath Andy Zhou
2015-02-05 15:07 ` [ovs-dev] " Thomas Graf
2015-02-05 19:34 ` Andy Zhou
-- strict thread matches above, loose matches on Subject: below --
2015-02-05 20:30 [ovs-dev] [RFC: add openvswitch actions using BPF 1/2] BPF: add a new BPF program type BPF_PROG_TYPE_OPENVSWITCH Alexei Starovoitov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).