* [GIT net-next] Open vSwitch
@ 2012-09-04 19:14 Jesse Gross
       [not found] ` <1346786049-3100-1-git-send-email-jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 46+ messages in thread
From: Jesse Gross @ 2012-09-04 19:14 UTC
To: David Miller; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

Two feature additions to Open vSwitch for net-next/3.7.

The following changes since commit 0d7614f09c1ebdbaa1599a5aba7593f147bf96ee:

  Linux 3.6-rc1 (2012-08-02 16:38:10 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master

for you to fetch changes up to 15eac2a74277bc7de68a7c2a64a7c91b4b6f5961:

  openvswitch: Increase maximum number of datapath ports. (2012-09-03 19:20:49 -0700)

----------------------------------------------------------------
Pravin B Shelar (2):
      openvswitch: Add support for network namespaces.
      openvswitch: Increase maximum number of datapath ports.

 net/openvswitch/actions.c            |    2 +-
 net/openvswitch/datapath.c           |  375 +++++++++++++++++++++-------------
 net/openvswitch/datapath.h           |   50 ++++-
 net/openvswitch/dp_notify.c          |    8 +-
 net/openvswitch/flow.c               |   11 +-
 net/openvswitch/flow.h               |    3 +-
 net/openvswitch/vport-internal_dev.c |    7 +-
 net/openvswitch/vport-netdev.c       |    2 +-
 net/openvswitch/vport.c              |   23 ++-
 net/openvswitch/vport.h              |    7 +-
 10 files changed, 317 insertions(+), 171 deletions(-)
[parent not found: <1346786049-3100-1-git-send-email-jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>]
* [PATCH net-next 1/2] openvswitch: Add support for network namespaces. [not found] ` <1346786049-3100-1-git-send-email-jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> @ 2012-09-04 19:14 ` Jesse Gross 2012-09-04 19:14 ` [PATCH net-next 2/2] openvswitch: Increase maximum number of datapath ports Jesse Gross 2012-09-04 19:26 ` [GIT net-next] Open vSwitch David Miller 2 siblings, 0 replies; 46+ messages in thread From: Jesse Gross @ 2012-09-04 19:14 UTC (permalink / raw) To: David Miller; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA From: Pravin B Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> Following patch adds support for network namespace to openvswitch. Since it must release devices when namespaces are destroyed, a side effect of this patch is that the module no longer keeps a refcount but instead cleans up any state when it is unloaded. Signed-off-by: Pravin B Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> Signed-off-by: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> --- net/openvswitch/datapath.c | 269 ++++++++++++++++++++-------------- net/openvswitch/datapath.h | 19 ++- net/openvswitch/dp_notify.c | 8 +- net/openvswitch/vport-internal_dev.c | 7 +- net/openvswitch/vport-netdev.c | 2 +- net/openvswitch/vport.c | 22 ++- net/openvswitch/vport.h | 3 +- 7 files changed, 207 insertions(+), 123 deletions(-) diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c index d8277d2..cad39fc 100644 --- a/net/openvswitch/datapath.c +++ b/net/openvswitch/datapath.c @@ -49,12 +49,29 @@ #include <linux/dmi.h> #include <linux/workqueue.h> #include <net/genetlink.h> +#include <net/net_namespace.h> +#include <net/netns/generic.h> #include "datapath.h" #include "flow.h" #include "vport-internal_dev.h" /** + * struct ovs_net - Per net-namespace data for ovs. + * @dps: List of datapaths to enable dumping them all out. + * Protected by genl_mutex. 
+ */ +struct ovs_net { + struct list_head dps; +}; + +static int ovs_net_id __read_mostly; + +#define REHASH_FLOW_INTERVAL (10 * 60 * HZ) +static void rehash_flow_table(struct work_struct *work); +static DECLARE_DELAYED_WORK(rehash_flow_wq, rehash_flow_table); + +/** * DOC: Locking: * * Writes to device state (add/remove datapath, port, set operations on vports, @@ -71,29 +88,21 @@ * each other. */ -/* Global list of datapaths to enable dumping them all out. - * Protected by genl_mutex. - */ -static LIST_HEAD(dps); - -#define REHASH_FLOW_INTERVAL (10 * 60 * HZ) -static void rehash_flow_table(struct work_struct *work); -static DECLARE_DELAYED_WORK(rehash_flow_wq, rehash_flow_table); - static struct vport *new_vport(const struct vport_parms *); -static int queue_gso_packets(int dp_ifindex, struct sk_buff *, +static int queue_gso_packets(struct net *, int dp_ifindex, struct sk_buff *, const struct dp_upcall_info *); -static int queue_userspace_packet(int dp_ifindex, struct sk_buff *, +static int queue_userspace_packet(struct net *, int dp_ifindex, + struct sk_buff *, const struct dp_upcall_info *); /* Must be called with rcu_read_lock, genl_mutex, or RTNL lock. 
*/ -static struct datapath *get_dp(int dp_ifindex) +static struct datapath *get_dp(struct net *net, int dp_ifindex) { struct datapath *dp = NULL; struct net_device *dev; rcu_read_lock(); - dev = dev_get_by_index_rcu(&init_net, dp_ifindex); + dev = dev_get_by_index_rcu(net, dp_ifindex); if (dev) { struct vport *vport = ovs_internal_dev_get_vport(dev); if (vport) @@ -135,6 +144,7 @@ static void destroy_dp_rcu(struct rcu_head *rcu) ovs_flow_tbl_destroy((__force struct flow_table *)dp->table); free_percpu(dp->stats_percpu); + release_net(ovs_dp_get_net(dp)); kfree(dp); } @@ -220,11 +230,12 @@ static struct genl_family dp_packet_genl_family = { .hdrsize = sizeof(struct ovs_header), .name = OVS_PACKET_FAMILY, .version = OVS_PACKET_VERSION, - .maxattr = OVS_PACKET_ATTR_MAX + .maxattr = OVS_PACKET_ATTR_MAX, + .netnsok = true }; int ovs_dp_upcall(struct datapath *dp, struct sk_buff *skb, - const struct dp_upcall_info *upcall_info) + const struct dp_upcall_info *upcall_info) { struct dp_stats_percpu *stats; int dp_ifindex; @@ -242,9 +253,9 @@ int ovs_dp_upcall(struct datapath *dp, struct sk_buff *skb, } if (!skb_is_gso(skb)) - err = queue_userspace_packet(dp_ifindex, skb, upcall_info); + err = queue_userspace_packet(ovs_dp_get_net(dp), dp_ifindex, skb, upcall_info); else - err = queue_gso_packets(dp_ifindex, skb, upcall_info); + err = queue_gso_packets(ovs_dp_get_net(dp), dp_ifindex, skb, upcall_info); if (err) goto err; @@ -260,7 +271,8 @@ err: return err; } -static int queue_gso_packets(int dp_ifindex, struct sk_buff *skb, +static int queue_gso_packets(struct net *net, int dp_ifindex, + struct sk_buff *skb, const struct dp_upcall_info *upcall_info) { unsigned short gso_type = skb_shinfo(skb)->gso_type; @@ -276,7 +288,7 @@ static int queue_gso_packets(int dp_ifindex, struct sk_buff *skb, /* Queue all of the segments. 
*/ skb = segs; do { - err = queue_userspace_packet(dp_ifindex, skb, upcall_info); + err = queue_userspace_packet(net, dp_ifindex, skb, upcall_info); if (err) break; @@ -306,7 +318,8 @@ static int queue_gso_packets(int dp_ifindex, struct sk_buff *skb, return err; } -static int queue_userspace_packet(int dp_ifindex, struct sk_buff *skb, +static int queue_userspace_packet(struct net *net, int dp_ifindex, + struct sk_buff *skb, const struct dp_upcall_info *upcall_info) { struct ovs_header *upcall; @@ -362,7 +375,7 @@ static int queue_userspace_packet(int dp_ifindex, struct sk_buff *skb, skb_copy_and_csum_dev(skb, nla_data(nla)); - err = genlmsg_unicast(&init_net, user_skb, upcall_info->pid); + err = genlmsg_unicast(net, user_skb, upcall_info->pid); out: kfree_skb(nskb); @@ -370,15 +383,10 @@ out: } /* Called with genl_mutex. */ -static int flush_flows(int dp_ifindex) +static int flush_flows(struct datapath *dp) { struct flow_table *old_table; struct flow_table *new_table; - struct datapath *dp; - - dp = get_dp(dp_ifindex); - if (!dp) - return -ENODEV; old_table = genl_dereference(dp->table); new_table = ovs_flow_tbl_alloc(TBL_MIN_BUCKETS); @@ -668,7 +676,7 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info) packet->priority = flow->key.phy.priority; rcu_read_lock(); - dp = get_dp(ovs_header->dp_ifindex); + dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex); err = -ENODEV; if (!dp) goto err_unlock; @@ -742,7 +750,8 @@ static struct genl_family dp_flow_genl_family = { .hdrsize = sizeof(struct ovs_header), .name = OVS_FLOW_FAMILY, .version = OVS_FLOW_VERSION, - .maxattr = OVS_FLOW_ATTR_MAX + .maxattr = OVS_FLOW_ATTR_MAX, + .netnsok = true }; static struct genl_multicast_group ovs_dp_flow_multicast_group = { @@ -894,7 +903,7 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, struct genl_info *info) goto error; } - dp = get_dp(ovs_header->dp_ifindex); + dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex); error = -ENODEV; if 
(!dp) goto error; @@ -995,7 +1004,7 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, struct genl_info *info) ovs_dp_flow_multicast_group.id, info->nlhdr, GFP_KERNEL); else - netlink_set_err(init_net.genl_sock, 0, + netlink_set_err(sock_net(skb->sk)->genl_sock, 0, ovs_dp_flow_multicast_group.id, PTR_ERR(reply)); return 0; @@ -1023,7 +1032,7 @@ static int ovs_flow_cmd_get(struct sk_buff *skb, struct genl_info *info) if (err) return err; - dp = get_dp(ovs_header->dp_ifindex); + dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex); if (!dp) return -ENODEV; @@ -1052,16 +1061,17 @@ static int ovs_flow_cmd_del(struct sk_buff *skb, struct genl_info *info) int err; int key_len; + dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex); + if (!dp) + return -ENODEV; + if (!a[OVS_FLOW_ATTR_KEY]) - return flush_flows(ovs_header->dp_ifindex); + return flush_flows(dp); + err = ovs_flow_from_nlattrs(&key, &key_len, a[OVS_FLOW_ATTR_KEY]); if (err) return err; - dp = get_dp(ovs_header->dp_ifindex); - if (!dp) - return -ENODEV; - table = genl_dereference(dp->table); flow = ovs_flow_tbl_lookup(table, &key, key_len); if (!flow) @@ -1090,7 +1100,7 @@ static int ovs_flow_cmd_dump(struct sk_buff *skb, struct netlink_callback *cb) struct datapath *dp; struct flow_table *table; - dp = get_dp(ovs_header->dp_ifindex); + dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex); if (!dp) return -ENODEV; @@ -1152,7 +1162,8 @@ static struct genl_family dp_datapath_genl_family = { .hdrsize = sizeof(struct ovs_header), .name = OVS_DATAPATH_FAMILY, .version = OVS_DATAPATH_VERSION, - .maxattr = OVS_DP_ATTR_MAX + .maxattr = OVS_DP_ATTR_MAX, + .netnsok = true }; static struct genl_multicast_group ovs_dp_datapath_multicast_group = { @@ -1210,18 +1221,19 @@ static struct sk_buff *ovs_dp_cmd_build_info(struct datapath *dp, u32 pid, } /* Called with genl_mutex and optionally with RTNL lock also. 
*/ -static struct datapath *lookup_datapath(struct ovs_header *ovs_header, +static struct datapath *lookup_datapath(struct net *net, + struct ovs_header *ovs_header, struct nlattr *a[OVS_DP_ATTR_MAX + 1]) { struct datapath *dp; if (!a[OVS_DP_ATTR_NAME]) - dp = get_dp(ovs_header->dp_ifindex); + dp = get_dp(net, ovs_header->dp_ifindex); else { struct vport *vport; rcu_read_lock(); - vport = ovs_vport_locate(nla_data(a[OVS_DP_ATTR_NAME])); + vport = ovs_vport_locate(net, nla_data(a[OVS_DP_ATTR_NAME])); dp = vport && vport->port_no == OVSP_LOCAL ? vport->dp : NULL; rcu_read_unlock(); } @@ -1235,6 +1247,7 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info) struct sk_buff *reply; struct datapath *dp; struct vport *vport; + struct ovs_net *ovs_net; int err; err = -EINVAL; @@ -1242,15 +1255,14 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info) goto err; rtnl_lock(); - err = -ENODEV; - if (!try_module_get(THIS_MODULE)) - goto err_unlock_rtnl; err = -ENOMEM; dp = kzalloc(sizeof(*dp), GFP_KERNEL); if (dp == NULL) - goto err_put_module; + goto err_unlock_rtnl; + INIT_LIST_HEAD(&dp->port_list); + ovs_dp_set_net(dp, hold_net(sock_net(skb->sk))); /* Allocate table. */ err = -ENOMEM; @@ -1287,7 +1299,8 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info) if (IS_ERR(reply)) goto err_destroy_local_port; - list_add_tail(&dp->list_node, &dps); + ovs_net = net_generic(ovs_dp_get_net(dp), ovs_net_id); + list_add_tail(&dp->list_node, &ovs_net->dps); rtnl_unlock(); genl_notify(reply, genl_info_net(info), info->snd_pid, @@ -1302,34 +1315,20 @@ err_destroy_percpu: err_destroy_table: ovs_flow_tbl_destroy(genl_dereference(dp->table)); err_free_dp: + release_net(ovs_dp_get_net(dp)); kfree(dp); -err_put_module: - module_put(THIS_MODULE); err_unlock_rtnl: rtnl_unlock(); err: return err; } -static int ovs_dp_cmd_del(struct sk_buff *skb, struct genl_info *info) +/* Called with genl_mutex. 
*/ +static void __dp_destroy(struct datapath *dp) { struct vport *vport, *next_vport; - struct sk_buff *reply; - struct datapath *dp; - int err; rtnl_lock(); - dp = lookup_datapath(info->userhdr, info->attrs); - err = PTR_ERR(dp); - if (IS_ERR(dp)) - goto exit_unlock; - - reply = ovs_dp_cmd_build_info(dp, info->snd_pid, - info->snd_seq, OVS_DP_CMD_DEL); - err = PTR_ERR(reply); - if (IS_ERR(reply)) - goto exit_unlock; - list_for_each_entry_safe(vport, next_vport, &dp->port_list, node) if (vport->port_no != OVSP_LOCAL) ovs_dp_detach_port(vport); @@ -1345,17 +1344,32 @@ static int ovs_dp_cmd_del(struct sk_buff *skb, struct genl_info *info) rtnl_unlock(); call_rcu(&dp->rcu, destroy_dp_rcu); - module_put(THIS_MODULE); +} + +static int ovs_dp_cmd_del(struct sk_buff *skb, struct genl_info *info) +{ + struct sk_buff *reply; + struct datapath *dp; + int err; + + dp = lookup_datapath(sock_net(skb->sk), info->userhdr, info->attrs); + err = PTR_ERR(dp); + if (IS_ERR(dp)) + return err; + + reply = ovs_dp_cmd_build_info(dp, info->snd_pid, + info->snd_seq, OVS_DP_CMD_DEL); + err = PTR_ERR(reply); + if (IS_ERR(reply)) + return err; + + __dp_destroy(dp); genl_notify(reply, genl_info_net(info), info->snd_pid, ovs_dp_datapath_multicast_group.id, info->nlhdr, GFP_KERNEL); return 0; - -exit_unlock: - rtnl_unlock(); - return err; } static int ovs_dp_cmd_set(struct sk_buff *skb, struct genl_info *info) @@ -1364,7 +1378,7 @@ static int ovs_dp_cmd_set(struct sk_buff *skb, struct genl_info *info) struct datapath *dp; int err; - dp = lookup_datapath(info->userhdr, info->attrs); + dp = lookup_datapath(sock_net(skb->sk), info->userhdr, info->attrs); if (IS_ERR(dp)) return PTR_ERR(dp); @@ -1372,7 +1386,7 @@ static int ovs_dp_cmd_set(struct sk_buff *skb, struct genl_info *info) info->snd_seq, OVS_DP_CMD_NEW); if (IS_ERR(reply)) { err = PTR_ERR(reply); - netlink_set_err(init_net.genl_sock, 0, + netlink_set_err(sock_net(skb->sk)->genl_sock, 0, ovs_dp_datapath_multicast_group.id, err); return 0; } 
@@ -1389,7 +1403,7 @@ static int ovs_dp_cmd_get(struct sk_buff *skb, struct genl_info *info) struct sk_buff *reply; struct datapath *dp; - dp = lookup_datapath(info->userhdr, info->attrs); + dp = lookup_datapath(sock_net(skb->sk), info->userhdr, info->attrs); if (IS_ERR(dp)) return PTR_ERR(dp); @@ -1403,11 +1417,12 @@ static int ovs_dp_cmd_get(struct sk_buff *skb, struct genl_info *info) static int ovs_dp_cmd_dump(struct sk_buff *skb, struct netlink_callback *cb) { + struct ovs_net *ovs_net = net_generic(sock_net(skb->sk), ovs_net_id); struct datapath *dp; int skip = cb->args[0]; int i = 0; - list_for_each_entry(dp, &dps, list_node) { + list_for_each_entry(dp, &ovs_net->dps, list_node) { if (i >= skip && ovs_dp_cmd_fill_info(dp, skb, NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq, NLM_F_MULTI, @@ -1459,7 +1474,8 @@ static struct genl_family dp_vport_genl_family = { .hdrsize = sizeof(struct ovs_header), .name = OVS_VPORT_FAMILY, .version = OVS_VPORT_VERSION, - .maxattr = OVS_VPORT_ATTR_MAX + .maxattr = OVS_VPORT_ATTR_MAX, + .netnsok = true }; struct genl_multicast_group ovs_dp_vport_multicast_group = { @@ -1525,14 +1541,15 @@ struct sk_buff *ovs_vport_cmd_build_info(struct vport *vport, u32 pid, } /* Called with RTNL lock or RCU read lock. 
*/ -static struct vport *lookup_vport(struct ovs_header *ovs_header, +static struct vport *lookup_vport(struct net *net, + struct ovs_header *ovs_header, struct nlattr *a[OVS_VPORT_ATTR_MAX + 1]) { struct datapath *dp; struct vport *vport; if (a[OVS_VPORT_ATTR_NAME]) { - vport = ovs_vport_locate(nla_data(a[OVS_VPORT_ATTR_NAME])); + vport = ovs_vport_locate(net, nla_data(a[OVS_VPORT_ATTR_NAME])); if (!vport) return ERR_PTR(-ENODEV); if (ovs_header->dp_ifindex && @@ -1545,7 +1562,7 @@ static struct vport *lookup_vport(struct ovs_header *ovs_header, if (port_no >= DP_MAX_PORTS) return ERR_PTR(-EFBIG); - dp = get_dp(ovs_header->dp_ifindex); + dp = get_dp(net, ovs_header->dp_ifindex); if (!dp) return ERR_PTR(-ENODEV); @@ -1574,7 +1591,7 @@ static int ovs_vport_cmd_new(struct sk_buff *skb, struct genl_info *info) goto exit; rtnl_lock(); - dp = get_dp(ovs_header->dp_ifindex); + dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex); err = -ENODEV; if (!dp) goto exit_unlock; @@ -1638,7 +1655,7 @@ static int ovs_vport_cmd_set(struct sk_buff *skb, struct genl_info *info) int err; rtnl_lock(); - vport = lookup_vport(info->userhdr, a); + vport = lookup_vport(sock_net(skb->sk), info->userhdr, a); err = PTR_ERR(vport); if (IS_ERR(vport)) goto exit_unlock; @@ -1658,7 +1675,7 @@ static int ovs_vport_cmd_set(struct sk_buff *skb, struct genl_info *info) reply = ovs_vport_cmd_build_info(vport, info->snd_pid, info->snd_seq, OVS_VPORT_CMD_NEW); if (IS_ERR(reply)) { - netlink_set_err(init_net.genl_sock, 0, + netlink_set_err(sock_net(skb->sk)->genl_sock, 0, ovs_dp_vport_multicast_group.id, PTR_ERR(reply)); goto exit_unlock; } @@ -1679,7 +1696,7 @@ static int ovs_vport_cmd_del(struct sk_buff *skb, struct genl_info *info) int err; rtnl_lock(); - vport = lookup_vport(info->userhdr, a); + vport = lookup_vport(sock_net(skb->sk), info->userhdr, a); err = PTR_ERR(vport); if (IS_ERR(vport)) goto exit_unlock; @@ -1714,7 +1731,7 @@ static int ovs_vport_cmd_get(struct sk_buff *skb, struct genl_info 
*info) int err; rcu_read_lock(); - vport = lookup_vport(ovs_header, a); + vport = lookup_vport(sock_net(skb->sk), ovs_header, a); err = PTR_ERR(vport); if (IS_ERR(vport)) goto exit_unlock; @@ -1741,7 +1758,7 @@ static int ovs_vport_cmd_dump(struct sk_buff *skb, struct netlink_callback *cb) u32 port_no; int retval; - dp = get_dp(ovs_header->dp_ifindex); + dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex); if (!dp) return -ENODEV; @@ -1766,28 +1783,6 @@ static int ovs_vport_cmd_dump(struct sk_buff *skb, struct netlink_callback *cb) return retval; } -static void rehash_flow_table(struct work_struct *work) -{ - struct datapath *dp; - - genl_lock(); - - list_for_each_entry(dp, &dps, list_node) { - struct flow_table *old_table = genl_dereference(dp->table); - struct flow_table *new_table; - - new_table = ovs_flow_tbl_rehash(old_table); - if (!IS_ERR(new_table)) { - rcu_assign_pointer(dp->table, new_table); - ovs_flow_tbl_deferred_destroy(old_table); - } - } - - genl_unlock(); - - schedule_delayed_work(&rehash_flow_wq, REHASH_FLOW_INTERVAL); -} - static struct genl_ops dp_vport_genl_ops[] = { { .cmd = OVS_VPORT_CMD_NEW, .flags = GENL_ADMIN_PERM, /* Requires CAP_NET_ADMIN privilege. 
*/ @@ -1872,6 +1867,59 @@ error: return err; } +static void rehash_flow_table(struct work_struct *work) +{ + struct datapath *dp; + struct net *net; + + genl_lock(); + rtnl_lock(); + for_each_net(net) { + struct ovs_net *ovs_net = net_generic(net, ovs_net_id); + + list_for_each_entry(dp, &ovs_net->dps, list_node) { + struct flow_table *old_table = genl_dereference(dp->table); + struct flow_table *new_table; + + new_table = ovs_flow_tbl_rehash(old_table); + if (!IS_ERR(new_table)) { + rcu_assign_pointer(dp->table, new_table); + ovs_flow_tbl_deferred_destroy(old_table); + } + } + } + rtnl_unlock(); + genl_unlock(); + + schedule_delayed_work(&rehash_flow_wq, REHASH_FLOW_INTERVAL); +} + +static int __net_init ovs_init_net(struct net *net) +{ + struct ovs_net *ovs_net = net_generic(net, ovs_net_id); + + INIT_LIST_HEAD(&ovs_net->dps); + return 0; +} + +static void __net_exit ovs_exit_net(struct net *net) +{ + struct ovs_net *ovs_net = net_generic(net, ovs_net_id); + struct datapath *dp, *dp_next; + + genl_lock(); + list_for_each_entry_safe(dp, dp_next, &ovs_net->dps, list_node) + __dp_destroy(dp); + genl_unlock(); +} + +static struct pernet_operations ovs_net_ops = { + .init = ovs_init_net, + .exit = ovs_exit_net, + .id = &ovs_net_id, + .size = sizeof(struct ovs_net), +}; + static int __init dp_init(void) { struct sk_buff *dummy_skb; @@ -1889,10 +1937,14 @@ static int __init dp_init(void) if (err) goto error_flow_exit; - err = register_netdevice_notifier(&ovs_dp_device_notifier); + err = register_pernet_device(&ovs_net_ops); if (err) goto error_vport_exit; + err = register_netdevice_notifier(&ovs_dp_device_notifier); + if (err) + goto error_netns_exit; + err = dp_register_genl(); if (err < 0) goto error_unreg_notifier; @@ -1903,6 +1955,8 @@ static int __init dp_init(void) error_unreg_notifier: unregister_netdevice_notifier(&ovs_dp_device_notifier); +error_netns_exit: + unregister_pernet_device(&ovs_net_ops); error_vport_exit: ovs_vport_exit(); error_flow_exit: @@ -1914,9 
+1968,10 @@ error: static void dp_cleanup(void) { cancel_delayed_work_sync(&rehash_flow_wq); - rcu_barrier(); dp_unregister_genl(ARRAY_SIZE(dp_genl_families)); unregister_netdevice_notifier(&ovs_dp_device_notifier); + unregister_pernet_device(&ovs_net_ops); + rcu_barrier(); ovs_vport_exit(); ovs_flow_exit(); } diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h index c1105c1..771c11e 100644 --- a/net/openvswitch/datapath.h +++ b/net/openvswitch/datapath.h @@ -27,8 +27,7 @@ #include <linux/u64_stats_sync.h> #include "flow.h" - -struct vport; +#include "vport.h" #define DP_MAX_PORTS 1024 #define SAMPLE_ACTION_DEPTH 3 @@ -63,6 +62,7 @@ struct dp_stats_percpu { * @port_list: List of all ports in @ports in arbitrary order. RTNL required * to iterate or modify. * @stats_percpu: Per-CPU datapath statistics. + * @net: Reference to net namespace. * * Context: See the comment on locking at the top of datapath.c for additional * locking information. @@ -80,6 +80,11 @@ struct datapath { /* Stats. */ struct dp_stats_percpu __percpu *stats_percpu; + +#ifdef CONFIG_NET_NS + /* Network namespace ref. 
*/ + struct net *net; +#endif }; /** @@ -108,6 +113,16 @@ struct dp_upcall_info { u32 pid; }; +static inline struct net *ovs_dp_get_net(struct datapath *dp) +{ + return read_pnet(&dp->net); +} + +static inline void ovs_dp_set_net(struct datapath *dp, struct net *net) +{ + write_pnet(&dp->net, net); +} + extern struct notifier_block ovs_dp_device_notifier; extern struct genl_multicast_group ovs_dp_vport_multicast_group; diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c index 36dcee8..5558350 100644 --- a/net/openvswitch/dp_notify.c +++ b/net/openvswitch/dp_notify.c @@ -41,19 +41,21 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event, case NETDEV_UNREGISTER: if (!ovs_is_internal_dev(dev)) { struct sk_buff *notify; + struct datapath *dp = vport->dp; notify = ovs_vport_cmd_build_info(vport, 0, 0, OVS_VPORT_CMD_DEL); ovs_dp_detach_port(vport); if (IS_ERR(notify)) { - netlink_set_err(init_net.genl_sock, 0, + netlink_set_err(ovs_dp_get_net(dp)->genl_sock, 0, ovs_dp_vport_multicast_group.id, PTR_ERR(notify)); break; } - genlmsg_multicast(notify, 0, ovs_dp_vport_multicast_group.id, - GFP_KERNEL); + genlmsg_multicast_netns(ovs_dp_get_net(dp), notify, 0, + ovs_dp_vport_multicast_group.id, + GFP_KERNEL); } break; } diff --git a/net/openvswitch/vport-internal_dev.c b/net/openvswitch/vport-internal_dev.c index 4061b9e..5d460c3 100644 --- a/net/openvswitch/vport-internal_dev.c +++ b/net/openvswitch/vport-internal_dev.c @@ -144,7 +144,7 @@ static void do_setup(struct net_device *netdev) netdev->tx_queue_len = 0; netdev->features = NETIF_F_LLTX | NETIF_F_SG | NETIF_F_FRAGLIST | - NETIF_F_HIGHDMA | NETIF_F_HW_CSUM | NETIF_F_TSO; + NETIF_F_HIGHDMA | NETIF_F_HW_CSUM | NETIF_F_TSO; netdev->vlan_features = netdev->features; netdev->features |= NETIF_F_HW_VLAN_TX; @@ -175,9 +175,14 @@ static struct vport *internal_dev_create(const struct vport_parms *parms) goto error_free_vport; } + dev_net_set(netdev_vport->dev, 
ovs_dp_get_net(vport->dp)); internal_dev = internal_dev_priv(netdev_vport->dev); internal_dev->vport = vport; + /* Restrict bridge port to current netns. */ + if (vport->port_no == OVSP_LOCAL) + netdev_vport->dev->features |= NETIF_F_NETNS_LOCAL; + err = register_netdevice(netdev_vport->dev); if (err) goto error_free_netdev; diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c index 6ea3551..3c1e58b 100644 --- a/net/openvswitch/vport-netdev.c +++ b/net/openvswitch/vport-netdev.c @@ -83,7 +83,7 @@ static struct vport *netdev_create(const struct vport_parms *parms) netdev_vport = netdev_vport_priv(vport); - netdev_vport->dev = dev_get_by_name(&init_net, parms->name); + netdev_vport->dev = dev_get_by_name(ovs_dp_get_net(vport->dp), parms->name); if (!netdev_vport->dev) { err = -ENODEV; goto error_free_vport; diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c index 6140336..9873ace 100644 --- a/net/openvswitch/vport.c +++ b/net/openvswitch/vport.c @@ -16,10 +16,10 @@ * 02110-1301, USA */ -#include <linux/dcache.h> #include <linux/etherdevice.h> #include <linux/if.h> #include <linux/if_vlan.h> +#include <linux/jhash.h> #include <linux/kernel.h> #include <linux/list.h> #include <linux/mutex.h> @@ -27,7 +27,9 @@ #include <linux/rcupdate.h> #include <linux/rtnetlink.h> #include <linux/compat.h> +#include <net/net_namespace.h> +#include "datapath.h" #include "vport.h" #include "vport-internal_dev.h" @@ -67,9 +69,9 @@ void ovs_vport_exit(void) kfree(dev_table); } -static struct hlist_head *hash_bucket(const char *name) +static struct hlist_head *hash_bucket(struct net *net, const char *name) { - unsigned int hash = full_name_hash(name, strlen(name)); + unsigned int hash = jhash(name, strlen(name), (unsigned long) net); return &dev_table[hash & (VPORT_HASH_BUCKETS - 1)]; } @@ -80,14 +82,15 @@ static struct hlist_head *hash_bucket(const char *name) * * Must be called with RTNL or RCU read lock. 
*/ -struct vport *ovs_vport_locate(const char *name) +struct vport *ovs_vport_locate(struct net *net, const char *name) { - struct hlist_head *bucket = hash_bucket(name); + struct hlist_head *bucket = hash_bucket(net, name); struct vport *vport; struct hlist_node *node; hlist_for_each_entry_rcu(vport, node, bucket, hash_node) - if (!strcmp(name, vport->ops->get_name(vport))) + if (!strcmp(name, vport->ops->get_name(vport)) && + net_eq(ovs_dp_get_net(vport->dp), net)) return vport; return NULL; @@ -170,14 +173,17 @@ struct vport *ovs_vport_add(const struct vport_parms *parms) for (i = 0; i < ARRAY_SIZE(vport_ops_list); i++) { if (vport_ops_list[i]->type == parms->type) { + struct hlist_head *bucket; + vport = vport_ops_list[i]->create(parms); if (IS_ERR(vport)) { err = PTR_ERR(vport); goto out; } - hlist_add_head_rcu(&vport->hash_node, - hash_bucket(vport->ops->get_name(vport))); + bucket = hash_bucket(ovs_dp_get_net(vport->dp), + vport->ops->get_name(vport)); + hlist_add_head_rcu(&vport->hash_node, bucket); return vport; } } diff --git a/net/openvswitch/vport.h b/net/openvswitch/vport.h index aac680c..97cef08 100644 --- a/net/openvswitch/vport.h +++ b/net/openvswitch/vport.h @@ -20,6 +20,7 @@ #define VPORT_H 1 #include <linux/list.h> +#include <linux/netlink.h> #include <linux/openvswitch.h> #include <linux/skbuff.h> #include <linux/spinlock.h> @@ -38,7 +39,7 @@ void ovs_vport_exit(void); struct vport *ovs_vport_add(const struct vport_parms *); void ovs_vport_del(struct vport *); -struct vport *ovs_vport_locate(const char *name); +struct vport *ovs_vport_locate(struct net *net, const char *name); void ovs_vport_get_stats(struct vport *, struct ovs_vport_stats *); -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 46+ messages in thread
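For readers following patch 1/2, here is a minimal userspace sketch of the namespace-aware name lookup it adds in vport.c. It is an illustration under stated assumptions, not the kernel code: `struct net` is a stand-in type, an FNV-1a hash seeded with the namespace pointer takes the place of the kernel's jhash(), a plain singly linked chain takes the place of the RCU-protected hlist, and the names vport_add() and vport_locate() are chosen for this example only. What it keeps from the patch is the idea that the namespace seeds the bucket choice and that a match requires both the name and the namespace to be equal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VPORT_HASH_BUCKETS 1024	/* power of two, so masking selects a bucket */

/* Stand-in for the kernel's struct net (assumption for this sketch). */
struct net {
	int id;
};

/* Simplified vport: only the fields the lookup needs. */
struct vport {
	const char *name;
	const struct net *net;
	struct vport *next;	/* bucket chain; the kernel uses an hlist */
};

static struct vport *dev_table[VPORT_HASH_BUCKETS];

/* FNV-1a over the name, seeded with the namespace pointer.
 * This replaces jhash() purely for illustration. */
static uint32_t hash_name(const struct net *net, const char *name)
{
	uint32_t hash = 2166136261u ^ (uint32_t)(uintptr_t)net;

	for (; *name; name++) {
		hash ^= (unsigned char)*name;
		hash *= 16777619u;
	}
	return hash;
}

static struct vport **hash_bucket(const struct net *net, const char *name)
{
	return &dev_table[hash_name(net, name) & (VPORT_HASH_BUCKETS - 1)];
}

/* Insert at the head of the bucket chosen by (namespace, name). */
static void vport_add(struct vport *vport)
{
	struct vport **bucket;

	assert(vport && vport->name && vport->net);
	bucket = hash_bucket(vport->net, vport->name);
	vport->next = *bucket;
	*bucket = vport;
}

/* A port matches only if both its name and its namespace match. */
static struct vport *vport_locate(const struct net *net, const char *name)
{
	struct vport *vport;

	for (vport = *hash_bucket(net, name); vport; vport = vport->next)
		if (!strcmp(name, vport->name) && vport->net == net)
			return vport;
	return NULL;
}
```

Seeding the hash with the namespace only spreads entries across buckets; it does not guarantee isolation, since two namespaces can still land in the same bucket. The explicit namespace comparison in vport_locate() is what keeps a port named "br0" in one namespace from being returned to a caller in another, which mirrors the net_eq() check in the patch.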
* [PATCH net-next 2/2] openvswitch: Increase maximum number of datapath ports. [not found] ` <1346786049-3100-1-git-send-email-jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> 2012-09-04 19:14 ` [PATCH net-next 1/2] openvswitch: Add support for network namespaces Jesse Gross @ 2012-09-04 19:14 ` Jesse Gross 2012-09-04 19:26 ` [GIT net-next] Open vSwitch David Miller 2 siblings, 0 replies; 46+ messages in thread From: Jesse Gross @ 2012-09-04 19:14 UTC (permalink / raw) To: David Miller; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA From: Pravin B Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> Use hash table to store ports of datapath. Allow 64K ports per switch. Signed-off-by: Pravin B Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> Signed-off-by: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> --- net/openvswitch/actions.c | 2 +- net/openvswitch/datapath.c | 110 +++++++++++++++++++++++++++++++------------- net/openvswitch/datapath.h | 33 ++++++++++--- net/openvswitch/flow.c | 11 ++--- net/openvswitch/flow.h | 3 +- net/openvswitch/vport.c | 1 + net/openvswitch/vport.h | 4 +- 7 files changed, 113 insertions(+), 51 deletions(-) diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c index f3f96ba..0da6877 100644 --- a/net/openvswitch/actions.c +++ b/net/openvswitch/actions.c @@ -266,7 +266,7 @@ static int do_output(struct datapath *dp, struct sk_buff *skb, int out_port) if (unlikely(!skb)) return -ENOMEM; - vport = rcu_dereference(dp->ports[out_port]); + vport = ovs_vport_rcu(dp, out_port); if (unlikely(!vport)) { kfree_skb(skb); return -ENODEV; diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c index cad39fc..105a0b5 100644 --- a/net/openvswitch/datapath.c +++ b/net/openvswitch/datapath.c @@ -116,7 +116,7 @@ static struct datapath *get_dp(struct net *net, int dp_ifindex) /* Must be called with rcu_read_lock or RTNL lock. 
*/ const char *ovs_dp_name(const struct datapath *dp) { - struct vport *vport = rcu_dereference_rtnl(dp->ports[OVSP_LOCAL]); + struct vport *vport = ovs_vport_rtnl_rcu(dp, OVSP_LOCAL); return vport->ops->get_name(vport); } @@ -127,7 +127,7 @@ static int get_dpifindex(struct datapath *dp) rcu_read_lock(); - local = rcu_dereference(dp->ports[OVSP_LOCAL]); + local = ovs_vport_rcu(dp, OVSP_LOCAL); if (local) ifindex = local->ops->get_ifindex(local); else @@ -145,9 +145,30 @@ static void destroy_dp_rcu(struct rcu_head *rcu) ovs_flow_tbl_destroy((__force struct flow_table *)dp->table); free_percpu(dp->stats_percpu); release_net(ovs_dp_get_net(dp)); + kfree(dp->ports); kfree(dp); } +static struct hlist_head *vport_hash_bucket(const struct datapath *dp, + u16 port_no) +{ + return &dp->ports[port_no & (DP_VPORT_HASH_BUCKETS - 1)]; +} + +struct vport *ovs_lookup_vport(const struct datapath *dp, u16 port_no) +{ + struct vport *vport; + struct hlist_node *n; + struct hlist_head *head; + + head = vport_hash_bucket(dp, port_no); + hlist_for_each_entry_rcu(vport, n, head, dp_hash_node) { + if (vport->port_no == port_no) + return vport; + } + return NULL; +} + /* Called with RTNL lock and genl_lock. */ static struct vport *new_vport(const struct vport_parms *parms) { @@ -156,9 +177,9 @@ static struct vport *new_vport(const struct vport_parms *parms) vport = ovs_vport_add(parms); if (!IS_ERR(vport)) { struct datapath *dp = parms->dp; + struct hlist_head *head = vport_hash_bucket(dp, vport->port_no); - rcu_assign_pointer(dp->ports[parms->port_no], vport); - list_add(&vport->node, &dp->port_list); + hlist_add_head_rcu(&vport->dp_hash_node, head); } return vport; @@ -170,8 +191,7 @@ void ovs_dp_detach_port(struct vport *p) ASSERT_RTNL(); /* First drop references to device. */ - list_del(&p->node); - rcu_assign_pointer(p->dp->ports[p->port_no], NULL); + hlist_del_rcu(&p->dp_hash_node); /* Then destroy it. 
*/ ovs_vport_del(p); @@ -1248,7 +1268,7 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info) struct datapath *dp; struct vport *vport; struct ovs_net *ovs_net; - int err; + int err, i; err = -EINVAL; if (!a[OVS_DP_ATTR_NAME] || !a[OVS_DP_ATTR_UPCALL_PID]) @@ -1261,7 +1281,6 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info) if (dp == NULL) goto err_unlock_rtnl; - INIT_LIST_HEAD(&dp->port_list); ovs_dp_set_net(dp, hold_net(sock_net(skb->sk))); /* Allocate table. */ @@ -1276,6 +1295,16 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info) goto err_destroy_table; } + dp->ports = kmalloc(DP_VPORT_HASH_BUCKETS * sizeof(struct hlist_head), + GFP_KERNEL); + if (!dp->ports) { + err = -ENOMEM; + goto err_destroy_percpu; + } + + for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++) + INIT_HLIST_HEAD(&dp->ports[i]); + /* Set up our datapath device. */ parms.name = nla_data(a[OVS_DP_ATTR_NAME]); parms.type = OVS_VPORT_TYPE_INTERNAL; @@ -1290,7 +1319,7 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info) if (err == -EBUSY) err = -EEXIST; - goto err_destroy_percpu; + goto err_destroy_ports_array; } reply = ovs_dp_cmd_build_info(dp, info->snd_pid, @@ -1309,7 +1338,9 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info) return 0; err_destroy_local_port: - ovs_dp_detach_port(rtnl_dereference(dp->ports[OVSP_LOCAL])); + ovs_dp_detach_port(ovs_vport_rtnl(dp, OVSP_LOCAL)); +err_destroy_ports_array: + kfree(dp->ports); err_destroy_percpu: free_percpu(dp->stats_percpu); err_destroy_table: @@ -1326,15 +1357,21 @@ err: /* Called with genl_mutex. 
*/ static void __dp_destroy(struct datapath *dp) { - struct vport *vport, *next_vport; + int i; rtnl_lock(); - list_for_each_entry_safe(vport, next_vport, &dp->port_list, node) - if (vport->port_no != OVSP_LOCAL) - ovs_dp_detach_port(vport); + + for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++) { + struct vport *vport; + struct hlist_node *node, *n; + + hlist_for_each_entry_safe(vport, node, n, &dp->ports[i], dp_hash_node) + if (vport->port_no != OVSP_LOCAL) + ovs_dp_detach_port(vport); + } list_del(&dp->list_node); - ovs_dp_detach_port(rtnl_dereference(dp->ports[OVSP_LOCAL])); + ovs_dp_detach_port(ovs_vport_rtnl(dp, OVSP_LOCAL)); /* rtnl_unlock() will wait until all the references to devices that * are pending unregistration have been dropped. We do it here to @@ -1566,7 +1603,7 @@ static struct vport *lookup_vport(struct net *net, if (!dp) return ERR_PTR(-ENODEV); - vport = rcu_dereference_rtnl(dp->ports[port_no]); + vport = ovs_vport_rtnl_rcu(dp, port_no); if (!vport) return ERR_PTR(-ENOENT); return vport; @@ -1603,7 +1640,7 @@ static int ovs_vport_cmd_new(struct sk_buff *skb, struct genl_info *info) if (port_no >= DP_MAX_PORTS) goto exit_unlock; - vport = rtnl_dereference(dp->ports[port_no]); + vport = ovs_vport_rtnl_rcu(dp, port_no); err = -EBUSY; if (vport) goto exit_unlock; @@ -1613,7 +1650,7 @@ static int ovs_vport_cmd_new(struct sk_buff *skb, struct genl_info *info) err = -EFBIG; goto exit_unlock; } - vport = rtnl_dereference(dp->ports[port_no]); + vport = ovs_vport_rtnl(dp, port_no); if (!vport) break; } @@ -1755,32 +1792,39 @@ static int ovs_vport_cmd_dump(struct sk_buff *skb, struct netlink_callback *cb) { struct ovs_header *ovs_header = genlmsg_data(nlmsg_data(cb->nlh)); struct datapath *dp; - u32 port_no; - int retval; + int bucket = cb->args[0], skip = cb->args[1]; + int i, j = 0; dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex); if (!dp) return -ENODEV; rcu_read_lock(); - for (port_no = cb->args[0]; port_no < DP_MAX_PORTS; port_no++) { + for (i = 
bucket; i < DP_VPORT_HASH_BUCKETS; i++) { struct vport *vport; - - vport = rcu_dereference(dp->ports[port_no]); - if (!vport) - continue; - - if (ovs_vport_cmd_fill_info(vport, skb, NETLINK_CB(cb->skb).pid, - cb->nlh->nlmsg_seq, NLM_F_MULTI, - OVS_VPORT_CMD_NEW) < 0) - break; + struct hlist_node *n; + + j = 0; + hlist_for_each_entry_rcu(vport, n, &dp->ports[i], dp_hash_node) { + if (j >= skip && + ovs_vport_cmd_fill_info(vport, skb, + NETLINK_CB(cb->skb).pid, + cb->nlh->nlmsg_seq, + NLM_F_MULTI, + OVS_VPORT_CMD_NEW) < 0) + goto out; + + j++; + } + skip = 0; } +out: rcu_read_unlock(); - cb->args[0] = port_no; - retval = skb->len; + cb->args[0] = i; + cb->args[1] = j; - return retval; + return skb->len; } static struct genl_ops dp_vport_genl_ops[] = { diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h index 771c11e..129ec54 100644 --- a/net/openvswitch/datapath.h +++ b/net/openvswitch/datapath.h @@ -29,7 +29,9 @@ #include "flow.h" #include "vport.h" -#define DP_MAX_PORTS 1024 +#define DP_MAX_PORTS USHRT_MAX +#define DP_VPORT_HASH_BUCKETS 1024 + #define SAMPLE_ACTION_DEPTH 3 /** @@ -57,10 +59,8 @@ struct dp_stats_percpu { * @list_node: Element in global 'dps' list. * @n_flows: Number of flows currently in flow table. * @table: Current flow table. Protected by genl_lock and RCU. - * @ports: Map from port number to &struct vport. %OVSP_LOCAL port - * always exists, other ports may be %NULL. Protected by RTNL and RCU. - * @port_list: List of all ports in @ports in arbitrary order. RTNL required - * to iterate or modify. + * @ports: Hash table for ports. %OVSP_LOCAL port always exists. Protected by + * RTNL and RCU. * @stats_percpu: Per-CPU datapath statistics. * @net: Reference to net namespace. * @@ -75,8 +75,7 @@ struct datapath { struct flow_table __rcu *table; /* Switch ports. */ - struct vport __rcu *ports[DP_MAX_PORTS]; - struct list_head port_list; + struct hlist_head *ports; /* Stats. 
*/ struct dp_stats_percpu __percpu *stats_percpu; @@ -87,6 +86,26 @@ struct datapath { #endif }; +struct vport *ovs_lookup_vport(const struct datapath *dp, u16 port_no); + +static inline struct vport *ovs_vport_rcu(const struct datapath *dp, int port_no) +{ + WARN_ON_ONCE(!rcu_read_lock_held()); + return ovs_lookup_vport(dp, port_no); +} + +static inline struct vport *ovs_vport_rtnl_rcu(const struct datapath *dp, int port_no) +{ + WARN_ON_ONCE(!rcu_read_lock_held() && !rtnl_is_locked()); + return ovs_lookup_vport(dp, port_no); +} + +static inline struct vport *ovs_vport_rtnl(const struct datapath *dp, int port_no) +{ + ASSERT_RTNL(); + return ovs_lookup_vport(dp, port_no); +} + /** * struct ovs_skb_cb - OVS data in skb CB * @flow: The flow associated with this packet. May be %NULL if no flow. diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c index b7f38b1..f9f211d 100644 --- a/net/openvswitch/flow.c +++ b/net/openvswitch/flow.c @@ -203,10 +203,7 @@ struct sw_flow_actions *ovs_flow_actions_alloc(const struct nlattr *actions) int actions_len = nla_len(actions); struct sw_flow_actions *sfa; - /* At least DP_MAX_PORTS actions are required to be able to flood a - * packet to every port. Factor of 2 allows for setting VLAN tags, - * etc. */ - if (actions_len > 2 * DP_MAX_PORTS * nla_total_size(4)) + if (actions_len > MAX_ACTIONS_BUFSIZE) return ERR_PTR(-EINVAL); sfa = kmalloc(sizeof(*sfa) + actions_len, GFP_KERNEL); @@ -1000,7 +997,7 @@ int ovs_flow_from_nlattrs(struct sw_flow_key *swkey, int *key_lenp, swkey->phy.in_port = in_port; attrs &= ~(1 << OVS_KEY_ATTR_IN_PORT); } else { - swkey->phy.in_port = USHRT_MAX; + swkey->phy.in_port = DP_MAX_PORTS; } /* Data attributes. 
*/ @@ -1143,7 +1140,7 @@ int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 *in_port, const struct nlattr *nla; int rem; - *in_port = USHRT_MAX; + *in_port = DP_MAX_PORTS; *priority = 0; nla_for_each_nested(nla, attr, rem) { @@ -1180,7 +1177,7 @@ int ovs_flow_to_nlattrs(const struct sw_flow_key *swkey, struct sk_buff *skb) nla_put_u32(skb, OVS_KEY_ATTR_PRIORITY, swkey->phy.priority)) goto nla_put_failure; - if (swkey->phy.in_port != USHRT_MAX && + if (swkey->phy.in_port != DP_MAX_PORTS && nla_put_u32(skb, OVS_KEY_ATTR_IN_PORT, swkey->phy.in_port)) goto nla_put_failure; diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h index 9b75617..d92e22a 100644 --- a/net/openvswitch/flow.h +++ b/net/openvswitch/flow.h @@ -43,7 +43,7 @@ struct sw_flow_actions { struct sw_flow_key { struct { u32 priority; /* Packet QoS priority. */ - u16 in_port; /* Input switch port (or USHRT_MAX). */ + u16 in_port; /* Input switch port (or DP_MAX_PORTS). */ } phy; struct { u8 src[ETH_ALEN]; /* Ethernet source address. 
*/ @@ -161,6 +161,7 @@ int ovs_flow_from_nlattrs(struct sw_flow_key *swkey, int *key_lenp, int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 *in_port, const struct nlattr *); +#define MAX_ACTIONS_BUFSIZE (16 * 1024) #define TBL_MIN_BUCKETS 1024 struct flow_table { diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c index 9873ace..1abd960 100644 --- a/net/openvswitch/vport.c +++ b/net/openvswitch/vport.c @@ -127,6 +127,7 @@ struct vport *ovs_vport_alloc(int priv_size, const struct vport_ops *ops, vport->port_no = parms->port_no; vport->upcall_pid = parms->upcall_pid; vport->ops = ops; + INIT_HLIST_NODE(&vport->dp_hash_node); vport->percpu_stats = alloc_percpu(struct vport_percpu_stats); if (!vport->percpu_stats) { diff --git a/net/openvswitch/vport.h b/net/openvswitch/vport.h index 97cef08..c56e483 100644 --- a/net/openvswitch/vport.h +++ b/net/openvswitch/vport.h @@ -70,10 +70,10 @@ struct vport_err_stats { * @rcu: RCU callback head for deferred destruction. * @port_no: Index into @dp's @ports array. * @dp: Datapath to which this port belongs. - * @node: Element in @dp's @port_list. * @upcall_pid: The Netlink port to use for packets received on this port that * miss the flow table. * @hash_node: Element in @dev_table hash table in vport.c. + * @dp_hash_node: Element in @datapath->ports hash table in datapath.c. * @ops: Class structure. * @percpu_stats: Points to per-CPU statistics used and maintained by vport * @stats_lock: Protects @err_stats; @@ -83,10 +83,10 @@ struct vport { struct rcu_head rcu; u16 port_no; struct datapath *dp; - struct list_head node; u32 upcall_pid; struct hlist_node hash_node; + struct hlist_node dp_hash_node; const struct vport_ops *ops; struct vport_percpu_stats __percpu *percpu_stats; -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 46+ messages in thread
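The port table introduced by this patch can be illustrated outside the kernel. The following is only a userspace sketch, assuming plain singly linked chains in place of hlist/RCU; the demo_* names are hypothetical and not part of the patch. It shows the one idea the patch relies on: the port number is masked down to one of 1024 buckets, and the full port number is compared while walking the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as DP_VPORT_HASH_BUCKETS in the patch. The mask below only
 * works because the bucket count is a power of two. */
#define DEMO_VPORT_HASH_BUCKETS 1024

/* Hypothetical stand-in for struct vport: only what the lookup needs. */
struct demo_vport {
	uint16_t port_no;
	struct demo_vport *next;	/* plain chain instead of an hlist_node */
};

/* Hypothetical stand-in for the ports table in struct datapath. */
struct demo_datapath {
	struct demo_vport *buckets[DEMO_VPORT_HASH_BUCKETS];
};

/* Same arithmetic as vport_hash_bucket(): mask the port number. */
size_t demo_bucket(uint16_t port_no)
{
	return port_no & (DEMO_VPORT_HASH_BUCKETS - 1);
}

/* Insert at the head of the chain. The patch uses hlist_add_head_rcu();
 * this sketch has no concurrent readers, so no RCU publication is needed. */
void demo_add_port(struct demo_datapath *dp, struct demo_vport *vport)
{
	size_t b;

	assert(vport != NULL);
	b = demo_bucket(vport->port_no);
	vport->next = dp->buckets[b];
	dp->buckets[b] = vport;
}

/* Same shape as ovs_lookup_vport(): walk one chain and compare the full
 * port number, since several ports share a bucket. */
struct demo_vport *demo_lookup_port(const struct demo_datapath *dp,
				    uint16_t port_no)
{
	struct demo_vport *v;

	for (v = dp->buckets[demo_bucket(port_no)]; v != NULL; v = v->next)
		if (v->port_no == port_no)
			return v;
	return NULL;
}
```

With this layout, ports 5 and 1029 land in the same bucket (1029 & 1023 == 5) and are told apart only by the port_no comparison, which is why the lookup cannot simply return the first entry in the chain.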
* Re: [GIT net-next] Open vSwitch [not found] ` <1346786049-3100-1-git-send-email-jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> 2012-09-04 19:14 ` [PATCH net-next 1/2] openvswitch: Add support for network namespaces Jesse Gross 2012-09-04 19:14 ` [PATCH net-next 2/2] openvswitch: Increase maximum number of datapath ports Jesse Gross @ 2012-09-04 19:26 ` David Miller 2 siblings, 0 replies; 46+ messages in thread From: David Miller @ 2012-09-04 19:26 UTC (permalink / raw) To: jesse-l0M0P4e3n4LQT0dZR+AlfA Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA From: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> Date: Tue, 4 Sep 2012 12:14:07 -0700 > Two feature additions to Open vSwitch for net-next/3.7. > > The following changes since commit 0d7614f09c1ebdbaa1599a5aba7593f147bf96ee: > > Linux 3.6-rc1 (2012-08-02 16:38:10 -0700) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master > > for you to fetch changes up to 15eac2a74277bc7de68a7c2a64a7c91b4b6f5961: > > openvswitch: Increase maximum number of datapath ports. (2012-09-03 19:20:49 -0700) Pulled, thanks Jesse. ^ permalink raw reply [flat|nested] 46+ messages in thread
* [GIT net-next] Open vSwitch @ 2014-11-10 3:58 Pravin B Shelar 2014-11-11 18:34 ` David Miller 0 siblings, 1 reply; 46+ messages in thread From: Pravin B Shelar @ 2014-11-10 3:58 UTC (permalink / raw) To: davem; +Cc: netdev Following batch of patches brings feature parity between upstream ovs and out of tree ovs module. Two features are added, first adds support to export egress tunnel information for a packet. This is used to improve visibility in network traffic. Second feature allows userspace vswitchd process to probe ovs module features. Other patches are optimization and code cleanup. ---------------------------------------------------------------- The following changes since commit c0560b9c523341516eabf0f3b51832256caa7bbb: dccp: Convert DCCP_WARN to net_warn_ratelimited (2014-11-08 21:22:54 -0500) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch.git net_next_ovs for you to fetch changes up to 05da5898a96c05e32aa9850c9cd89eef29471b13: openvswitch: Add support for OVS_FLOW_ATTR_PROBE. (2014-11-09 18:58:44 -0800) ---------------------------------------------------------------- Jarno Rajahalme (1): openvswitch: Add support for OVS_FLOW_ATTR_PROBE. Pravin B Shelar (3): openvswitch: Export symbols as GPL symbols. openvswitch: Optimize recirc action. openvswitch: Remove redundant key ref from upcall_info. 
Thomas Graf (1): openvswitch: Constify various function arguments Wenyu Zhang (1): openvswitch: Extend packet attribute for egress tunnel info include/uapi/linux/openvswitch.h | 15 ++ net/openvswitch/actions.c | 180 ++++++++++++++------ net/openvswitch/datapath.c | 129 ++++++++------ net/openvswitch/datapath.h | 22 +-- net/openvswitch/flow.c | 8 +- net/openvswitch/flow.h | 71 ++++++-- net/openvswitch/flow_netlink.c | 357 +++++++++++++++++++++++---------------- net/openvswitch/flow_netlink.h | 13 +- net/openvswitch/flow_table.c | 12 +- net/openvswitch/flow_table.h | 8 +- net/openvswitch/vport-geneve.c | 23 ++- net/openvswitch/vport-gre.c | 12 +- net/openvswitch/vport-netdev.c | 2 +- net/openvswitch/vport-vxlan.c | 24 ++- net/openvswitch/vport.c | 81 +++++++-- net/openvswitch/vport.h | 20 ++- 16 files changed, 664 insertions(+), 313 deletions(-) ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [GIT net-next] Open vSwitch 2014-11-10 3:58 Pravin B Shelar @ 2014-11-11 18:34 ` David Miller 0 siblings, 0 replies; 46+ messages in thread From: David Miller @ 2014-11-11 18:34 UTC (permalink / raw) To: pshelar; +Cc: netdev From: Pravin B Shelar <pshelar@nicira.com> Date: Sun, 9 Nov 2014 19:58:59 -0800 > Following batch of patches brings feature parity between upstream > ovs and out of tree ovs module. > > Two features are added, first adds support to export egress > tunnel information for a packet. This is used to improve > visibility in network traffic. Second feature allows userspace > vswitchd process to probe ovs module features. Other patches > are optimization and code cleanup. Pulled, thanks Pravin. ^ permalink raw reply [flat|nested] 46+ messages in thread
* [GIT net-next] Open vSwitch @ 2014-11-04 6:00 Pravin B Shelar 2014-11-05 20:10 ` David Miller 0 siblings, 1 reply; 46+ messages in thread From: Pravin B Shelar @ 2014-11-04 6:00 UTC (permalink / raw) To: davem; +Cc: netdev First two patches are related to OVS MPLS support. Rest of patches are various refactoring and minor improvements to openvswitch. ---------------------------------------------------------------- The following changes since commit 30349bdbc4da5ecf0efa25556e3caff9c9b8c5f7: net: phy: spi_ks8995: remove sysfs bin file by registered attribute (2014-11-04 17:18:45 -0500) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch.git net_next_ovs for you to fetch changes up to fb90c8d8d5169d4dfbe5896721367e1904638a91: openvswitch: Avoid NULL mask check while building mask (2014-11-04 22:20:33 -0800) ---------------------------------------------------------------- Andy Zhou (2): openvswitch: refactor do_output() to move NULL check out of fast path openvswitch: Refactor get_dp() function into multiple access APIs. Chunhe Li (1): openvswitch: Drop packets when interdev is not up Jarno Rajahalme (1): openvswitch: Fix the type of struct ovs_key_nd nd_target field. Jesse Gross (1): openvswitch: Additional logging for -EINVAL on flow setups. Joe Stringer (3): openvswitch: Remove redundant tcp_flags code. openvswitch: Refactor ovs_flow_cmd_fill_info(). openvswitch: Move key_attr_size() to flow_netlink.h. Lorand Jakab (1): openvswitch: Remove flow member from struct ovs_skb_cb Pravin B Shelar (4): net: Remove MPLS GSO feature. openvswitch: Move table destroy to dp-rcu callback. openvswitch: Refactor action alloc and copy api. 
openvswitch: Avoid NULL mask check while building mask Simon Horman (1): openvswitch: Add basic MPLS support to kernel include/linux/netdev_features.h | 7 +- include/linux/netdevice.h | 1 - include/linux/skbuff.h | 3 - include/net/mpls.h | 39 +++++ include/uapi/linux/openvswitch.h | 38 ++++- net/core/dev.c | 3 +- net/core/ethtool.c | 1 - net/ipv4/af_inet.c | 1 - net/ipv4/tcp_offload.c | 1 - net/ipv4/udp_offload.c | 3 +- net/ipv6/ip6_offload.c | 1 - net/ipv6/udp_offload.c | 3 +- net/mpls/mpls_gso.c | 3 +- net/openvswitch/Kconfig | 1 + net/openvswitch/actions.c | 136 ++++++++++++--- net/openvswitch/datapath.c | 215 ++++++++++++----------- net/openvswitch/datapath.h | 4 +- net/openvswitch/flow.c | 30 ++++ net/openvswitch/flow.h | 17 +- net/openvswitch/flow_netlink.c | 322 +++++++++++++++++++++++++---------- net/openvswitch/flow_netlink.h | 5 +- net/openvswitch/flow_table.c | 11 +- net/openvswitch/flow_table.h | 2 +- net/openvswitch/vport-internal_dev.c | 5 + 24 files changed, 606 insertions(+), 246 deletions(-) create mode 100644 include/net/mpls.h ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [GIT net-next] Open vSwitch 2014-11-04 6:00 Pravin B Shelar @ 2014-11-05 20:10 ` David Miller 2014-11-05 22:52 ` Pravin Shelar 0 siblings, 1 reply; 46+ messages in thread From: David Miller @ 2014-11-05 20:10 UTC (permalink / raw) To: pshelar; +Cc: netdev Please do not submit your patches such that the email Date: field is the commit's date. You're not posting these on Nov. 4th, yet that is the Date: field on all of the individual patch emails. I want them to be the date at the time you post the patch to the mailing list. Otherwise the ordering in patchwork is not chronological with respect to the list's postings, and this makes my work more difficult than it needs to be. Thanks. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [GIT net-next] Open vSwitch 2014-11-05 20:10 ` David Miller @ 2014-11-05 22:52 ` Pravin Shelar 0 siblings, 0 replies; 46+ messages in thread From: Pravin Shelar @ 2014-11-05 22:52 UTC (permalink / raw) To: David Miller; +Cc: netdev On Wed, Nov 5, 2014 at 12:10 PM, David Miller <davem@davemloft.net> wrote: > > Please do not submit your patches such that the email Date: field is > the commit's date. You're not posting these on Nov. 4th, yet that > is the Date: field on all of the individual patch emails. > > I want them to be the date at the time you post the patch to the mailing > list. > > Otherwise the ordering in patchwork is not chronological with respect to the list's > postings, and this makes my work more difficult than it needs to be. > Sorry about the Date field. NTP stopped working on my machine, which is why the date got messed up. ^ permalink raw reply [flat|nested] 46+ messages in thread
* [GIT net-next] Open vSwitch @ 2014-09-11 22:01 Pravin B Shelar 2014-09-11 23:09 ` Pravin Shelar 0 siblings, 1 reply; 46+ messages in thread From: Pravin B Shelar @ 2014-09-11 22:01 UTC (permalink / raw) To: davem; +Cc: netdev The following patches add recirculation and hash actions to OVS. The first three patches restructure code as required by the last patch. The recirculation implementation has been changed, following comments from David Miller, to avoid recursive calls in OVS. It now uses a queue to record recirc actions, and the deferred recirc actions are executed at the end of the current action execution. ---------------------------------------------------------------- The following changes since commit b954d83421d51d822c42e5ab7b65069b25ad3005: net: bpf: only build bpf_jit_binary_{alloc, free}() when jit selected (2014-09-10 14:05:07 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch.git net_next_ovs for you to fetch changes up to 9b8ede54a8bd319789e8bceb19789463bb944701: openvswitch: Add recirc and hash action. (2014-09-11 13:35:29 -0700) ---------------------------------------------------------------- Andy Zhou (2): datapath: simplify sample action implementation openvswitch: Add recirc and hash action. Pravin B Shelar (2): datapath: refactor ovs flow extract API. datapath: Use tun_key only for egress tunnel path. include/uapi/linux/openvswitch.h | 26 +++++ net/openvswitch/actions.c | 247 ++++++++++++++++++++++++++++++++++----- net/openvswitch/datapath.c | 52 +++++---- net/openvswitch/datapath.h | 17 ++- net/openvswitch/flow.c | 54 +++++++-- net/openvswitch/flow.h | 10 +- net/openvswitch/flow_netlink.c | 63 +++++++--- net/openvswitch/flow_netlink.h | 4 +- net/openvswitch/vport-gre.c | 22 ++-- net/openvswitch/vport-vxlan.c | 20 ++-- net/openvswitch/vport.c | 13 ++- 11 files changed, 419 insertions(+), 109 deletions(-) ^ permalink raw reply [flat|nested] 46+ messages in thread
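The queue-instead-of-recursion approach described in the cover letter above can be sketched in userspace C. This is not the code from the series: every demo_* name, the queue capacity, and the pass limit are assumptions made for illustration. The point shown is that a recirc action only records work in a fixed-size FIFO, and the recorded work runs after the current action list has finished, so the action executor never calls itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_FIFO_SIZE 4	/* hypothetical queue capacity */
#define DEMO_MAX_PASSES 16	/* hypothetical bound so a recirc loop ends */
#define DEMO_MAX_OUTPUTS 16

enum demo_action_type { DEMO_OUTPUT, DEMO_RECIRC };

struct demo_action {
	enum demo_action_type type;
	int arg;	/* output port, or index of the list to run next */
};

struct demo_action_list {
	const struct demo_action *actions;
	size_t n;
};

struct demo_fifo {
	int ids[DEMO_FIFO_SIZE];
	size_t head, tail;	/* free-running; tail - head = queued entries */
};

struct demo_result {
	int ports[DEMO_MAX_OUTPUTS];	/* output ports in execution order */
	size_t n_ports;
	size_t dropped;	/* recirc requests lost because the queue was full */
};

bool demo_fifo_put(struct demo_fifo *f, int id)
{
	if (f->tail - f->head == DEMO_FIFO_SIZE)
		return false;
	f->ids[f->tail++ % DEMO_FIFO_SIZE] = id;
	return true;
}

bool demo_fifo_get(struct demo_fifo *f, int *id)
{
	if (f->head == f->tail)
		return false;
	*id = f->ids[f->head++ % DEMO_FIFO_SIZE];
	return true;
}

/* Run one action list to completion. A recirc action is only queued. */
void demo_run_list(const struct demo_action_list *list,
		   struct demo_fifo *fifo, struct demo_result *res)
{
	size_t i;

	for (i = 0; i < list->n; i++) {
		const struct demo_action *a = &list->actions[i];

		if (a->type == DEMO_OUTPUT) {
			if (res->n_ports < DEMO_MAX_OUTPUTS)
				res->ports[res->n_ports++] = a->arg;
		} else if (!demo_fifo_put(fifo, a->arg)) {
			res->dropped++;
		}
	}
}

/* Run list 0, then drain the deferred queue in a loop (no recursion). */
void demo_process(const struct demo_action_list *lists, size_t n_lists,
		  struct demo_result *res)
{
	struct demo_fifo fifo = { {0}, 0, 0 };
	int id = 0;
	int passes = 0;

	assert(n_lists > 0);
	do {
		if (id >= 0 && (size_t)id < n_lists)
			demo_run_list(&lists[id], &fifo, res);
	} while (++passes < DEMO_MAX_PASSES && demo_fifo_get(&fifo, &id));
}
```

For a first list of "output 1, recirc to list 1, output 2" and a second list of "output 3", the outputs are produced in the order 1, 2, 3: the recirculated work runs only after the first list completes. Stack depth stays constant no matter how many recirculations are chained, at the cost of having to pick a queue size and decide what happens when it fills up.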
* Re: [GIT net-next] Open vSwitch 2014-09-11 22:01 Pravin B Shelar @ 2014-09-11 23:09 ` Pravin Shelar 0 siblings, 0 replies; 46+ messages in thread From: Pravin Shelar @ 2014-09-11 23:09 UTC (permalink / raw) To: David Miller; +Cc: netdev Please ignore this series, I have used datapath as subsystem name rather than openvswitch. I will respin it shortly. On Thu, Sep 11, 2014 at 3:01 PM, Pravin B Shelar <pshelar@nicira.com> wrote: > Following patches adds recirculation and hash action to OVS. > First three patches does code restructuring which is required > for last patch. > Recirculation implementation is changed, according to comments from > David Miller, to avoid using recursive calls in OVS. It is using > queue to record recirc action and deferred recirc is executed at > the end of current actions execution. > > ---------------------------------------------------------------- > The following changes since commit b954d83421d51d822c42e5ab7b65069b25ad3005: > > net: bpf: only build bpf_jit_binary_{alloc, free}() when jit selected (2014-09-10 14:05:07 -0700) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch.git net_next_ovs > > for you to fetch changes up to 9b8ede54a8bd319789e8bceb19789463bb944701: > > openvswitch: Add recirc and hash action. (2014-09-11 13:35:29 -0700) > > ---------------------------------------------------------------- > Andy Zhou (2): > datapath: simplify sample action implementation > openvswitch: Add recirc and hash action. > > Pravin B Shelar (2): > datapath: refactor ovs flow extract API. > datapath: Use tun_key only for egress tunnel path. 
> > include/uapi/linux/openvswitch.h | 26 +++++ > net/openvswitch/actions.c | 247 ++++++++++++++++++++++++++++++++++----- > net/openvswitch/datapath.c | 52 +++++---- > net/openvswitch/datapath.h | 17 ++- > net/openvswitch/flow.c | 54 +++++++-- > net/openvswitch/flow.h | 10 +- > net/openvswitch/flow_netlink.c | 63 +++++++--- > net/openvswitch/flow_netlink.h | 4 +- > net/openvswitch/vport-gre.c | 22 ++-- > net/openvswitch/vport-vxlan.c | 20 ++-- > net/openvswitch/vport.c | 13 ++- > 11 files changed, 419 insertions(+), 109 deletions(-) ^ permalink raw reply [flat|nested] 46+ messages in thread
* [GIT net-next] Open vSwitch @ 2014-07-31 23:57 Pravin B Shelar 2014-08-02 22:16 ` David Miller 0 siblings, 1 reply; 46+ messages in thread From: Pravin B Shelar @ 2014-07-31 23:57 UTC (permalink / raw) To: davem; +Cc: netdev The following patches introduce a flow mask cache. To process any packet, OVS needs to apply a flow mask to the flow and look the flow up in the flow table, so packet processing performance depends directly on the number of entries in the mask list. Following patch adds mask cache so that we do not need to iterate over all entries in mask list on every packet. We have seen good performance improvement with this patch. Before the mask-cache, a single stream which matched the first mask got a throughput of about 900K pps. A stream which matched the 20th mask got a throughput of about 400K pps. After the mask-cache patch, throughput for all streams went back up to 900K pps. ---------------------------------------------------------------- The following changes since commit 2f55daa5464e8dfc8787ec863b6d1094522dbd69: net: stmmac: Support devicetree configs for mcast and ucast filter entries (2014-07-31 15:31:02 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch.git net_next_ovs for you to fetch changes up to 4955f0f9cbefa73577cd30ec262538ffc73dd4c2: openvswitch: Introduce flow mask cache. (2014-07-31 15:49:55 -0700) ---------------------------------------------------------------- Pravin B Shelar (3): openvswitch: Move table destroy to dp-rcu callback. openvswitch: Convert mask list into mask array. openvswitch: Introduce flow mask cache. net/openvswitch/datapath.c | 8 +- net/openvswitch/flow.h | 1 - net/openvswitch/flow_table.c | 293 +++++++++++++++++++++++++++++++++++++------ net/openvswitch/flow_table.h | 21 +++- 4 files changed, 275 insertions(+), 48 deletions(-) ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [GIT net-next] Open vSwitch 2014-07-31 23:57 Pravin B Shelar @ 2014-08-02 22:16 ` David Miller 2014-08-03 19:20 ` Pravin Shelar 0 siblings, 1 reply; 46+ messages in thread From: David Miller @ 2014-08-02 22:16 UTC (permalink / raw) To: pshelar; +Cc: netdev From: Pravin B Shelar <pshelar@nicira.com> Date: Thu, 31 Jul 2014 16:57:37 -0700 > Following patch adds mask cache so that we do not need to iterate over > all entries in mask list on every packet. We have seen good performance > improvement with this patch. How much have you thought about the DoS'ability of openvswitch's datastructures? What are the upper bounds for performance of packet switching? To be quite honest, a lot of the openvswitch data structures adjustments that hit my inbox seem to me to only address specific situations that specific user configurations have run into. It took us two decades, but we ripped out the ipv4 routing cache because external entities could provoke unreasonable worst case behavior in routing lookups. With openvswitch you guys have a unique opportunity to try and design all of your features such that they absolutely can use scalable datastructures from the beginning that provide reasonable performance in the common case and precise upper bounds for any possible sequence of incoming packets. New features tend to blind the developer to the eventual long term ramifications on performance. Would you add a new feature if you could know ahead of time that you'll never be able to design a datastructure which supports that feature and is not DoS'able by a remote entity? ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [GIT net-next] Open vSwitch 2014-08-02 22:16 ` David Miller @ 2014-08-03 19:20 ` Pravin Shelar 2014-08-04 4:21 ` David Miller 0 siblings, 1 reply; 46+ messages in thread From: Pravin Shelar @ 2014-08-03 19:20 UTC (permalink / raw) To: David Miller; +Cc: netdev On Sat, Aug 2, 2014 at 3:16 PM, David Miller <davem@davemloft.net> wrote: > From: Pravin B Shelar <pshelar@nicira.com> > Date: Thu, 31 Jul 2014 16:57:37 -0700 > >> Following patch adds mask cache so that we do not need to iterate over >> all entries in mask list on every packet. We have seen good performance >> improvement with this patch. > > How much have you thought about the DoS'ability of openvswitch's > datastructures? > This cache is populated by flow lookup in fast path. The mask cache is fixed in size. Userspace or number of packets can not change its size. Memory is statically allocated, no garbage collection. So DoS attack can not exploit this cache to increase ovs memory footprint. > What are the upper bounds for performance of packet switching? > The cache is keyed on the packet's RSS hash. In the worst case, the cache adds one extra flow-table lookup for a flow, when the RSS hash matches but the packet belongs to a different flow (hash collision). It is designed to be a lightweight, stateless cache (it does not take a reference on any other data structure) so that it has the least possible impact on the DoS'ability of Open vSwitch. > To be quite honest, a lot of the openvswitch data structures > adjustments that hit my inbox seem to me to only address specific > situations that specific user configurations have run into. > Overall, OVS DoS defense has improved since the introduction of mega-flows. A recently introduced OVS feature allows userspace to set multiple sockets for upcall processing on a given vport. This adds fairness by directing upcalls from different flows to separate sockets. Userspace processes upcalls from these sockets in round-robin fashion, which helps prevent one port from monopolizing the upcall communication path. 
I agree there is scope for improving DoS defense and we will keep working on this issue. > It took us two decades, but we ripped out the ipv4 routing cache > because external entities could provoke unreasonable worst case > behavior in routing lookups. > > With openvswitch you guys have a unique opportunity to try and design > all of your features such that they absolutely can use scalable > datastructures from the beginning that provide reasonable performance > in the common case and precise upper bounds for any possible sequence > of incoming packets. > > New features tend to blind the developer to the eventual long term > ramifications on performance. Would you add a new feature if you > could know ahead of time that you'll never be able to design a > datastructure which supports that feature and is not DoS'able by a > remote entity? > ^ permalink raw reply [flat|nested] 46+ messages in thread
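The cache behaviour described in the reply above (fixed size, keyed on the packet hash, at most one extra lookup on a collision) can be made concrete with a small userspace sketch. It is not the code from the series: the demo_* names, the 256-entry size, and the toy flow table are assumptions for illustration only. A lookup counter is included so the cost of a hit, a cold miss, and a collision can be compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_ENTRIES 256	/* fixed size, power of two (assumed value) */
#define DEMO_NO_FLOW (-1)

/* Toy flow: matches when (key & masks[mask_idx]) == masked_key. */
struct demo_flow {
	size_t mask_idx;
	uint32_t masked_key;
};

/* One cache slot: which mask matched the last packet with this hash. */
struct demo_cache_entry {
	uint32_t pkt_hash;
	size_t mask_idx;
};

struct demo_table {
	const uint32_t *masks;
	size_t n_masks;
	const struct demo_flow *flows;
	size_t n_flows;
	struct demo_cache_entry cache[DEMO_CACHE_ENTRIES];	/* never resized */
	unsigned long lookups;	/* number of per-mask table lookups done */
};

/* One masked lookup: returns a flow index or DEMO_NO_FLOW. */
int demo_masked_lookup(struct demo_table *t, size_t mask_idx, uint32_t key)
{
	uint32_t masked;
	size_t i;

	assert(mask_idx < t->n_masks);
	masked = key & t->masks[mask_idx];
	t->lookups++;
	for (i = 0; i < t->n_flows; i++)
		if (t->flows[i].mask_idx == mask_idx &&
		    t->flows[i].masked_key == masked)
			return (int)i;
	return DEMO_NO_FLOW;
}

int demo_lookup(struct demo_table *t, uint32_t key, uint32_t pkt_hash)
{
	struct demo_cache_entry *e =
		&t->cache[pkt_hash & (DEMO_CACHE_ENTRIES - 1)];
	size_t m;
	int flow;

	/* Fast path: this hash was seen before, try the remembered mask. */
	if (e->pkt_hash == pkt_hash && e->mask_idx < t->n_masks) {
		flow = demo_masked_lookup(t, e->mask_idx, key);
		if (flow != DEMO_NO_FLOW)
			return flow;
	}

	/* Slow path: scan every mask, then remember the one that matched. */
	for (m = 0; m < t->n_masks; m++) {
		flow = demo_masked_lookup(t, m, key);
		if (flow != DEMO_NO_FLOW) {
			e->pkt_hash = pkt_hash;
			e->mask_idx = m;
			return flow;
		}
	}
	return DEMO_NO_FLOW;
}
```

With three masks and a flow installed under the third one, the first packet of a stream costs three masked lookups, and every later packet with the same hash costs one. A packet from a different flow that happens to carry the same hash costs four: the wasted try through the cache plus the full scan. That is the "one extra lookup" bound stated above, and it also shows the limit of the design: traffic built so that the cache always misses pays the full scan on every packet.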
* Re: [GIT net-next] Open vSwitch 2014-08-03 19:20 ` Pravin Shelar @ 2014-08-04 4:21 ` David Miller 2014-08-04 19:35 ` Pravin Shelar 0 siblings, 1 reply; 46+ messages in thread From: David Miller @ 2014-08-04 4:21 UTC (permalink / raw) To: pshelar; +Cc: netdev From: Pravin Shelar <pshelar@nicira.com> Date: Sun, 3 Aug 2014 12:20:32 -0700 > On Sat, Aug 2, 2014 at 3:16 PM, David Miller <davem@davemloft.net> wrote: >> From: Pravin B Shelar <pshelar@nicira.com> >> Date: Thu, 31 Jul 2014 16:57:37 -0700 >> >>> Following patch adds mask cache so that we do not need to iterate over >>> all entries in mask list on every packet. We have seen good performance >>> improvement with this patch. >> >> How much have you thought about the DoS'ability of openvswitch's >> datastructures? >> > This cache is populated by flow lookup in fast path. The mask cache is > fixed in size. Userspace or number of packets can not change its size. > Memory is statically allocated, no garbage collection. So DoS attack > can not exploit this cache to increase ovs memory footprint. An attacker can construct a packet sequence such that every mask cache lookup misses, making the cache effectively useless. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [GIT net-next] Open vSwitch 2014-08-04 4:21 ` David Miller @ 2014-08-04 19:35 ` Pravin Shelar 2014-08-04 19:42 ` David Miller 0 siblings, 1 reply; 46+ messages in thread From: Pravin Shelar @ 2014-08-04 19:35 UTC (permalink / raw) To: David Miller; +Cc: netdev On Sun, Aug 3, 2014 at 9:21 PM, David Miller <davem@davemloft.net> wrote: > From: Pravin Shelar <pshelar@nicira.com> > Date: Sun, 3 Aug 2014 12:20:32 -0700 > >> On Sat, Aug 2, 2014 at 3:16 PM, David Miller <davem@davemloft.net> wrote: >>> From: Pravin B Shelar <pshelar@nicira.com> >>> Date: Thu, 31 Jul 2014 16:57:37 -0700 >>> >>>> Following patch adds mask cache so that we do not need to iterate over >>>> all entries in mask list on every packet. We have seen good performance >>>> improvement with this patch. >>> >>> How much have you thought about the DoS'ability of openvswitch's >>> datastructures? >>> >> This cache is populated by flow lookup in fast path. The mask cache is >> fixed in size. Userspace or number of packets can not change its size. >> Memory is statically allocated, no garbage collection. So DoS attack >> can not exploit this cache to increase ovs memory footprint. > > An attacker can construct a packet sequence such that every mask cache > lookup misses, making the cache effectively useless. Yes, but it does work in normal traffic situations. I have posted performance numbers in the cover letter. Under a DoS attack, as you said, the attacker needs to build a sequence of packets that makes the cache ineffective. That results in a cache miss followed by a full in-kernel flow lookup, so with this cache one more lookup is done under DoS. But this is not very different from current OVS anyway. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [GIT net-next] Open vSwitch 2014-08-04 19:35 ` Pravin Shelar @ 2014-08-04 19:42 ` David Miller 2014-08-06 22:55 ` Alexei Starovoitov 2014-08-13 13:34 ` Nicolas Dichtel 0 siblings, 2 replies; 46+ messages in thread From: David Miller @ 2014-08-04 19:42 UTC (permalink / raw) To: pshelar; +Cc: netdev From: Pravin Shelar <pshelar@nicira.com> Date: Mon, 4 Aug 2014 12:35:59 -0700 > On Sun, Aug 3, 2014 at 9:21 PM, David Miller <davem@davemloft.net> wrote: >> From: Pravin Shelar <pshelar@nicira.com> >> Date: Sun, 3 Aug 2014 12:20:32 -0700 >> >>> On Sat, Aug 2, 2014 at 3:16 PM, David Miller <davem@davemloft.net> wrote: >>>> From: Pravin B Shelar <pshelar@nicira.com> >>>> Date: Thu, 31 Jul 2014 16:57:37 -0700 >>>> >>>>> Following patch adds mask cache so that we do not need to iterate over >>>>> all entries in mask list on every packet. We have seen good performance >>>>> improvement with this patch. >>>> >>>> How much have you thought about the DoS'ability of openvswitch's >>>> datastructures? >>>> >>> This cache is populated by flow lookup in fast path. The mask cache is >>> fixed in size. Userspace or number of packets can not change its size. >>> Memory is statically allocated, no garbage collection. So DoS attack >>> can not exploit this cache to increase ovs memory footprint. >> >> An attacker can construct a packet sequence such that every mask cache >> lookup misses, making the cache effectively useless. > > Yes, but it does work in normal traffic situations. You're basically just reiterating the point I'm trying to make. Your caches are designed for specific configuration and packet traffic pattern cases, and can be easily duped into a worst-case performance scenario by an attacker. Caches, basically, do not work on the real internet. Make the fundamental core data structures fast and scalable enough, rather than bolting caches (which are basically hacks) on top every time they don't perform to your expectations. 
What if you made the full flow lookup fundamentally faster? Then an attacker can't do anything about that. That's a real performance improvement, one that sustains arbitrary traffic patterns. ^ permalink raw reply [flat|nested] 46+ messages in thread
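The trade-off argued over in this exchange can be illustrated with a small Python model of a fixed-size mask cache sitting in front of a per-mask flow lookup. This is a sketch under stated assumptions, not the OVS implementation: `FlowTable`, `CACHE_SIZE`, the three-field packet key, and the probe counter are all hypothetical names invented here. It only shows the two behaviours described in the thread: a repeated packet is resolved with a single table probe, while a stream of never-repeating packets always walks the whole mask list.

```python
# Illustrative model only; every name and size here is an assumption, not OVS code.
CACHE_SIZE = 4  # fixed number of slots, never grown by traffic


def apply_mask(key, mask):
    """Keep the fields of `key` selected by `mask` (a tuple of 0/1 flags)."""
    return tuple(k if m else None for k, m in zip(key, mask))


class FlowTable:
    def __init__(self, masks):
        self.masks = masks                     # masks are probed in list order
        self.tables = [dict() for _ in masks]  # one exact-match table per mask
        self.cache = [None] * CACHE_SIZE       # slot -> (packet key, mask index)
        self.probes = 0                        # count of per-mask table probes

    def add_flow(self, mask_index, key, action):
        masked = apply_mask(key, self.masks[mask_index])
        self.tables[mask_index][masked] = action

    def _probe(self, mask_index, key):
        self.probes += 1
        masked = apply_mask(key, self.masks[mask_index])
        return self.tables[mask_index].get(masked)

    def lookup(self, key):
        slot = hash(key) % CACHE_SIZE
        entry = self.cache[slot]
        if entry is not None and entry[0] == key:
            # Cache hit: try only the mask that matched last time.
            action = self._probe(entry[1], key)
            if action is not None:
                return action
        # Cache miss: walk every mask, then remember which one matched.
        for index in range(len(self.masks)):
            action = self._probe(index, key)
            if action is not None:
                self.cache[slot] = (key, index)
                return action
        return None


# Packet key is (src, dst, port); only the third mask has a flow installed.
table = FlowTable([(1, 0, 0), (0, 1, 0), (0, 0, 1)])
table.add_flow(2, (0, 0, 80), "fwd")

table.lookup((1, 2, 80))   # first packet: walks all three masks
table.lookup((1, 2, 80))   # same packet again: one probe via the cache
normal_probes = table.probes

table.probes = 0
for i in range(100):       # never-repeating packets: the cache never helps
    table.lookup((i + 10, i + 500, 80))
attack_probes = table.probes
```

In this model the cache never makes the attack case cheaper than the plain mask walk, which is the point both sides agree on; the disagreement in the thread is whether the normal-traffic gain justifies carrying the cache at all.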
* Re: [GIT net-next] Open vSwitch
From: Alexei Starovoitov @ 2014-08-06 22:55 UTC
To: David Miller; +Cc: Pravin Shelar, netdev@vger.kernel.org

On Mon, Aug 4, 2014 at 12:42 PM, David Miller <davem@davemloft.net> wrote:
> From: Pravin Shelar <pshelar@nicira.com>
> Date: Mon, 4 Aug 2014 12:35:59 -0700
>
>> On Sun, Aug 3, 2014 at 9:21 PM, David Miller <davem@davemloft.net> wrote:
>>> An attacker can construct a packet sequence such that every mask cache
>>> lookup misses, making the cache effectively useless.
>>
>> Yes, but it does work in normal traffic situations.
>
> You're basically just reiterating the point I'm trying to make.
>
> Your caches are designed for specific configuration and packet traffic
> pattern cases, and can be easily duped into a worst-case performance
> scenario by an attacker.
>
> Caches, basically, do not work on the real internet.
>
> Make the fundamental core data structures fast and scalable enough,
> rather than bolting caches (which are basically hacks) on top every
> time they don't perform to your expectations.
>
> What if you made the full flow lookup fundamentally faster? Then an

I suspect that the flow lookup in ovs is as fast as it can be, yet ovs
is still DoS-able, since the kernel datapath (flow lookup and action) is
considered to be a first-level cache for user space. By design, a flow
miss is always punted to userspace. Therefore a netperf TCP_CRR test
from a VM is not cheap for the host userspace component. Mega-flows and
multiple upcall pids partially address this fundamental problem.

Consider a simple distributed virtual bridge with VMs distributed across
multiple hosts. A mega-flow mask that selects dmac can solve the CRR
case for well-behaved VMs, but a rogue VM that spams random dmacs will
keep taxing host userspace. So we'd need to add another flow mask to
match the rest of the traffic unconditionally and drop it.

Now consider a virtual bridge-router-bridge topology (two subnets and a
router, using openstack names). Since VMs on the same host may be in
different subnets, their macs can be the same, so the 'mega-flow mask
dmac' approach won't work and the CRR test again gets costly for
userspace. We can try to use an 'in_port + dmac' mask, but as the
network topology grows the flow mask tricks get out of hand.

The situation is worse when ovs works as a gateway and receives internet
traffic. Since a flow miss goes to userspace, a remote attacker can find
a way to saturate the gateway cpu with certain traffic.

I guess none of this is new to ovs and there is probably a solution
that I don't know about.
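The dmac scenario described in this message can be sketched in a few lines of Python. This is a toy model under stated assumptions, not OVS code: the `Datapath` class, the field names, the action strings, and the MAC addresses are hypothetical. It shows only the mechanism discussed here: with a single dmac mask every unknown destination becomes an upcall to userspace, and a catch-all mask with a drop action absorbs that traffic in the datapath.

```python
# Toy model; class, field names, actions and addresses are illustrative assumptions.
class Datapath:
    def __init__(self):
        self.masks = []     # list of (field names, {masked key: action})
        self.upcalls = 0    # packets punted to "userspace" on a flow miss

    def add_flow(self, fields, match, action):
        for existing_fields, table in self.masks:
            if existing_fields == fields:
                table[match] = action
                return
        self.masks.append((fields, {match: action}))

    def receive(self, pkt):
        # Masks are tried in insertion order; the first match wins.
        for fields, table in self.masks:
            key = tuple(pkt[f] for f in fields)
            if key in table:
                return table[key]
        self.upcalls += 1
        return "upcall"


dp = Datapath()
dp.add_flow(("dmac",), ("52:54:00:00:00:01",), "output:1")
dp.add_flow(("dmac",), ("52:54:00:00:00:02",), "output:2")

# A rogue sender cycling through destination MACs nobody owns.
rogue = [
    {"in_port": 3, "dmac": "02:00:00:00:%02x:%02x" % (i >> 8, i & 0xFF)}
    for i in range(1000)
]

for pkt in rogue:
    dp.receive(pkt)
upcalls_without_catch_all = dp.upcalls

# An empty field list matches every packet, so it acts as a catch-all.
dp.add_flow((), (), "drop")
dp.upcalls = 0
for pkt in rogue:
    dp.receive(pkt)
upcalls_with_catch_all = dp.upcalls
```

The model says nothing about the bridge-router-bridge case; there the same MAC may need different actions per port, which is why the message suggests an 'in_port + dmac' mask instead.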
* Re: [GIT net-next] Open vSwitch
From: Nicolas Dichtel @ 2014-08-13 13:34 UTC
To: David Miller, pshelar; +Cc: netdev

On 04/08/2014 21:42, David Miller wrote:
[snip]
> Caches, basically, do not work on the real internet.

A bit late, but I completely agree!
* [GIT net-next] Open vSwitch
From: Pravin B Shelar @ 2014-07-14 0:12 UTC
To: davem; +Cc: netdev

The following patches add three features to OVS:
1. Add fairness to upcall processing.
2. Recirculation and Hash action.
3. Enable Tunnel GSO features.

The rest of the patches are bug fixes related to patches from the same
series.

The following changes since commit 279f64b7a771d84cbdea51ac2f794becfb06bcd4:

  net/hsr: Remove left-over never-true conditional code. (2014-07-11 15:04:40 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch.git net_next_ovs

for you to fetch changes up to 3dec4774b6343e5db5d346f1f05c6a883e0069db:

  openvswitch: Add skb_clone NULL check for the sampling action. (2014-07-13 12:02:12 -0700)

----------------------------------------------------------------
Alex Wang (1):
      openvswitch: Allow each vport to have an array of 'port_id's.

Andy Zhou (6):
      openvswitch: Add hash action
      openvswitch: Add recirc action
      openvswitch: Fix key size computation in key_attr_size()
      openvswitch: Avoid memory corruption in queue_userspace_packet()
      openvswitch: Add skb_clone NULL check in the recirc action.
      openvswitch: Add skb_clone NULL check for the sampling action.

Pravin B Shelar (2):
      openvswitch: Enable tunnel GSO for OVS bridge.
      net: Export xmit_recursion

Simon Horman (2):
      openvswitch: Free skb(s) on recirculation error
      openvswitch: Sample action without side effects

 include/linux/netdev_features.h      |   8 +++
 include/linux/netdevice.h            |   3 +
 include/uapi/linux/openvswitch.h     |  39 ++++++++++--
 net/core/dev.c                       |  10 +--
 net/openvswitch/actions.c            | 119 +++++++++++++++++++++++++++++++----
 net/openvswitch/datapath.c           |  79 ++++++++++++++++-------
 net/openvswitch/datapath.h           |   8 ++-
 net/openvswitch/flow.h               |   2 +
 net/openvswitch/flow_netlink.c       |  43 ++++++++++++-
 net/openvswitch/vport-internal_dev.c |   5 +-
 net/openvswitch/vport.c              | 102 +++++++++++++++++++++++++++++-
 net/openvswitch/vport.h              |  27 ++++++--
 12 files changed, 393 insertions(+), 52 deletions(-)
* [GIT net-next] Open vSwitch
From: Pravin B Shelar @ 2014-05-20 8:59 UTC
To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

A set of OVS changes for net-next/3.16.

Most of the changes are related to improving the performance of flow
setup by minimizing critical sections.

The following changes since commit 091b64868b43ed84334c6623ea6a08497529d4ff:

  Merge branch 'mlx4-next' (2014-05-22 17:17:34 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch.git master

for you to fetch changes up to 0c200ef94c9492205e18a18c25650cf27939889c:

  openvswitch: Simplify genetlink code. (2014-05-22 16:27:37 -0700)

----------------------------------------------------------------
Jarno Rajahalme (12):
      openvswitch: Compact sw_flow_key.
      openvswitch: Avoid assigning a NULL pointer to flow actions.
      openvswitch: Clarify locking.
      openvswitch: Build flow cmd netlink reply only if needed.
      openvswitch: Make flow mask removal symmetric.
      openvswitch: Minimize dp and vport critical sections.
      openvswitch: Fix typo.
      openvswitch: Fix ovs_flow_stats_get/clear RCU dereference.
      openvswitch: Reduce locking requirements.
      openvswitch: Minimize ovs_flow_cmd_del critical section.
      openvswitch: Split ovs_flow_cmd_new_or_set().
      openvswitch: Minimize ovs_flow_cmd_new|set critical sections.

Pravin B Shelar (1):
      openvswitch: Simplify genetlink code.

 include/uapi/linux/openvswitch.h |   4 +-
 net/openvswitch/datapath.c       | 771 +++++++++++++++++++++++----------------
 net/openvswitch/flow.c           |  53 ++-
 net/openvswitch/flow.h           |  35 +-
 net/openvswitch/flow_netlink.c   | 112 ++----
 net/openvswitch/flow_table.c     |  46 ++-
 6 files changed, 558 insertions(+), 463 deletions(-)

-- 
1.9.0
* Re: [GIT net-next] Open vSwitch
From: David Miller @ 2014-05-23 18:46 UTC
To: pshelar; +Cc: dev, netdev

From: Pravin B Shelar <pshelar@nicira.com>
Date: Tue, 20 May 2014 01:59:38 -0700

> A set of OVS changes for net-next/3.16.
>
> Most of change are related to improving performance of flow setup by
> minimizing critical sections.

Pulled, thanks Pravin.

In the future please make your postings so that they have the current
date and time when you make the postings, not when the commits went
into your tree.

Otherwise it messes up the order in which your changes appear in
patchwork wrt. other submissions.
* Re: [GIT net-next] Open vSwitch
From: Pravin Shelar @ 2014-05-23 20:16 UTC
To: David Miller; +Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, netdev

On Fri, May 23, 2014 at 11:46 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> From: Pravin B Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
> Date: Tue, 20 May 2014 01:59:38 -0700
>
>> A set of OVS changes for net-next/3.16.
>>
>> Most of change are related to improving performance of flow setup by
>> minimizing critical sections.
>
> Pulled, thanks Pravin.
>
> In the future please make your postings so that they have the current date
> and time when you make the postings, not when the commits went into your
> tree.
>
> Otherwise it messes up the order in which your changes appear in patchwork
> wrt. other submissions.

OK, I will keep it in mind.

Thanks,
Pravin.
* [GIT net-next] Open vSwitch
From: Jesse Gross @ 2014-05-16 21:07 UTC
To: David Miller; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

A set of OVS changes for net-next/3.16.

The major change here is a switch from per-CPU to per-NUMA flow
statistics. This improves scalability by reducing kernel overhead
in flow setup and maintenance.

The following changes since commit a188a54d11629bef2169052297e61f3767ca8ce5:

  macvlan: simplify the structure port (2014-05-15 23:35:16 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master

for you to fetch changes up to 944df8ae84d88f5e8eb027990dad2cfa4fbe4be5:

  net/openvswitch: Use with RCU_INIT_POINTER(x, NULL) in vport-gre.c (2014-05-16 13:40:29 -0700)

----------------------------------------------------------------
Daniele Di Proietto (4):
      openvswitch: use const in some local vars and casts
      openvswitch: avoid warnings in vport_from_priv
      openvswitch: avoid cast-qual warning in vport_priv
      openvswitch: Added (unsigned long long) cast in printf

Jarno Rajahalme (4):
      openvswitch: Remove 5-tuple optimization.
      openvswitch: Per NUMA node flow stats.
      openvswitch: Fix output of SCTP mask.
      openvswitch: Use TCP flags in the flow key for stats.

Joe Perches (3):
      openvswitch: Use net_ratelimit in OVS_NLERR
      openvswitch: flow_netlink: Use pr_fmt to OVS_NLERR output
      openvswitch: Use ether_addr_copy

Monam Agarwal (1):
      net/openvswitch: Use with RCU_INIT_POINTER(x, NULL) in vport-gre.c

 net/openvswitch/actions.c      |   4 +-
 net/openvswitch/datapath.c     |  11 ++-
 net/openvswitch/datapath.h     |   8 ++-
 net/openvswitch/flow.c         | 149 ++++++++++++++++++++++++-----------------
 net/openvswitch/flow.h         |  18 +++--
 net/openvswitch/flow_netlink.c |  82 +++++------------------
 net/openvswitch/flow_netlink.h |   1 -
 net/openvswitch/flow_table.c   |  75 ++++++++++++---------
 net/openvswitch/flow_table.h   |   4 +-
 net/openvswitch/vport-gre.c    |   2 +-
 net/openvswitch/vport.h        |   6 +-
 11 files changed, 176 insertions(+), 184 deletions(-)
* Re: [GIT net-next] Open vSwitch
From: David Miller @ 2014-05-16 21:22 UTC
To: jesse-l0M0P4e3n4LQT0dZR+AlfA
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

From: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
Date: Fri, 16 May 2014 14:07:27 -0700

> A set of OVS changes for net-next/3.16.
>
> The major change here is a switch from per-CPU to per-NUMA flow
> statistics. This improves scalability by reducing kernel overhead
> in flow setup and maintenance.

Pulled, thanks Jesse.
* [GIT net-next] Open vSwitch
From: Jesse Gross @ 2014-01-07 0:15 UTC
To: David Miller; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

Open vSwitch changes for net-next/3.14. Highlights are:
 * Performance improvements in the mechanism to get packets to userspace
   using memory mapped netlink and skb zero copy where appropriate.
 * Per-cpu flow stats in situations where flows are likely to be shared
   across CPUs. Standard flow stats are used in other situations to save
   memory and allocation time.
 * A handful of code cleanups and rationalization.

The following changes since commit 6ce4eac1f600b34f2f7f58f9cd8f0503d79e42ae:

  Linux 3.13-rc1 (2013-11-22 11:30:55 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master

for you to fetch changes up to 443cd88c8a31379e95326428bbbd40af25c1d440:

  ovs: make functions local (2014-01-06 15:54:39 -0800)

----------------------------------------------------------------
Andy Zhou (1):
      openvswitch: Change ovs_flow_tbl_lookup_xx() APIs

Ben Pfaff (2):
      openvswitch: Correct comment.
      openvswitch: Shrink sw_flow_mask by 8 bytes (64-bit) or 4 bytes (32-bit).

Daniel Borkmann (1):
      net: ovs: use kfree_rcu instead of rcu_free_{sw_flow_mask_cb,acts_callback}

Jesse Gross (1):
      openvswitch: Silence RCU lockdep checks from flow lookup.

Pravin B Shelar (1):
      openvswitch: Per cpu flow stats.

Stephen Hemminger (1):
      ovs: make functions local

Thomas Graf (9):
      genl: Add genlmsg_new_unicast() for unicast message allocation
      netlink: Avoid netlink mmap alloc if msg size exceeds frame size
      openvswitch: Enable memory mapped Netlink i/o
      net: Export skb_zerocopy() to zerocopy from one skb to another
      openvswitch: Allow user space to announce ability to accept unaligned Netlink messages
      openvswitch: Drop user features if old user space attempted to create datapath
      openvswitch: Pass datapath into userspace queue functions
      openvswitch: Use skb_zerocopy() for upcall
      openvswitch: Compute checksum in skb_gso_segment() if needed

Wei Yongjun (1):
      openvswitch: remove duplicated include from flow_table.c

 include/linux/skbuff.h               |   3 +
 include/net/genetlink.h              |   4 +
 include/uapi/linux/openvswitch.h     |  14 ++-
 net/core/skbuff.c                    |  85 +++++++++++++
 net/netfilter/nfnetlink_queue_core.c |  59 +--------
 net/netlink/af_netlink.c             |   4 +
 net/netlink/genetlink.c              |  21 ++++
 net/openvswitch/datapath.c           | 231 +++++++++++++++++++----------------
 net/openvswitch/datapath.h           |   6 +-
 net/openvswitch/flow.c               |  96 +++++++++++++--
 net/openvswitch/flow.h               |  33 +++--
 net/openvswitch/flow_netlink.c       |  66 ++++++++--
 net/openvswitch/flow_netlink.h       |   1 +
 net/openvswitch/flow_table.c         |  60 ++++++---
 net/openvswitch/flow_table.h         |   6 +-
 net/openvswitch/vport.c              |   6 +-
 net/openvswitch/vport.h              |   1 -
 17 files changed, 483 insertions(+), 213 deletions(-)
* Re: [GIT net-next] Open vSwitch
From: David Miller @ 2014-01-07 0:49 UTC
To: jesse-l0M0P4e3n4LQT0dZR+AlfA
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

From: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
Date: Mon, 6 Jan 2014 16:15:59 -0800

> Open vSwitch changes for net-next/3.14. Highlights are:
> * Performance improvements in the mechanism to get packets to userspace
>   using memory mapped netlink and skb zero copy where appropriate.
> * Per-cpu flow stats in situations where flows are likely to be shared
>   across CPUs. Standard flow stats are used in other situations to save
>   memory and allocation time.
> * A handful of code cleanups and rationalization.

Lots of good stuff in here, pulled, thanks Jesse.
* Re: [ovs-dev] [GIT net-next] Open vSwitch
From: Zoltan Kiss @ 2014-01-08 14:49 UTC
To: Jesse Gross, David Miller; +Cc: dev, netdev

Hi,

I've tried the latest net-next on a Xenserver install with 1.9.3
userspace, and it seems this patch series broke it (at least after
reverting that locally it works now). I haven't gone too far yet
checking what's the problem, but it seems the xenbrX device doesn't
really receive too much of the traffic coming through the NIC. Is it
expected?

Regards,

Zoli

On 07/01/14 00:15, Jesse Gross wrote:
> Open vSwitch changes for net-next/3.14. Highlights are:
>  * Performance improvements in the mechanism to get packets to userspace
>    using memory mapped netlink and skb zero copy where appropriate.
>  * Per-cpu flow stats in situations where flows are likely to be shared
>    across CPUs. Standard flow stats are used in other situations to save
>    memory and allocation time.
>  * A handful of code cleanups and rationalization.
>
> The following changes since commit 6ce4eac1f600b34f2f7f58f9cd8f0503d79e42ae:
>
>   Linux 3.13-rc1 (2013-11-22 11:30:55 -0800)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master
>
> for you to fetch changes up to 443cd88c8a31379e95326428bbbd40af25c1d440:
>
>   ovs: make functions local (2014-01-06 15:54:39 -0800)
>
> ----------------------------------------------------------------
> Andy Zhou (1):
>       openvswitch: Change ovs_flow_tbl_lookup_xx() APIs
>
> Ben Pfaff (2):
>       openvswitch: Correct comment.
>       openvswitch: Shrink sw_flow_mask by 8 bytes (64-bit) or 4 bytes (32-bit).
>
> Daniel Borkmann (1):
>       net: ovs: use kfree_rcu instead of rcu_free_{sw_flow_mask_cb,acts_callback}
>
> Jesse Gross (1):
>       openvswitch: Silence RCU lockdep checks from flow lookup.
>
> Pravin B Shelar (1):
>       openvswitch: Per cpu flow stats.
>
> Stephen Hemminger (1):
>       ovs: make functions local
>
> Thomas Graf (9):
>       genl: Add genlmsg_new_unicast() for unicast message allocation
>       netlink: Avoid netlink mmap alloc if msg size exceeds frame size
>       openvswitch: Enable memory mapped Netlink i/o
>       net: Export skb_zerocopy() to zerocopy from one skb to another
>       openvswitch: Allow user space to announce ability to accept unaligned Netlink messages
>       openvswitch: Drop user features if old user space attempted to create datapath
>       openvswitch: Pass datapath into userspace queue functions
>       openvswitch: Use skb_zerocopy() for upcall
>       openvswitch: Compute checksum in skb_gso_segment() if needed
>
> Wei Yongjun (1):
>       openvswitch: remove duplicated include from flow_table.c
>
>  include/linux/skbuff.h               |   3 +
>  include/net/genetlink.h              |   4 +
>  include/uapi/linux/openvswitch.h     |  14 ++-
>  net/core/skbuff.c                    |  85 +++++++++++++
>  net/netfilter/nfnetlink_queue_core.c |  59 +--------
>  net/netlink/af_netlink.c             |   4 +
>  net/netlink/genetlink.c              |  21 ++++
>  net/openvswitch/datapath.c           | 231 +++++++++++++++++++----------------
>  net/openvswitch/datapath.h           |   6 +-
>  net/openvswitch/flow.c               |  96 +++++++++++++--
>  net/openvswitch/flow.h               |  33 +++--
>  net/openvswitch/flow_netlink.c       |  66 ++++++++--
>  net/openvswitch/flow_netlink.h       |   1 +
>  net/openvswitch/flow_table.c         |  60 ++++++---
>  net/openvswitch/flow_table.h         |   6 +-
>  net/openvswitch/vport.c              |   6 +-
>  net/openvswitch/vport.h              |   1 -
>  17 files changed, 483 insertions(+), 213 deletions(-)
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>
* Re: [GIT net-next] Open vSwitch
From: Jesse Gross @ 2014-01-08 15:10 UTC
To: Zoltan Kiss
Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, netdev, David Miller

On Wed, Jan 8, 2014 at 9:49 AM, Zoltan Kiss <zoltan.kiss-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org> wrote:
> Hi,
>
> I've tried the latest net-next on a Xenserver install with 1.9.3 userspace,
> and it seems this patch series broke it (at least after reverting that
> locally it works now). I haven't gone too far yet checking what's the
> problem, but it seems the xenbrX device doesn't really receive too much of
> the traffic coming through the NIC. Is it expected?

What do you mean by doesn't receive too much traffic? What does it get?
* Re: [ovs-dev] [GIT net-next] Open vSwitch
From: Zoltan Kiss @ 2014-01-13 18:04 UTC
To: Jesse Gross; +Cc: David Miller, dev@openvswitch.org, netdev

On 08/01/14 15:10, Jesse Gross wrote:
> On Wed, Jan 8, 2014 at 9:49 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote:
>> Hi,
>>
>> I've tried the latest net-next on a Xenserver install with 1.9.3 userspace,
>> and it seems this patch series broke it (at least after reverting that
>> locally it works now). I haven't gone too far yet checking what's the
>> problem, but it seems the xenbrX device doesn't really receive too much of
>> the traffic coming through the NIC. Is it expected?
>
> What do you mean by doesn't receive too much traffic? What does it get?
>

Sorry for the vague error description, now I had more time to look into
this. I think the problem boils down to this:

Jan 13 17:55:07 localhost ovs-vswitchd: 07890|netlink_socket|DBG|nl_sock_recv__ (Success): nl(len:274, type=29(ovs_packet), flags=0, seq=0, pid=0,genl(cmd=1,version=1)
Jan 13 17:55:07 localhost ovs-vswitchd: 07891|netlink|DBG|attributes followed by garbage
Jan 13 17:55:07 localhost ovs-vswitchd: 07892|dpif|WARN|system@xenbr0: recv failed (Invalid argument)

That just keeps repeating. I'll keep looking.

Zoli
* Re: [GIT net-next] Open vSwitch
From: Zoltan Kiss @ 2014-01-14 0:31 UTC
To: Jesse Gross
Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, netdev, David Miller

On 13/01/14 18:04, Zoltan Kiss wrote:
> On 08/01/14 15:10, Jesse Gross wrote:
>> On Wed, Jan 8, 2014 at 9:49 AM, Zoltan Kiss <zoltan.kiss-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
>> wrote:
>>> Hi,
>>>
>>> I've tried the latest net-next on a Xenserver install with 1.9.3
>>> userspace,
>>> and it seems this patch series broke it (at least after reverting that
>>> locally it works now). I haven't gone too far yet checking what's the
>>> problem, but it seems the xenbrX device doesn't really receive too
>>> much of
>>> the traffic coming through the NIC. Is it expected?
>>
>> What do you mean by doesn't receive too much traffic? What does it get?
>>
>
> Sorry for the vague error description, now I had more time to look
> into this. I think the problem boils down to this:
>
> Jan 13 17:55:07 localhost ovs-vswitchd:
> 07890|netlink_socket|DBG|nl_sock_recv__ (Success): nl(len:274,
> type=29(ovs_packet), flags=0, seq=0, pid=0,genl(cmd=1,version=1)
> Jan 13 17:55:07 localhost ovs-vswitchd: 07891|netlink|DBG|attributes
> followed by garbage
> Jan 13 17:55:07 localhost ovs-vswitchd: 07892|dpif|WARN|system@xenbr0:
> recv failed (Invalid argument)
>
> That just keeps repeating. I'll keep looking.

Now I reverted these top 3 commits:

ovs: make functions local
openvswitch: Compute checksum in skb_gso_segment() if needed
openvswitch: Use skb_zerocopy() for upcall

And it works. I guess the last one is causing the problem. Might be an
important factor, I'm using 32-bit Dom0.

Zoli
* Re: [GIT net-next] Open vSwitch
From: Jesse Gross @ 2014-01-14 1:30 UTC
To: Zoltan Kiss
Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, netdev, David Miller

On Mon, Jan 13, 2014 at 4:31 PM, Zoltan Kiss <zoltan.kiss-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org> wrote:
> On 13/01/14 18:04, Zoltan Kiss wrote:
>>
>> On 08/01/14 15:10, Jesse Gross wrote:
>>>
>>> On Wed, Jan 8, 2014 at 9:49 AM, Zoltan Kiss <zoltan.kiss-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I've tried the latest net-next on a Xenserver install with 1.9.3
>>>> userspace,
>>>> and it seems this patch series broke it (at least after reverting that
>>>> locally it works now). I haven't gone too far yet checking what's the
>>>> problem, but it seems the xenbrX device doesn't really receive too much
>>>> of
>>>> the traffic coming through the NIC. Is it expected?
>>>
>>>
>>> What do you mean by doesn't receive too much traffic? What does it get?
>>>
>>
>> Sorry for the vague error description, now I had more time to look into
>> this. I think the problem boils down to this:
>>
>> Jan 13 17:55:07 localhost ovs-vswitchd:
>> 07890|netlink_socket|DBG|nl_sock_recv__ (Success): nl(len:274,
>> type=29(ovs_packet), flags=0, seq=0, pid=0,genl(cmd=1,version=1)
>> Jan 13 17:55:07 localhost ovs-vswitchd: 07891|netlink|DBG|attributes
>> followed by garbage
>> Jan 13 17:55:07 localhost ovs-vswitchd: 07892|dpif|WARN|system@xenbr0:
>> recv failed (Invalid argument)
>>
>> That just keeps repeating. I'll keep looking.
>
>
> Now I reverted these top 3 commits:
>
> ovs: make functions local
>
> openvswitch: Compute checksum in skb_gso_segment() if needed
> openvswitch: Use skb_zerocopy() for upcall
>
> And it works. I guess the last one is causing the problem. Might be an
> important factor, I'm using 32-bit Dom0.

I think you're probably right. Thomas - can you take a look?

We shouldn't be doing any zerocopy in this situation but it looks to
me like we don't do any padding at all, even in situations where we
are copying the data.
* Re: [GIT net-next] Open vSwitch
From: Thomas Graf @ 2014-01-14 9:46 UTC
To: Jesse Gross, Zoltan Kiss
Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, netdev, David Miller

On 01/14/2014 02:30 AM, Jesse Gross wrote:
>> And it works. I guess the last one is causing the problem. Might be an
>> important factor, I'm using 32-bit Dom0.
>
> I think you're probably right. Thomas - can you take a look?
>
> We shouldn't be doing any zerocopy in this situation but it looks to
> me like we don't do any padding at all, even in situations where we
> are copying the data.

I'm looking into this now. The zerocopy method should only be
attempted if user space has announced the ability to receive
unaligned messages.

@Zoltan: I assume you are using an unmodified OVS 1.9.3?
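The padding issue raised in this sub-thread can be illustrated with a small Python sketch of netlink attribute framing. Each attribute starts with a 4-byte header (16-bit `nla_len`, 16-bit `nla_type`) and the next attribute begins at the following 4-byte boundary. The parser below is a hypothetical strict reader written for this example, not the ovs-vswitchd code; that the 1.9.3 userspace rejects messages for exactly this reason is an assumption. It shows how an unpadded final attribute leaves bytes that a strict reader cannot consume, which it would report much like the "attributes followed by garbage" log line quoted earlier.

```python
import struct

NLA_HDRLEN = 4   # struct nlattr: u16 nla_len, u16 nla_type
NLA_ALIGNTO = 4  # attributes start on 4-byte boundaries


def nla_align(length):
    """Round `length` up to the next multiple of NLA_ALIGNTO."""
    return (length + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1)


def put_attr(nla_type, payload, pad=True):
    """Encode one attribute; `pad=False` models a sender that skips padding."""
    nla_len = NLA_HDRLEN + len(payload)
    attr = struct.pack("=HH", nla_len, nla_type) + payload
    if pad:
        attr += b"\0" * (nla_align(nla_len) - nla_len)
    return attr


def parse_attrs(buf):
    """Hypothetical strict parser: every attribute must fit with its padding."""
    attrs = {}
    offset = 0
    while len(buf) - offset >= NLA_HDRLEN:
        nla_len, nla_type = struct.unpack_from("=HH", buf, offset)
        remaining = len(buf) - offset
        if nla_len < NLA_HDRLEN or nla_align(nla_len) > remaining:
            break
        attrs[nla_type] = buf[offset + NLA_HDRLEN:offset + nla_len]
        offset += nla_align(nla_len)
    if offset != len(buf):
        raise ValueError("attributes followed by garbage")
    return attrs


# Attribute type numbers 1 and 2 are arbitrary values chosen for the example.
padded = put_attr(1, b"abcde") + put_attr(2, b"xy")
unpadded = put_attr(2, b"xy") + put_attr(1, b"abcde", pad=False)

ok = parse_attrs(padded)
try:
    parse_attrs(unpadded)
    rejected = False
except ValueError:
    rejected = True
```

A reader that has announced support for unaligned messages would instead accept the final attribute when `nla_len` alone fits in the remaining bytes.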
* [GIT net-next] Open vSwitch
From: Jesse Gross @ 2013-11-02 7:43 UTC
To: David Miller; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

A set of updates for net-next/3.13. Major changes are:
 * Restructure flow handling code to be more logically organized and
   easier to read.
 * Rehashing of the flow table is moved from a workqueue to flow
   installation time. Before, heavy load could block the workqueue for
   excessive periods of time.
 * Additional debugging information is provided to help diagnose megaflows.
 * It's now possible to match on TCP flags.

The following changes since commit 272b98c6455f00884f0350f775c5342358ebb73f:

  Linux 3.12-rc1 (2013-09-16 16:17:51 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master

for you to fetch changes up to 8ddd094675cfd453fc9838caa46ea108a4107183:

  openvswitch: Use flow hash during flow lookup operation. (2013-11-01 18:43:46 -0700)

----------------------------------------------------------------
Andy Zhou (1):
      openvswitch: collect mega flow mask stats

Jarno Rajahalme (2):
      openvswitch: Widen TCP flags handling.
      openvswitch: TCP flags matching support.

Pravin B Shelar (6):
      openvswitch: Move flow table rehashing to flow install.
      openvswitch: Restructure datapath.c and flow.c
      openvswitch: Move mega-flow list out of rehashing struct.
      openvswitch: Simplify mega-flow APIs.
      openvswitch: Enable all GSO features on internal port.
      openvswitch: Use flow hash during flow lookup operation.

Wei Yongjun (2):
      openvswitch: remove duplicated include from vport-vxlan.c
      openvswitch: remove duplicated include from vport-gre.c

 include/uapi/linux/openvswitch.h     |   18 +-
 net/openvswitch/Makefile             |    2 +
 net/openvswitch/datapath.c           |  668 ++------------
 net/openvswitch/datapath.h           |    9 +-
 net/openvswitch/flow.c               | 1605 +--------------------------------
 net/openvswitch/flow.h               |  132 +--
 net/openvswitch/flow_netlink.c       | 1630 ++++++++++++++++++++++++++++++++++
 net/openvswitch/flow_netlink.h       |   60 ++
 net/openvswitch/flow_table.c         |  592 ++++++++++++
 net/openvswitch/flow_table.h         |   81 ++
 net/openvswitch/vport-gre.c          |    2 -
 net/openvswitch/vport-internal_dev.c |    2 +-
 net/openvswitch/vport-vxlan.c        |    1 -
 13 files changed, 2511 insertions(+), 2291 deletions(-)
 create mode 100644 net/openvswitch/flow_netlink.c
 create mode 100644 net/openvswitch/flow_netlink.h
 create mode 100644 net/openvswitch/flow_table.c
 create mode 100644 net/openvswitch/flow_table.h
* Re: [GIT net-next] Open vSwitch
From: David Miller @ 2013-11-04 21:26 UTC
To: jesse; +Cc: netdev, dev

From: Jesse Gross <jesse@nicira.com>
Date: Sat, 2 Nov 2013 00:43:39 -0700

> A set of updates for net-next/3.13. Major changes are:
> * Restructure flow handling code to be more logically organized and
>   easier to read.
> * Rehashing of the flow table is moved from a workqueue to flow
>   installation time. Before, heavy load could block the workqueue for
>   excessive periods of time.
> * Additional debugging information is provided to help diagnose megaflows.
> * It's now possible to match on TCP flags.

Looks good, pulled, thanks Jesse.
* [GIT net-next] Open vSwitch
From: Jesse Gross @ 2013-10-30 0:22 UTC
To: David Miller; +Cc: netdev, dev

A set of updates for net-next/3.13. Major changes are:
 * Restructure flow handling code to be more logically organized and
   easier to read.
 * Previously flow state was effectively per-CPU but this is no longer true
   with the addition of wildcarded flows (megaflows). While good for flow
   setup rates, it is bad for stats updates. Stats are now per-CPU again
   to get the best of both worlds.
 * Rehashing of the flow table is moved from a workqueue to flow
   installation time. Before, heavy load could block the workqueue for
   excessive periods of time.
 * Additional debugging information is provided to help diagnose megaflows.
 * It's now possible to match on TCP flags.

The following changes since commit 272b98c6455f00884f0350f775c5342358ebb73f:

  Linux 3.12-rc1 (2013-09-16 16:17:51 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master

for you to fetch changes up to c7e8b9659587449b21ee68d7aee0cacadf650cce:

  openvswitch: TCP flags matching support. (2013-10-23 02:19:50 -0700)

----------------------------------------------------------------
Andy Zhou (1):
      openvswitch: collect mega flow mask stats

Jarno Rajahalme (2):
      openvswitch: Widen TCP flags handling.
      openvswitch: TCP flags matching support.

Pravin B Shelar (6):
      openvswitch: Move flow table rehashing to flow install.
      openvswitch: Restructure datapath.c and flow.c
      openvswitch: Move mega-flow list out of rehashing struct.
      openvswitch: Simplify mega-flow APIs.
      openvswitch: Per cpu flow stats.
      openvswitch: Enable all GSO features on internal port.

Wei Yongjun (2):
      openvswitch: remove duplicated include from vport-vxlan.c
      openvswitch: remove duplicated include from vport-gre.c

 include/uapi/linux/openvswitch.h     |   18 +-
 net/openvswitch/Makefile             |    2 +
 net/openvswitch/datapath.c           |  721 ++-------------
 net/openvswitch/datapath.h           |    9 +-
 net/openvswitch/flow.c               | 1631 ++--------------------------------
 net/openvswitch/flow.h               |  148 +--
 net/openvswitch/flow_netlink.c       | 1630 +++++++++++++++++++++++++++++++++
 net/openvswitch/flow_netlink.h       |   60 ++
 net/openvswitch/flow_table.c         |  600 +++++++++++++
 net/openvswitch/flow_table.h         |   81 ++
 net/openvswitch/vport-gre.c          |    2 -
 net/openvswitch/vport-internal_dev.c |    2 +-
 net/openvswitch/vport-vxlan.c        |    1 -
 13 files changed, 2590 insertions(+), 2315 deletions(-)
 create mode 100644 net/openvswitch/flow_netlink.c
 create mode 100644 net/openvswitch/flow_netlink.h
 create mode 100644 net/openvswitch/flow_table.c
 create mode 100644 net/openvswitch/flow_table.h

* [GIT net-next] Open vSwitch
From: Jesse Gross @ 2013-08-27 20:20 UTC
To: David Miller; +Cc: netdev, dev

A number of significant new features and optimizations for net-next/3.12.
Highlights are:
 * "Megaflows", an optimization that allows userspace to specify which
   flow fields were used to compute the results of the flow lookup.
   This allows for a major reduction in flow setups (the major
   performance bottleneck in Open vSwitch) without reducing flexibility.
 * Converting netlink dump operations to use RCU, allowing for
   additional parallelism in userspace.
 * Matching and modifying SCTP protocol fields.

The following changes since commit 2771399ac9986c75437a83b1c723493cfcdfa439:

  fs_enet: cleanup clock API use (2013-08-22 22:13:54 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master

for you to fetch changes up to 5828cd9a68873df1340b420371c02c47647878fb:

  openvswitch: optimize flow compare and mask functions (2013-08-27 13:13:09 -0700)

----------------------------------------------------------------
Andy Zhou (3):
      openvswitch: Mega flow implementation
      openvswitch: Rename key_len to key_end
      openvswitch: optimize flow compare and mask functions

Cong Wang (1):
      openvswitch: check CONFIG_OPENVSWITCH_GRE in makefile

Jiri Pirko (1):
      openvswitch:: link upper device for port devices

Joe Stringer (2):
      net: Add NEXTHDR_SCTP to ipv6.h
      openvswitch: Add SCTP support

Justin Pettit (1):
      openvswitch: Fix argument descriptions in vport.c.

Pravin B Shelar (3):
      openvswitch: Use RCU lock for flow dump operation.
      openvswitch: Use RCU lock for dp dump operation.
      openvswitch: Use non rcu hlist_del() flow table entry.

 Documentation/networking/openvswitch.txt |   40 +
 include/net/ipv6.h                       |    1 +
 include/uapi/linux/openvswitch.h         |   15 +-
 net/openvswitch/Kconfig                  |    1 +
 net/openvswitch/Makefile                 |    5 +-
 net/openvswitch/actions.c                |   45 +-
 net/openvswitch/datapath.c               |  176 ++--
 net/openvswitch/datapath.h               |    6 +
 net/openvswitch/flow.c                   | 1485 +++++++++++++++++++++---------
 net/openvswitch/flow.h                   |   89 +-
 net/openvswitch/vport-gre.c              |    3 -
 net/openvswitch/vport-netdev.c           |   20 +-
 net/openvswitch/vport.c                  |    3 +-
 13 files changed, 1346 insertions(+), 543 deletions(-)

* Re: [GIT net-next] Open vSwitch
From: David Miller @ 2013-08-28 2:11 UTC
To: jesse; +Cc: netdev, dev

From: Jesse Gross <jesse@nicira.com>
Date: Tue, 27 Aug 2013 13:20:37 -0700

> A number of significant new features and optimizations for net-next/3.12.
> Highlights are:
>  * "Megaflows", an optimization that allows userspace to specify which
>    flow fields were used to compute the results of the flow lookup.
>    This allows for a major reduction in flow setups (the major
>    performance bottleneck in Open vSwitch) without reducing flexibility.
>  * Converting netlink dump operations to use RCU, allowing for
>    additional parallelism in userspace.
>  * Matching and modifying SCTP protocol fields.

Pulled, thanks Jesse.

* [GIT net-next] Open vSwitch
From: Jesse Gross @ 2013-06-14 22:28 UTC
To: David Miller; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev

A few miscellaneous improvements and cleanups before the GRE tunnel
integration series. Intended for net-next/3.11.

The following changes since commit f722406faae2d073cc1d01063d1123c35425939e:

  Linux 3.10-rc1 (2013-05-11 17:14:08 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master

for you to fetch changes up to 93d8fd1514b6862c3370ea92be3f3b4216e0bf8f:

  openvswitch: Simplify interface ovs_flow_metadata_from_nlattrs() (2013-06-14 15:09:12 -0700)

----------------------------------------------------------------
Andy Hill (1):
      openvswitch: Fix misspellings in comments and docs.

Jesse Gross (2):
      openvswitch: Immediately exit on error in ovs_vport_cmd_set().
      openvswitch: Remove unused get_config vport op.

Lorand Jakab (1):
      openvswitch: fix variable names in comment

Pravin B Shelar (4):
      openvswitch: Unify vport error stats handling.
      openvswitch: Fix struct comment.
      openvswitch: make skb->csum consistent with rest of networking stack.
      openvswitch: Simplify interface ovs_flow_metadata_from_nlattrs()

 include/uapi/linux/openvswitch.h     |  1 -
 net/openvswitch/actions.c            |  4 ++++
 net/openvswitch/datapath.c           | 17 ++++++++---------
 net/openvswitch/flow.c               | 29 +++++++++++++++--------------
 net/openvswitch/flow.h               |  4 ++--
 net/openvswitch/vport-internal_dev.c |  1 +
 net/openvswitch/vport-netdev.c       |  7 ++++---
 net/openvswitch/vport-netdev.h       |  1 -
 net/openvswitch/vport.c              | 11 ++++++++---
 net/openvswitch/vport.h              | 13 +++++++++----
 10 files changed, 51 insertions(+), 37 deletions(-)

* Re: [GIT net-next] Open vSwitch
From: David Miller @ 2013-06-14 22:34 UTC
To: jesse; +Cc: netdev, dev

From: Jesse Gross <jesse@nicira.com>
Date: Fri, 14 Jun 2013 15:28:49 -0700

> A few miscellaneous improvements and cleanups before the GRE tunnel
> integration series. Intended for net-next/3.11.
 ...
> git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master

Pulled, thanks Jesse.

* [GIT net-next] Open vSwitch
From: Jesse Gross @ 2013-04-16 21:00 UTC
To: David Miller; +Cc: netdev, dev

A number of improvements for net-next/3.10.

Highlights include:
 * Properly exposing linux/openvswitch.h to userspace after the uapi changes.
 * Simplification of locking. It immediately makes things simpler to reason about and avoids holding RTNL mutex for longer than necessary. In the near future it will also enable tunnel registration and more fine-grained locking.
 * Miscellaneous cleanups and simplifications.

The following changes since commit f498354793d57479d4e1b0f39969acd66737234c:

  qlcnic: Bump up the version to 5.2.39 (2013-03-29 15:51:06 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master

for you to fetch changes up to e0f0ecf33c3f13401f90bff5afdc3ed1bb40b9af:

  openvswitch: Use generic struct pcpu_tstats. (2013-04-15 14:56:25 -0700)

----------------------------------------------------------------
Andy Zhou (1):
      openvswitch: datapath.h: Fix a stale comment.

Pravin B Shelar (2):
      openvswitch: Simplify datapath locking.
      openvswitch: Use generic struct pcpu_tstats.

Thomas Graf (7):
      openvswitch: Specify the minimal length of OVS_PACKET_ATTR_PACKET in the policy
      openvswitch: Use nla_memcpy() to memcpy() data from attributes
      openvswitch: Refine Netlink message size calculation and kill FLOW_BUFSIZE
      openvswitch: Move common genl notify code into ovs_notify()
      openvswitch: Use ETH_ALEN to define ethernet addresses
      openvswitch: Expose <linux/openvswitch.h> to userspace
      openvswitch: Don't insert empty OVS_VPORT_ATTR_OPTIONS attribute

 include/linux/openvswitch.h          | 432 +-------------------------------
 include/uapi/linux/Kbuild            |   1 +
 include/uapi/linux/openvswitch.h     | 456 ++++++++++++++++++++++++++++++++++
 net/openvswitch/datapath.c           | 393 +++++++++++++++++------------
 net/openvswitch/datapath.h           |  70 ++++--
 net/openvswitch/dp_notify.c          |  82 ++++--
 net/openvswitch/flow.c               |   2 +-
 net/openvswitch/flow.h               |  21 --
 net/openvswitch/vport-internal_dev.c |   6 +
 net/openvswitch/vport-netdev.c       |   8 +-
 net/openvswitch/vport.c              |  58 +++--
 net/openvswitch/vport.h              |  15 +-
 12 files changed, 849 insertions(+), 695 deletions(-)
 create mode 100644 include/uapi/linux/openvswitch.h

* Re: [GIT net-next] Open vSwitch
From: David Miller @ 2013-04-17 17:31 UTC
To: jesse; +Cc: netdev, dev

From: Jesse Gross <jesse@nicira.com>
Date: Tue, 16 Apr 2013 14:00:09 -0700

> A number of improvements for net-next/3.10.
>
> Highlights include:
>  * Properly exposing linux/openvswitch.h to userspace after the uapi changes.
>  * Simplification of locking. It immediately makes things simpler to reason about and avoids holding RTNL mutex for longer than necessary. In the near future it will also enable tunnel registration and more fine-grained locking.
>  * Miscellaneous cleanups and simplifications.

Pulled, but please don't make your email look so silly with those 500
character long lines, make them not exceed 80 columns.

Thanks.

* [GIT net-next] Open vSwitch
From: Jesse Gross @ 2013-03-15 17:38 UTC
To: David Miller; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

A couple of minor enhancements for net-next/3.10. The largest is an
extension to allow variable length metadata to be passed to userspace
with packets.

There is a merge conflict in net/openvswitch/vport-internal_dev.c:
An existing commit modifies internal_dev_mac_addr() and a new commit
deletes it. The new one is correct, so you can just remove that function.

The following changes since commit a5a81f0b9025867efb999d14a8dfc1907c5a4c3b:

  ipv6: Fix default route failover when CONFIG_IPV6_ROUTER_PREF=n (2012-12-03 15:34:47 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master

for you to fetch changes up to 4490108b4a5ada14c7be712260829faecc814ae5:

  openvswitch: Allow OVS_USERSPACE_ATTR_USERDATA to be variable length. (2013-02-22 16:29:22 -0800)

----------------------------------------------------------------
Ben Pfaff (1):
      openvswitch: Allow OVS_USERSPACE_ATTR_USERDATA to be variable length.

Jarno Rajahalme (2):
      linux/openvswitch.h: Make OVSP_LOCAL 32-bit.
      openvswitch: Change ENOENT return value to ENODEV in lookup_vport().

Thomas Graf (2):
      openvswitch: Use eth_mac_addr() instead of duplicating it
      openvswitch: Avoid useless holes in struct vport

 include/linux/openvswitch.h          | 13 +++++++------
 net/openvswitch/datapath.c           | 13 +++++++------
 net/openvswitch/datapath.h           |  2 +-
 net/openvswitch/vport-internal_dev.c | 14 ++------------
 net/openvswitch/vport.h              |  4 ++--
 5 files changed, 19 insertions(+), 27 deletions(-)

* Re: [GIT net-next] Open vSwitch
From: David Miller @ 2013-03-17 16:59 UTC
To: jesse; +Cc: netdev, dev

From: Jesse Gross <jesse@nicira.com>
Date: Fri, 15 Mar 2013 10:38:46 -0700

> A couple of minor enhancements for net-next/3.10. The largest is an
> extension to allow variable length metadata to be passed to userspace
> with packets.
>
> There is a merge conflict in net/openvswitch/vport-internal_dev.c:
> An existing commit modifies internal_dev_mac_addr() and a new commit
> deletes it. The new one is correct, so you can just remove that function.

Pulled, thanks Jesse.

Thanks, in particular, for the heads up about the merge conflict.

* [GIT net-next] Open vSwitch
From: Jesse Gross @ 2012-11-29 18:35 UTC
To: David Miller; +Cc: netdev, dev

This series of improvements for 3.8/net-next contains four components:
 * Support for modifying IPv6 headers
 * Support for matching and setting skb->mark for better integration with
   things like iptables
 * Ability to recognize the EtherType for RARP packets
 * Two small performance enhancements

The movement of ipv6_find_hdr() into exthdrs_core.c causes two small merge
conflicts. I left it as is but can do the merge if you want. The conflicts
are:
 * ipv6_find_hdr() and ipv6_find_tlv() were both moved to the bottom of
   exthdrs_core.c. Both should stay.
 * A new use of ipv6_find_hdr() was added to net/netfilter/ipvs/ip_vs_core.c
   after this patch. The IPVS user has two instances of the old constant
   name IP6T_FH_F_FRAG which has been renamed to IP6_FH_F_FRAG.

The following changes since commit d04d382980c86bdee9960c3eb157a73f8ed230cc:

  openvswitch: Store flow key len if ARP opcode is not request or reply. (2012-10-30 17:17:09 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master

for you to fetch changes up to 92eb1d477145b2e7780b5002e856f70b8c3d74da:

  openvswitch: Use RCU callback when detaching netdevices. (2012-11-28 14:04:34 -0800)

----------------------------------------------------------------
Ansis Atteka (3):
      ipv6: improve ipv6_find_hdr() to skip empty routing headers
      openvswitch: add ipv6 'set' action
      openvswitch: add skb mark matching and set action

Jesse Gross (2):
      ipv6: Move ipv6_find_hdr() out of Netfilter code.
      openvswitch: Use RCU callback when detaching netdevices.

Mehak Mahajan (1):
      openvswitch: Process RARP packets with ethertype 0x8035 similar to ARP packets.

Shan Wei (1):
      net: openvswitch: use this_cpu_ptr per-cpu helper

 include/linux/netfilter_ipv6/ip6_tables.h |   9 ---
 include/linux/openvswitch.h               |   1 +
 include/net/ipv6.h                        |  10 +++
 net/ipv6/exthdrs_core.c                   | 123 +++++++++++++++++++++++++++++
 net/ipv6/netfilter/ip6_tables.c           | 103 ------------------------
 net/netfilter/xt_HMARK.c                  |   8 +-
 net/openvswitch/actions.c                 |  97 +++++++++++++++++++++++
 net/openvswitch/datapath.c                |  27 ++++++-
 net/openvswitch/flow.c                    |  28 ++++++-
 net/openvswitch/flow.h                    |   8 +-
 net/openvswitch/vport-netdev.c            |  14 +++-
 net/openvswitch/vport-netdev.h            |   3 +
 net/openvswitch/vport.c                   |   5 +-
 13 files changed, 304 insertions(+), 132 deletions(-)

* Re: [GIT net-next] Open vSwitch
From: David Miller @ 2012-11-30 17:03 UTC
To: jesse; +Cc: netdev, dev

From: Jesse Gross <jesse@nicira.com>
Date: Thu, 29 Nov 2012 10:35:42 -0800

> This series of improvements for 3.8/net-next contains four components:
>  * Support for modifying IPv6 headers
>  * Support for matching and setting skb->mark for better integration with
>    things like iptables
>  * Ability to recognize the EtherType for RARP packets
>  * Two small performance enhancements
>
> The movement of ipv6_find_hdr() into exthdrs_core.c causes two small merge
> conflicts. I left it as is but can do the merge if you want. The conflicts
> are:
>  * ipv6_find_hdr() and ipv6_find_tlv() were both moved to the bottom of
>    exthdrs_core.c. Both should stay.
>  * A new use of ipv6_find_hdr() was added to net/netfilter/ipvs/ip_vs_core.c
>    after this patch. The IPVS user has two instances of the old constant
>    name IP6T_FH_F_FRAG which has been renamed to IP6_FH_F_FRAG.

Pulled, thanks Jesse.

The merge conflict directions were particularly helpful.

If you ever do the merge yourself (I'm ambivalent about where you or
I do it), make sure you force the merge commit message to have a
description of the conflict resolution similarly to what you provided
here.

Thanks again.

* [GIT net-next] Open vSwitch
From: Jesse Gross @ 2012-07-20 22:26 UTC
To: David Miller; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

A few bug fixes and small enhancements for net-next/3.6.

The following changes since commit bf32fecdc1851ad9ca960f56771b798d17c26cf1:

  openvswitch: Add length check when retrieving TCP flags. (2012-04-02 14:28:57 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master

for you to fetch changes up to efaac3bf087b1a6cec28f2a041e01c874d65390c:

  openvswitch: Fix typo in documentation. (2012-07-20 14:51:07 -0700)

----------------------------------------------------------------
Ansis Atteka (1):
      openvswitch: Do not send notification if ovs_vport_set_options() failed

Ben Pfaff (1):
      openvswitch: Check gso_type for correct sk_buff in queue_gso_packets().

Jesse Gross (2):
      openvswitch: Enable retrieval of TCP flags from IPv6 traffic.
      openvswitch: Reset upper layer protocol info on internal devices.

Leo Alterman (1):
      openvswitch: Fix typo in documentation.

Pravin B Shelar (1):
      openvswitch: Check currect return value from skb_gso_segment()

Raju Subramanian (1):
      openvswitch: Replace Nicira Networks.

 Documentation/networking/openvswitch.txt |  2 +-
 net/openvswitch/actions.c                |  2 +-
 net/openvswitch/datapath.c               | 13 ++++++++-----
 net/openvswitch/datapath.h               |  2 +-
 net/openvswitch/dp_notify.c              |  2 +-
 net/openvswitch/flow.c                   |  5 +++--
 net/openvswitch/flow.h                   |  2 +-
 net/openvswitch/vport-internal_dev.c     | 10 +++++++++-
 net/openvswitch/vport-internal_dev.h     |  2 +-
 net/openvswitch/vport-netdev.c           |  2 +-
 net/openvswitch/vport-netdev.h           |  2 +-
 net/openvswitch/vport.c                  |  2 +-
 net/openvswitch/vport.h                  |  2 +-
 13 files changed, 30 insertions(+), 18 deletions(-)

* Re: [GIT net-next] Open vSwitch
From: David Miller @ 2012-07-20 23:17 UTC
To: jesse-l0M0P4e3n4LQT0dZR+AlfA
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

From: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
Date: Fri, 20 Jul 2012 15:26:43 -0700

> A few bug fixes and small enhancements for net-next/3.6.
>
> The following changes since commit bf32fecdc1851ad9ca960f56771b798d17c26cf1:
>
>   openvswitch: Add length check when retrieving TCP flags. (2012-04-02 14:28:57 -0700)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git master

Pulled, thanks Jesse.
