From mboxrd@z Thu Jan 1 00:00:00 1970 From: Stefano Brivio Subject: Re: [PATCH RFC net-next] openvswitch: Queue upcalls to userspace in per-port round-robin order Date: Sat, 4 Aug 2018 02:43:24 +0200 Message-ID: <20180804024324.4d900b5e@epycfail> References: <20180704142342.21740-1-mcroce@redhat.com> <20180731220657.GC29662@ovn.org> <20180803185241.4ac0d1e5@epycfail> <20180803230108.GU29662@ovn.org> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: Matteo Croce , Pravin B Shelar , jpettit@vmware.com, gvrose8192@gmail.com, netdev , dev@openvswitch.org, Jiri Benc , Aaron Conole To: Ben Pfaff Return-path: Received: from mx3-rdu2.redhat.com ([66.187.233.73]:48218 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732001AbeHDCmJ (ORCPT ); Fri, 3 Aug 2018 22:42:09 -0400 In-Reply-To: <20180803230108.GU29662@ovn.org> Sender: netdev-owner@vger.kernel.org List-ID: On Fri, 3 Aug 2018 16:01:08 -0700 Ben Pfaff wrote: > I think that a simple mechanism for fairness is fine. The direction > of extensibility that makes me anxious is how to decide what matters > for fairness. So far, we've talked about per-vport fairness. That > works pretty well for packets coming in from virtual interfaces where > each vport represents a separate VM. Yes, right, that's the case where we have significant issues currently. > It does not work well if the traffic filling your queues all comes > from a single physical port because some source of traffic is sending > traffic at a high rate. In that case, you'll do a lot better if you > do fairness based on the source 5-tuple. But if you're doing network > virtualization, then the outer source 5-tuples won't necessarily vary > much and you'd be better off looking at the VNI and maybe some Geneve > TLV options and maybe the inner 5-tuple... Sure, I see what you mean now. That looks entirely doable if we abstract the round-robin bucket selection out of the current patch. > I would be very pleased if we could integrate a simple mechanism for > fairness, based for now on some simple criteria like the source port, > but thinking ahead to how we could later make it gracefully extensible > to consider more general and possibly customizable criteria. We could change the patch so that instead of just using the vport for round-robin queue insertion, we generalise that and use "buckets" instead of vports, and have a set of possible functions that are called instead of using port_no directly in ovs_dp_upcall_queue_roundrobin(), making this configurable via netlink, per datapath. We could implement selection based on source port or a hash on the source 5-tuple, and the relevant bits of ovs_dp_upcall_queue_roundrobin() would look like this: static int ovs_dp_upcall_queue_roundrobin(struct datapath *dp, struct dp_upcall_info *upcall) { [...] list_for_each_entry(pos, head, list) { int bucket = dp->rr_select(pos); /* Count per-bucket upcalls. */ if (dp->upcalls.count[bucket] == U8_MAX) { err = -ENOSPC; goto out_clear; } dp->upcalls.count[bucket]++; if (bucket == upcall->bucket) { /* Another upcall for the same bucket: move insertion * point here, keep looking for insertion condition to * be still met further on. */ find_next = true; here = pos; continue; } count = dp->upcalls.count[bucket]; if (find_next && dp->upcalls.count[bucket] >= count) { /* Insertion condition met: no need to look further, * unless another upcall for the same port occurs later. */ find_next = false; here = pos; } } [...] } and implementations for dp->rr_select() would look like: int rr_select_vport(struct dp_upcall_info *upcall) { return upcall->port_no; } int rr_select_srcport(struct dp_upcall_info *upcall) { /* look up source port from upcall->skb... */ } And we could then easily extend this to use BPF with maps one day. This is for clarity by the way, but I guess we should avoid indirect calls in the final implementation. What do you think? -- Stefano