From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-1.0 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS,
	MAILING_LIST_MULTI,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id C7BA4C43387
	for ; Mon, 7 Jan 2019 04:01:32 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 9ED1C20859
	for ; Mon, 7 Jan 2019 04:01:32 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1726509AbfAGEBb (ORCPT ); Sun, 6 Jan 2019 23:01:31 -0500
Received: from mx1.redhat.com ([209.132.183.28]:45300 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1726246AbfAGEBb (ORCPT ); Sun, 6 Jan 2019 23:01:31 -0500
Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mx1.redhat.com (Postfix) with ESMTPS id D775A821CC;
	Mon, 7 Jan 2019 04:01:30 +0000 (UTC)
Received: from redhat.com (ovpn-120-33.rdu2.redhat.com [10.10.120.33])
	by smtp.corp.redhat.com (Postfix) with ESMTP id 4409760C44;
	Mon, 7 Jan 2019 04:01:24 +0000 (UTC)
Date: Sun, 6 Jan 2019 23:01:23 -0500
From: "Michael S. Tsirkin"
To: Jason Wang
Cc: linux-kernel@vger.kernel.org, maxime.coquelin@redhat.com,
	tiwei.bie@intel.com, wexu@redhat.com, jfreimann@redhat.com,
	"David S. Miller" , virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org
Subject: Re: [PATCH RFC 1/2] virtio-net: bql support
Message-ID: <20190106225951-mutt-send-email-mst@kernel.org>
References: <20181205225323.12555-2-mst@redhat.com>
	<21384cb5-99a6-7431-1039-b356521e1bc3@redhat.com>
	<20181226101528-mutt-send-email-mst@kernel.org>
	<0fa99d9b-e510-d7eb-db1b-831bd7610ce9@redhat.com>
	<20181230134106-mutt-send-email-mst@kernel.org>
	<20190102085457-mutt-send-email-mst@kernel.org>
	<17d2ab21-1c9a-2bb9-166f-2863d019cb0b@redhat.com>
	<20190106221506-mutt-send-email-mst@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.28]); Mon, 07 Jan 2019 04:01:30 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jan 07, 2019 at 11:51:55AM +0800, Jason Wang wrote:
> 
> On 2019/1/7 11:17 AM, Michael S. Tsirkin wrote:
> > On Mon, Jan 07, 2019 at 10:14:37AM +0800, Jason Wang wrote:
> > > On 2019/1/2 9:59 PM, Michael S. Tsirkin wrote:
> > > > On Wed, Jan 02, 2019 at 11:28:43AM +0800, Jason Wang wrote:
> > > > > On 2018/12/31 2:45 AM, Michael S. Tsirkin wrote:
> > > > > > On Thu, Dec 27, 2018 at 06:00:36PM +0800, Jason Wang wrote:
> > > > > > > On 2018/12/26 11:19 PM, Michael S. Tsirkin wrote:
> > > > > > > > On Thu, Dec 06, 2018 at 04:17:36PM +0800, Jason Wang wrote:
> > > > > > > > > On 2018/12/6 6:54 AM, Michael S. Tsirkin wrote:
> > > > > > > > > > When use_napi is set, let's enable BQLs.  Note: some of the issues are
> > > > > > > > > > similar to wifi.  It's worth considering whether something similar to
> > > > > > > > > > commit 36148c2bbfbe ("mac80211: Adjust TSQ pacing shift") might be
> > > > > > > > > > beneficial.
> > > > > > > > > I played with a similar patch several days ago. The tricky part is the mode
> > > > > > > > > switching between napi and no napi. We should make sure that when a packet is
> > > > > > > > > sent and tracked by BQL, it is consumed by BQL as well. I did it by
> > > > > > > > > tracking it through skb->cb, and dealt with the freeze by resetting the BQL
> > > > > > > > > status. Patch attached.
> > > > > > > > > 
> > > > > > > > > But when testing with vhost-net, I don't see very stable performance,
> > > > > > > > So how about increasing TSQ pacing shift then?
> > > > > > > I can test this. But changing the default TCP value is much more than a
> > > > > > > virtio-net specific thing.
> > > > > > Well, the same logic as wifi applies. Unpredictable latencies related
> > > > > > to radio in one case, to host scheduler in the other.
> > > > > > 
> > > > > > > > > it was
> > > > > > > > > probably because we batch the used ring updating so tx interrupt may come
> > > > > > > > > randomly. We probably need to implement a time-bounded coalescing mechanism
> > > > > > > > > which could be configured from userspace.
> > > > > > > > I don't think it's reasonable to expect userspace to be that smart ...
> > > > > > > > Why do we need time bounded? The used ring is always updated when the ring
> > > > > > > > becomes empty.
> > > > > > > We don't add used when means BQL may not see the consumed packet in time.
> > > > > > > And the delay varies based on the workload since we count packets, not bytes
> > > > > > > or time, before doing the batched updating.
> > > > > > > 
> > > > > > > Thanks
> > > > > > Sorry I still don't get it.
> > > > > > When nothing is outstanding then we do update the used.
> > > > > > So if BQL stops userspace from sending packets then
> > > > > > we get an interrupt and packets start flowing again.
> > > > > Yes, but how about the cases of multiple flows? That's where I see unstable
> > > > > results.
> > > > > 
> > > > > 
> > > > > > It might be suboptimal, we might need to tune it but I doubt running
> > > > > > timers is a solution, timer interrupts cause VM exits.
> > > > > Probably not a timer but a time counter (or event byte counter) in vhost to
> > > > > add used and signal guest if it exceeds a value instead of waiting for the
> > > > > number of packets.
> > > > > 
> > > > > 
> > > > > Thanks
> > > > Well we already have VHOST_NET_WEIGHT - is it too big then?
> > > 
> > > I'm not sure, it might be too big.
> > > 
> > > 
> > > > And maybe we should expose the "MORE" flag in the descriptor -
> > > > do you think that will help?
> > > > 
> > > I don't know. But how can a "more" flag help here?
> > > 
> > > Thanks
> > It sounds like we should be a bit more aggressive in updating used ring.
> > But if we just do it naively we will harm performance for sure as that
> > is how we are doing batching right now.
> 
> 
> I agree but the problem is to balance the PPS and throughput. More batching
> helps for PPS but may damage TCP throughput.

That is what the more flag is supposed to be, I think - it is only set if
there's a socket that actually needs the skb freed in order to go on.

> 
> >   Instead we could make guest
> > control batching using the more flag - if that's not set we write out
> > the used ring.
> 
> 
> It's under the control of guest, so I'm afraid we still need some more guard
> (e.g. time/bytes counters) on host.
> 
> Thanks

The point is that if the guest does not care about the skb being freed, then
there is no rush on the host side to mark the buffer used.


> 
> > 
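
The two positions in the thread (guest-controlled batching through a "more" flag, plus a host-side packet/byte guard) can be combined in one small decision function. The following is only a minimal sketch under stated assumptions, not vhost code: `struct used_batch`, `used_batch_flush()`, `used_batch_complete()` and the two thresholds are hypothetical names invented for illustration, and "flushing" here just resets counters where a real host would write out the used ring and signal the guest. It follows the rule quoted above: if the flag is not set the used ring is written out at once, otherwise completions are batched until a packet or byte bound is hit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side batching state (illustration only, not from vhost). */
struct used_batch {
	size_t pending_pkts;  /* completed buffers not yet written to the used ring */
	size_t pending_bytes; /* bytes covered by those buffers */
	size_t max_pkts;      /* packet-count bound on a batch */
	size_t max_bytes;     /* byte-count guard, so large packets flush sooner */
	size_t flushes;       /* how many times the batch was written out */
};

/* Stand-in for "write out the used ring and signal the guest". */
static void used_batch_flush(struct used_batch *b)
{
	if (b->pending_pkts == 0)
		return;
	b->pending_pkts = 0;
	b->pending_bytes = 0;
	b->flushes++;
}

/*
 * Account for one completed buffer of `len` bytes.
 * `more` is the guest's hint that it is not waiting on this buffer.
 * Returns true if the batch was flushed by this completion.
 */
static bool used_batch_complete(struct used_batch *b, size_t len, bool more)
{
	assert(b != NULL);

	b->pending_pkts++;
	b->pending_bytes += len;

	/* Flush when the guest asked for it, or when either host guard trips. */
	if (!more || b->pending_pkts >= b->max_pkts ||
	    b->pending_bytes >= b->max_bytes) {
		used_batch_flush(b);
		return true;
	}
	return false;
}
```

With `max_pkts = 4` and `max_bytes = 3000`, three 1000-byte completions with the flag set trigger a flush on the third one through the byte guard, while a single 64-byte completion with the flag clear flushes immediately. A time-based guard, as also suggested in the thread, would need a clock source and is left out of this sketch.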