Subject: Re: Tun congestion/BQL
From: Jason Wang
To: David Woodhouse, Toke Høiland-Jørgensen, netdev@vger.kernel.org
Cc: "Michael S. Tsirkin"
Date: Thu, 11 Apr 2019 17:04:25 +0800
In-Reply-To: <79f9e78d6f653a4a4ccd2fad76d8c39622491172.camel@infradead.org>
References: <2e310fc6ee847d20dd23692fd1db733e607602f5.camel@infradead.org>
 <1506fcbbfb7ab7a1e448b7b6cbf45f703bfcc80f.camel@infradead.org>
 <8c64c80d-165c-076b-fca3-5374edc87853@redhat.com>
 <87ftqqugbj.fsf@toke.dk>
 <123cfccb766a6f55312d6a477764d3e7b88ad221.camel@infradead.org>
 <79f9e78d6f653a4a4ccd2fad76d8c39622491172.camel@infradead.org>
X-Mailing-List: netdev@vger.kernel.org

On 2019/4/11 4:56 PM, David Woodhouse wrote:
> On Thu, 2019-04-11 at 15:17 +0800, Jason Wang wrote:
>>>> Ideally we want to react when the queue starts building rather than when
>>>> it starts getting full; by pushing back on upper layers (or, if
>>>> forwarding, dropping packets to signal congestion).
>>> This is precisely what my first accidental if (!ptr_ring_empty())
>>> variant was doing, right? :)
>>
>> But I give a try on your ptr_ring_full() patch on VM, looks like it
>> works (single flow), no packets were dropped by TAP anymore. How many
>> flows did you use?
> Hm, I thought I was only using one. This is just a simple case of
> userspace opening /dev/net/tun, TUNSETIFF, and reading/writing.
>
> But if I was stopping the *wrong* queue that might explain things.

Btw, I forgot to mention: I modified your patch to use
netif_stop_subqueue()/netif_wake_subqueue() instead.

Thanks

>
> This is a persistent tun device.
>
>>>> In practice, this means tuning the TX ring to the *minimum* size it can
>>>> be without starving (this is basically what BQL does for Ethernet), and
>>>> keeping packets queued in the qdisc layer instead, where it can be
>>>> managed...
>>> I was going to add BQL (as $SUBJECT may have caused you to infer) but
>>> trivially adding the netdev_sent_queue() in tun_net_xmit() and
>>> netdev_completed_queue() for xdp vs. skb in tun_do_read() was tripping
>>> the BUG in dql_completed().
>>
>> Something like https://lists.openwall.net/netdev/2012/11/12/6767 ?
> Fairly much.
>
> Except again I was being lazy for the proof-of-concept, ignoring 'txq'
> and just using netdev_sent_queue() etc.
>
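
As a rough illustration of the two mechanisms discussed above (stopping a
queue when its ring fills, and BQL-style byte accounting per queue), here is
a toy userspace C model. It is not kernel code and not the patch under
discussion: struct toy_txq, toy_xmit(), toy_read(), toy_demo() and RING_SIZE
are made-up names for this sketch. The assert in toy_read() only mirrors the
general idea that completions must never exceed what was recorded as sent on
the same queue, which is the kind of imbalance dql_completed() guards against.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 4 /* hypothetical; kept tiny so the ring fills quickly */

/* Toy per-queue state: a fixed-size ring plus BQL-style byte counters. */
struct toy_txq {
	unsigned int ring[RING_SIZE]; /* packet lengths, standing in for skbs */
	unsigned int head, tail, count;
	bool stopped;                 /* models a stopped TX subqueue */
	unsigned int bytes_sent;      /* bytes recorded as queued on this queue */
	unsigned int bytes_completed; /* bytes recorded as consumed from it */
};

/* Producer side, loosely modelling an xmit path. Returns false on drop. */
bool toy_xmit(struct toy_txq *q, unsigned int len)
{
	if (q->count == RING_SIZE)
		return false;         /* ring already full: packet is dropped */

	q->ring[q->tail] = len;
	q->tail = (q->tail + 1) % RING_SIZE;
	q->count++;
	q->bytes_sent += len;         /* account on the queue actually used */

	if (q->count == RING_SIZE)
		q->stopped = true;    /* stop this queue once its ring is full */
	return true;
}

/* Consumer side, loosely modelling a read path. Returns 0 if empty. */
unsigned int toy_read(struct toy_txq *q)
{
	unsigned int len;

	if (q->count == 0)
		return 0;

	len = q->ring[q->head];
	q->head = (q->head + 1) % RING_SIZE;
	q->count--;

	/* Completions may never exceed what was sent on the same queue. */
	assert(len <= q->bytes_sent - q->bytes_completed);
	q->bytes_completed += len;

	q->stopped = false;           /* room again: wake the queue */
	return len;
}

/*
 * Offer n packets of 100 bytes. The caller holds packets back while the
 * queue is stopped, as the upper layers would. Returns the number of
 * packets dropped at the ring.
 */
unsigned int toy_demo(unsigned int n)
{
	struct toy_txq q = {0};
	unsigned int dropped = 0;

	for (unsigned int i = 0; i < n; i++) {
		while (q.stopped)
			toy_read(&q); /* reader drains one packet, waking the queue */
		if (!toy_xmit(&q, 100))
			dropped++;
	}
	return dropped;
}
```

In this model the sender never hits a full ring as long as it respects the
stopped flag, so toy_demo() reports no drops. Keeping the counters inside the
per-queue struct is deliberate: sent and completed bytes recorded against
different queues would break the invariant checked in toy_read().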