From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1161729AbcEaDXR (ORCPT );
	Mon, 30 May 2016 23:23:17 -0400
Received: from mx1.redhat.com ([209.132.183.28]:59408 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932469AbcEaDXP (ORCPT );
	Mon, 30 May 2016 23:23:15 -0400
Subject: Re: [PATCH V2 2/2] vhost_net: conditionally enable tx polling
To: "Michael S. Tsirkin"
References: <1464590874-39539-1-git-send-email-jasowang@redhat.com>
 <1464590874-39539-3-git-send-email-jasowang@redhat.com>
 <20160530155521.GA5427@redhat.com>
Cc: kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
 netdev@vger.kernel.org, linux-kernel@vger.kernel.org
From: Jason Wang
Message-ID: <574D039E.5020101@redhat.com>
Date: Tue, 31 May 2016 11:23:10 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.8.0
MIME-Version: 1.0
In-Reply-To: <20160530155521.GA5427@redhat.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.5.110.27]); Tue, 31 May 2016 03:23:14 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On 2016/05/30 23:55, Michael S. Tsirkin wrote:
> On Mon, May 30, 2016 at 02:47:54AM -0400, Jason Wang wrote:
>> We always poll tx for socket, this is sub optimal since:
>>
>> - it will be only used when we exceed the sndbuf of the socket.
>> - since we use two independent polls for tx and vq, this will slightly
>>   increase the waitqueue traversing time and more important, vhost
>>   could not benefit from commit
>>   9e641bdcfa4ef4d6e2fbaa59c1be0ad5d1551fd5 ("net-tun: restructure
>>   tun_do_read for better sleep/wakeup efficiency") even if we've
>>   stopped rx polling during handle_rx since tx poll were still left in
>>   the waitqueue.
> Why is this an issue?
> sock_def_write_space only wakes up when queue is half empty,
> not on each packet.
> 	if ((atomic_read(&sk->sk_wmem_alloc) << 1) <= sk->sk_sndbuf)
>
> I suspect the issue is with your previous patch,
> it now pokes at the spinlock on data path
> where it used not to.
>
> Is that right?

The problem is not tx wakeup but rx wakeup. Patch 1 removes rx polling, 
but tx polling is still left in the waitqueue. So in 
sock_def_readable(), skwq_has_sleeper() returns true and we still need 
to traverse the waitqueue and touch its spinlock. With this patch, tx 
polling stays disabled unless the tx load is heavy, so 
sock_def_readable() can return very quickly.

>
>
>> Fix this by conditionally enable tx polling only when -EAGAIN were
>> met.
>>
>> Test shows about 8% improvement on guest rx pps.
>>
>> Before: ~1350000
>> After: ~1460000
>>
>> Signed-off-by: Jason Wang
>> ---
>>   drivers/vhost/net.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index e91603b..5a05fa0 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -378,6 +378,7 @@ static void handle_tx(struct vhost_net *net)
>>   		goto out;
>>   
>>   	vhost_disable_notify(&net->dev, vq);
>> +	vhost_net_disable_vq(net, vq);
>>   
>>   	hdr_size = nvq->vhost_hlen;
>>   	zcopy = nvq->ubufs;
>> @@ -459,6 +460,8 @@ static void handle_tx(struct vhost_net *net)
>>   				% UIO_MAXIOV;
>>   			}
>>   			vhost_discard_vq_desc(vq, 1);
>> +			if (err == -EAGAIN)
>> +				vhost_net_enable_vq(net, vq);
>>   			break;
>>   		}
>>   		if (err != len)
>> -- 
>> 1.8.3.1
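[Editor's note: a toy model of the wakeup condition Michael quotes above, not kernel code. `sock_def_write_space` only wakes sleeping writers once at least half of the send buffer has drained, rather than on every transmitted packet; the helper name below is illustrative.]

```c
#include <stdbool.h>

/* Models the condition quoted from sock_def_write_space():
 *     if ((atomic_read(&sk->sk_wmem_alloc) << 1) <= sk->sk_sndbuf)
 * i.e. writers are woken only when outstanding write memory has
 * dropped to half the send buffer or less. */
static bool write_space_wakes(unsigned int wmem_alloc, unsigned int sndbuf)
{
	return (wmem_alloc << 1) <= sndbuf;
}
```

This is why MST argues tx completions alone should not cause frequent wakeups: with the buffer mostly full, the condition stays false and no waitqueue traversal happens on that path.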
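[Editor's note: a simplified sketch of the control flow the patch introduces, under the assumption that `vhost_net_disable_vq()`/`vhost_net_enable_vq()` remove and re-add the tx poll entry on the socket's waitqueue; all names below are illustrative, not the actual vhost implementation.]

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the vq's tx polling state. */
struct model_vq {
	bool tx_poll_enabled;
};

/* Models handle_tx() after the patch: polling is switched off on entry
 * (vhost_net_disable_vq) and re-armed only when the sendmsg path hits
 * -EAGAIN, i.e. the socket sndbuf is exhausted (vhost_net_enable_vq). */
static void model_handle_tx(struct model_vq *vq, int sendmsg_err)
{
	vq->tx_poll_enabled = false;		/* vhost_net_disable_vq() */
	if (sendmsg_err == -EAGAIN)
		vq->tx_poll_enabled = true;	/* vhost_net_enable_vq() */
}
```

In the common case (no -EAGAIN), the tx entry never sits on the waitqueue, so `sock_def_readable()` on the rx side sees no sleeper and skips the spinlock, which is where the ~8% guest rx pps gain comes from.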