From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Michael S. Tsirkin" Subject: Re: [PATCH RFC 1/2] virtio-net: bql support Date: Wed, 26 Dec 2018 10:22:30 -0500 Message-ID: <20181226102100-mutt-send-email-mst@kernel.org> References: <20181205225323.12555-1-mst@redhat.com> <20181205225323.12555-2-mst@redhat.com> <21384cb5-99a6-7431-1039-b356521e1bc3@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Cc: linux-kernel@vger.kernel.org, maxime.coquelin@redhat.com, tiwei.bie@intel.com, wexu@redhat.com, jfreimann@redhat.com, "David S. Miller" , virtualization@lists.linux-foundation.org, netdev@vger.kernel.org To: Jason Wang Return-path: Content-Disposition: inline In-Reply-To: <21384cb5-99a6-7431-1039-b356521e1bc3@redhat.com> Sender: linux-kernel-owner@vger.kernel.org List-Id: netdev.vger.kernel.org On Thu, Dec 06, 2018 at 04:17:36PM +0800, Jason Wang wrote: > > On 2018/12/6 上午6:54, Michael S. Tsirkin wrote: > > When use_napi is set, let's enable BQLs. Note: some of the issues are > > similar to wifi. It's worth considering whether something similar to > > commit 36148c2bbfbe ("mac80211: Adjust TSQ pacing shift") might be > > benefitial. > > > I've played a similar patch several days before. The tricky part is the mode > switching between napi and no napi. We should make sure when the packet is > sent and trakced by BQL,  it should be consumed by BQL as well. I just went over the patch again and I don't understand this comment. This patch only enabled BQL with tx napi. Thus there's no mode switching. What did I miss? > I did it by > tracking it through skb->cb.  And deal with the freeze by reset the BQL > status. Patch attached. > > But when testing with vhost-net, I don't very a stable performance, it was > probably because we batch the used ring updating so tx interrupt may come > randomly. We probably need to implement time bounded coalescing mechanism > which could be configured from userspace. > > Btw, maybe it's time just enable napi TX by default. I get ~10% TCP_RR > regression on machine without APICv, (haven't found time to test APICv > machine). But consider it was for correctness, I think it's acceptable? Then > we can do optimization on top? > > > Thanks > > > > Signed-off-by: Michael S. Tsirkin > > --- > > drivers/net/virtio_net.c | 27 +++++++++++++++++++-------- > > 1 file changed, 19 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > > index cecfd77c9f3c..b657bde6b94b 100644 > > --- a/drivers/net/virtio_net.c > > +++ b/drivers/net/virtio_net.c > > @@ -1325,7 +1325,8 @@ static int virtnet_receive(struct receive_queue *rq, int budget, > > return stats.packets; > > } > > -static void free_old_xmit_skbs(struct send_queue *sq) > > +static void free_old_xmit_skbs(struct send_queue *sq, struct netdev_queue *txq, > > + bool use_napi) > > { > > struct sk_buff *skb; > > unsigned int len; > > @@ -1347,6 +1348,9 @@ static void free_old_xmit_skbs(struct send_queue *sq) > > if (!packets) > > return; > > + if (use_napi) > > + netdev_tx_completed_queue(txq, packets, bytes); > > + > > u64_stats_update_begin(&sq->stats.syncp); > > sq->stats.bytes += bytes; > > sq->stats.packets += packets; > > @@ -1364,7 +1368,7 @@ static void virtnet_poll_cleantx(struct receive_queue *rq) > > return; > > if (__netif_tx_trylock(txq)) { > > - free_old_xmit_skbs(sq); > > + free_old_xmit_skbs(sq, txq, true); > > __netif_tx_unlock(txq); > > } > > @@ -1440,7 +1444,7 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget) > > struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, vq2txq(sq->vq)); > > __netif_tx_lock(txq, raw_smp_processor_id()); > > - free_old_xmit_skbs(sq); > > + free_old_xmit_skbs(sq, txq, true); > > __netif_tx_unlock(txq); > > virtqueue_napi_complete(napi, sq->vq, 0); > > @@ -1505,13 +1509,15 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) > > struct send_queue *sq = &vi->sq[qnum]; > > int err; > > struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum); > > - bool kick = !skb->xmit_more; > > + bool more = skb->xmit_more; > > bool use_napi = sq->napi.weight; > > + unsigned int bytes = skb->len; > > + bool kick; > > /* Free up any pending old buffers before queueing new ones. */ > > - free_old_xmit_skbs(sq); > > + free_old_xmit_skbs(sq, txq, use_napi); > > - if (use_napi && kick) > > + if (use_napi && !more) > > virtqueue_enable_cb_delayed(sq->vq); > > /* timestamp packet in software */ > > @@ -1552,7 +1558,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) > > if (!use_napi && > > unlikely(!virtqueue_enable_cb_delayed(sq->vq))) { > > /* More just got used, free them then recheck. */ > > - free_old_xmit_skbs(sq); > > + free_old_xmit_skbs(sq, txq, false); > > if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) { > > netif_start_subqueue(dev, qnum); > > virtqueue_disable_cb(sq->vq); > > @@ -1560,7 +1566,12 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) > > } > > } > > - if (kick || netif_xmit_stopped(txq)) { > > + if (use_napi) > > + kick = __netdev_tx_sent_queue(txq, bytes, more); > > + else > > + kick = !more || netif_xmit_stopped(txq); > > + > > + if (kick) { > > if (virtqueue_kick_prepare(sq->vq) && virtqueue_notify(sq->vq)) { > > u64_stats_update_begin(&sq->stats.syncp); > > sq->stats.kicks++;