From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org,
maxime.coquelin@redhat.com, wexu@redhat.com,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH RFC 1/2] virtio-net: bql support
Date: Wed, 2 Jan 2019 08:59:30 -0500
Message-ID: <20190102085457-mutt-send-email-mst@kernel.org>
In-Reply-To: <b4f06d11-4761-dabb-f641-5fc05c1c34fc@redhat.com>
On Wed, Jan 02, 2019 at 11:28:43AM +0800, Jason Wang wrote:
>
> On 2018/12/31 2:45 AM, Michael S. Tsirkin wrote:
> > On Thu, Dec 27, 2018 at 06:00:36PM +0800, Jason Wang wrote:
> > > On 2018/12/26 11:19 PM, Michael S. Tsirkin wrote:
> > > > On Thu, Dec 06, 2018 at 04:17:36PM +0800, Jason Wang wrote:
> > > > > On 2018/12/6 6:54 AM, Michael S. Tsirkin wrote:
> > > > > > When use_napi is set, let's enable BQLs. Note: some of the issues are
> > > > > > similar to wifi. It's worth considering whether something similar to
> > > > > > commit 36148c2bbfbe ("mac80211: Adjust TSQ pacing shift") might be
> > > > > > > beneficial.
> > > > > > I played with a similar patch several days ago. The tricky part is the mode
> > > > > > switching between napi and no napi. We should make sure that when a packet is
> > > > > > sent and tracked by BQL, it is consumed by BQL as well. I did it by
> > > > > > tracking it through skb->cb, and dealt with the freeze by resetting the BQL
> > > > > > status. Patch attached.
> > > > >
> > > > > > But when testing with vhost-net, I don't see very stable performance,
> > > > So how about increasing TSQ pacing shift then?
> > >
> > > > I can test this. But changing the default TCP value is much more than a
> > > virtio-net specific thing.
> > Well same logic as wifi applies. Unpredictable latencies related
> > to radio in one case, to host scheduler in the other.
> >
> > > > > it was
> > > > > > probably because we batch the used ring updates, so the tx interrupt may come
> > > > > > randomly. We probably need to implement a time-bounded coalescing mechanism
> > > > > > which could be configured from userspace.
> > > > I don't think it's reasonable to expect userspace to be that smart ...
> > > > Why do we need time bounded? used ring is always updated when ring
> > > > becomes empty.
> > >
> > > > We don't add used, which means BQL may not see the consumed packet in time.
> > > > And the delay varies based on the workload, since we count packets, not bytes
> > > > or time, before doing the batched update.
> > >
> > > Thanks
> > Sorry I still don't get it.
> > When nothing is outstanding then we do update the used.
> > So if BQL stops userspace from sending packets then
> > we get an interrupt and packets start flowing again.
>
>
> Yes, but what about the case of multiple flows? That's where I see unstable
> results.
>
>
> >
> > It might be suboptimal, we might need to tune it but I doubt running
> > timers is a solution, timer interrupts cause VM exits.
>
>
> Probably not a timer but a time counter (or even a byte counter) in vhost to
> add used and signal the guest if it exceeds a value, instead of waiting for a
> number of packets.
>
>
> Thanks
Well we already have VHOST_NET_WEIGHT - is it too big then?
And maybe we should expose the "MORE" flag in the descriptor -
do you think that will help?
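
To make the byte counter idea quoted above concrete, here is a rough
userspace sketch, not vhost code. It flushes (adds used and signals the
guest) when the ring becomes empty, or when either a packet count or a
byte count is exceeded. The struct, the function names and both
thresholds are hypothetical and only there for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical batching state; none of these names exist in vhost. */
struct used_batch {
	size_t max_pkts;	/* flush after this many pending completions */
	size_t max_bytes;	/* ... or after this many pending bytes */
	size_t pkts;		/* completions not yet reported to the guest */
	size_t bytes;		/* bytes not yet reported to the guest */
};

static void used_batch_init(struct used_batch *b, size_t max_pkts,
			    size_t max_bytes)
{
	assert(max_pkts > 0 && max_bytes > 0);
	b->max_pkts = max_pkts;
	b->max_bytes = max_bytes;
	b->pkts = 0;
	b->bytes = 0;
}

/*
 * Account one consumed packet of @len bytes. Returns true when the caller
 * should update the used ring and signal the guest: the ring became empty,
 * or one of the two thresholds was reached. Counters reset on flush.
 */
static bool used_batch_add(struct used_batch *b, size_t len, bool ring_empty)
{
	b->pkts++;
	b->bytes += len;

	if (ring_empty || b->pkts >= b->max_pkts || b->bytes >= b->max_bytes) {
		b->pkts = 0;
		b->bytes = 0;
		return true;
	}
	return false;
}
```

With a byte bound, a few large packets trigger a flush as early as many
small ones would, so the delay before BQL sees completions depends less
on the packet size mix than a pure packet count does.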
>
> >
> > > > > > Btw, maybe it's time to just enable napi TX by default. I get a ~10% TCP_RR
> > > > > > regression on a machine without APICv (I haven't found time to test an APICv
> > > > > > machine). But considering it is for correctness, I think it's acceptable? Then
> > > > > > we can do optimizations on top?
> > > > >
> > > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > > ---
> > > > > > drivers/net/virtio_net.c | 27 +++++++++++++++++++--------
> > > > > > 1 file changed, 19 insertions(+), 8 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > > index cecfd77c9f3c..b657bde6b94b 100644
> > > > > > --- a/drivers/net/virtio_net.c
> > > > > > +++ b/drivers/net/virtio_net.c
> > > > > > @@ -1325,7 +1325,8 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
> > > > > > return stats.packets;
> > > > > > }
> > > > > > -static void free_old_xmit_skbs(struct send_queue *sq)
> > > > > > +static void free_old_xmit_skbs(struct send_queue *sq, struct netdev_queue *txq,
> > > > > > + bool use_napi)
> > > > > > {
> > > > > > struct sk_buff *skb;
> > > > > > unsigned int len;
> > > > > > @@ -1347,6 +1348,9 @@ static void free_old_xmit_skbs(struct send_queue *sq)
> > > > > > if (!packets)
> > > > > > return;
> > > > > > + if (use_napi)
> > > > > > + netdev_tx_completed_queue(txq, packets, bytes);
> > > > > > +
> > > > > > u64_stats_update_begin(&sq->stats.syncp);
> > > > > > sq->stats.bytes += bytes;
> > > > > > sq->stats.packets += packets;
> > > > > > @@ -1364,7 +1368,7 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
> > > > > > return;
> > > > > > if (__netif_tx_trylock(txq)) {
> > > > > > - free_old_xmit_skbs(sq);
> > > > > > + free_old_xmit_skbs(sq, txq, true);
> > > > > > __netif_tx_unlock(txq);
> > > > > > }
> > > > > > @@ -1440,7 +1444,7 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
> > > > > > struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, vq2txq(sq->vq));
> > > > > > __netif_tx_lock(txq, raw_smp_processor_id());
> > > > > > - free_old_xmit_skbs(sq);
> > > > > > + free_old_xmit_skbs(sq, txq, true);
> > > > > > __netif_tx_unlock(txq);
> > > > > > virtqueue_napi_complete(napi, sq->vq, 0);
> > > > > > @@ -1505,13 +1509,15 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > > > > > struct send_queue *sq = &vi->sq[qnum];
> > > > > > int err;
> > > > > > struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
> > > > > > - bool kick = !skb->xmit_more;
> > > > > > + bool more = skb->xmit_more;
> > > > > > bool use_napi = sq->napi.weight;
> > > > > > + unsigned int bytes = skb->len;
> > > > > > + bool kick;
> > > > > > /* Free up any pending old buffers before queueing new ones. */
> > > > > > - free_old_xmit_skbs(sq);
> > > > > > + free_old_xmit_skbs(sq, txq, use_napi);
> > > > > > - if (use_napi && kick)
> > > > > > + if (use_napi && !more)
> > > > > > virtqueue_enable_cb_delayed(sq->vq);
> > > > > > /* timestamp packet in software */
> > > > > > @@ -1552,7 +1558,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > > > > > if (!use_napi &&
> > > > > > unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
> > > > > > /* More just got used, free them then recheck. */
> > > > > > - free_old_xmit_skbs(sq);
> > > > > > + free_old_xmit_skbs(sq, txq, false);
> > > > > > if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
> > > > > > netif_start_subqueue(dev, qnum);
> > > > > > virtqueue_disable_cb(sq->vq);
> > > > > > @@ -1560,7 +1566,12 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > > > > > }
> > > > > > }
> > > > > > - if (kick || netif_xmit_stopped(txq)) {
> > > > > > + if (use_napi)
> > > > > > + kick = __netdev_tx_sent_queue(txq, bytes, more);
> > > > > > + else
> > > > > > + kick = !more || netif_xmit_stopped(txq);
> > > > > > +
> > > > > > + if (kick) {
> > > > > > if (virtqueue_kick_prepare(sq->vq) && virtqueue_notify(sq->vq)) {
> > > > > > u64_stats_update_begin(&sq->stats.syncp);
> > > > > > sq->stats.kicks++;
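
The mode-switching concern raised earlier in the thread (a packet
accounted by BQL at send time must also be accounted at completion, even
if napi was toggled in between) can be modelled in userspace roughly as
follows. This is a toy model under stated assumptions, not the attached
patch: the per-packet flag stands in for state kept in skb->cb, and all
struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy packet: bql_tracked stands in for a flag stored in skb->cb. */
struct pkt {
	unsigned int len;
	bool bql_tracked;
};

/* Toy queue: only the in-flight byte count that BQL would maintain. */
struct bql_model {
	unsigned int inflight_bytes;
};

/* Send path: account the packet only when napi TX is active right now. */
static void model_send(struct bql_model *q, struct pkt *p, bool use_napi)
{
	p->bql_tracked = use_napi;
	if (use_napi)
		q->inflight_bytes += p->len;
}

/*
 * Completion path: decide from the per-packet flag, not from the current
 * mode, so the counter neither leaks nor underflows across a mode switch.
 */
static void model_complete(struct bql_model *q, const struct pkt *p)
{
	if (!p->bql_tracked)
		return;
	assert(q->inflight_bytes >= p->len);
	q->inflight_bytes -= p->len;
}
```

The design choice here is that completion never consults the current
use_napi value, which is the property the skb->cb tracking is meant to
provide.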