All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: Shirley Ma <mashirle@us.ibm.com>,
	Herbert Xu <herbert@gondor.hengli.com.au>,
	davem@davemloft.net, kvm@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH 2/2] virtio_net: remove send completion interrupts and avoid TX queue overrun through packet drop
Date: Sun, 27 Mar 2011 09:52:54 +0200	[thread overview]
Message-ID: <20110327075254.GA3776@redhat.com> (raw)
In-Reply-To: <87mxkjls61.fsf@rustcorp.com.au>

On Fri, Mar 25, 2011 at 03:20:46PM +1030, Rusty Russell wrote:
> > 3. For TX sometimes we free a single buffer, sometimes
> >    a ton of them, which might make the transmit latency
> >    vary. It's probably a good idea to limit this,
> >    maybe free the minimal number possible to keep the device
> >    going without stops, maybe free up to MAX_SKB_FRAGS.
> 
> This kind of heuristic is going to be quite variable depending on
> circumstance, I think, so it's a lot of work to make sure we get it
> right.

Hmm, trying to keep the amount of work per descriptor
constant seems to make sense though, no?
Latency variations are not good for either RT uses or
protocols such as TCP.

> > 4. If the ring is full, we now notify right after
> >    the first entry is consumed. For TX this is suboptimal,
> >    we should try delaying the interrupt on host.
> 
> Lguest already does that: only sends an interrupt when it's run out of
> things to do.  It does update the used ring, however, as it processes
> them.

There are many approaches here I suspect something like
interrupt after half work is done might be better for
parallelism.

> 
> This seems sensible to me, but needs to be measured separately as well.

Agree.

> > More ideas, would be nice if someone can try them out:
> > 1. We are allocating/freeing buffers for indirect descriptors.
> >    Use some kind of pool instead?
> >    And we could preformat part of the descriptor.
> 
> We need some poolish mechanism for virtio_blk too; perhaps an allocation
> callback which both can use (virtio_blk to alloc from a pool, virtio_net
> to recycle?).

BTW for recycling, need to be careful about numa effects:
probably store cpu id and reallocate if we switch cpus ...
(or noma nodes - unfortunately not always described correctly).

> Along similar lines to preformatting, we could actually try to prepend
> the skb_vnet_hdr to the vnet data, and use a single descriptor for the
> hdr and the first part of the packet.
> 
> Though IIRC, qemu's virtio barfs if the first descriptor isn't just the
> hdr (barf...).

Maybe we can try fixing this before adding more flags,
then e.g. publish used flag can be resued to also
tell us layout is flexible. Or just add a feature flag for that.

> > 2. I didn't have time to work on virtio2 ideas presented
> >    at the kvm forum yet, any takers?
> 
> I didn't even attend.

Hmm, right. But what was presented there was discussed on list as well:
a single R/W descriptor ring with valid bit instead of 2 rings
+ a descriptor array.

>  But I think that virtio is moribund for the
> moment; there wasn't enough demand and it's clear that there are
> optimization unexplored in virtio1.

I agree absolutely that not all lessons has been learned,
playing with different ring layouts would make at least
an interesting paper IMO.

-- 
MST

  reply	other threads:[~2011-03-27  7:53 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-03-17  0:12 [PATCH 2/2] virtio_net: remove send completion interrupts and avoid TX queue overrun through packet drop Shirley Ma
2011-03-17  5:02 ` Michael S. Tsirkin
2011-03-17 15:18   ` Shirley Ma
2011-03-18  3:28     ` Shirley Ma
2011-03-18 13:15       ` Michael S. Tsirkin
2011-03-18 16:54         ` Shirley Ma
2011-03-17  5:10 ` Rusty Russell
2011-03-17 15:10   ` Shirley Ma
2011-03-18 13:33 ` Herbert Xu
2011-03-19  1:41   ` Shirley Ma
2011-03-21 18:03     ` Shirley Ma
2011-03-22 11:36       ` Michael S. Tsirkin
2011-03-23  2:26         ` Shirley Ma
2011-03-24  0:30           ` Rusty Russell
2011-03-24  4:14             ` Shirley Ma
2011-03-24 14:28             ` Michael S. Tsirkin
2011-03-24 17:46               ` Shirley Ma
2011-03-24 18:10                 ` Michael S. Tsirkin
2011-03-25  4:51                 ` Rusty Russell
2011-03-25  4:50               ` Rusty Russell
2011-03-27  7:52                 ` Michael S. Tsirkin [this message]
2011-04-04  6:13                   ` Rusty Russell
2011-03-24  0:16         ` Rusty Russell
2011-03-24  6:39           ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110327075254.GA3776@redhat.com \
    --to=mst@redhat.com \
    --cc=davem@davemloft.net \
    --cc=herbert@gondor.hengli.com.au \
    --cc=kvm@vger.kernel.org \
    --cc=mashirle@us.ibm.com \
    --cc=netdev@vger.kernel.org \
    --cc=rusty@rustcorp.com.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.