From: "Michael S. Tsirkin" <mst@redhat.com>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: Shirley Ma <mashirle@us.ibm.com>,
Herbert Xu <herbert@gondor.hengli.com.au>,
davem@davemloft.net, kvm@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH 2/2] virtio_net: remove send completion interrupts and avoid TX queue overrun through packet drop
Date: Sun, 27 Mar 2011 09:52:54 +0200
Message-ID: <20110327075254.GA3776@redhat.com>
In-Reply-To: <87mxkjls61.fsf@rustcorp.com.au>
On Fri, Mar 25, 2011 at 03:20:46PM +1030, Rusty Russell wrote:
> > 3. For TX sometimes we free a single buffer, sometimes
> > a ton of them, which might make the transmit latency
> > vary. It's probably a good idea to limit this,
> > maybe free the minimal number possible to keep the device
> > going without stops, maybe free up to MAX_SKB_FRAGS.
>
> This kind of heuristic is going to be quite variable depending on
> circumstance, I think, so it's a lot of work to make sure we get it
> right.
Hmm, trying to keep the amount of work per descriptor
constant seems to make sense though, no?
Latency variations are not good for either RT uses or
protocols such as TCP.
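
To make the idea concrete, here is a rough userspace sketch of a bounded reclaim step. It is not the driver code: struct tx_queue, tx_reclaim_bounded() and the budget parameter are made-up names for illustration, and the queue is reduced to two counters. The caller would pass something like MAX_SKB_FRAGS as the budget.

```c
#include <stddef.h>

/* Hypothetical stand-in for a TX queue: only the counts matter here. */
struct tx_queue {
	size_t used_pending;	/* buffers the host is done with, not yet freed */
	size_t freed_total;	/* buffers reclaimed so far */
};

/*
 * Free at most `budget` completed buffers per call, so the work done
 * on any single transmit is bounded. Returns the number freed.
 */
static size_t tx_reclaim_bounded(struct tx_queue *q, size_t budget)
{
	size_t n = q->used_pending < budget ? q->used_pending : budget;

	q->used_pending -= n;
	q->freed_total += n;
	return n;
}
```

Whatever is left over gets picked up by later calls, so a large backlog is spread over several transmits instead of being paid for by one.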
> > 4. If the ring is full, we now notify right after
> > the first entry is consumed. For TX this is suboptimal,
> > we should try delaying the interrupt on host.
>
> Lguest already does that: only sends an interrupt when it's run out of
> things to do. It does update the used ring, however, as it processes
> them.
There are many approaches here. I suspect something like
interrupting after half the work is done might be better for
parallelism.
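
A minimal sketch of that policy on the host side, assuming a batch is defined as whatever was queued when the host started processing. struct tx_batch and host_consume_one() are hypothetical names, not anything in the existing host code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side state for one batch of TX descriptors. */
struct tx_batch {
	size_t queued;		/* descriptors available when the host started */
	size_t consumed;	/* descriptors processed so far */
	bool notified;		/* interrupt already sent for this batch */
};

/*
 * Process one descriptor. Returns true if the guest should be
 * interrupted now: exactly once, when at least half the batch is done.
 */
static bool host_consume_one(struct tx_batch *b)
{
	if (b->consumed < b->queued)
		b->consumed++;

	if (!b->notified && b->consumed > 0 && b->consumed * 2 >= b->queued) {
		b->notified = true;
		return true;
	}
	return false;
}
```

The point is that the guest can start reclaiming and refilling while the host is still working through the second half of the batch.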
>
> This seems sensible to me, but needs to be measured separately as well.
Agree.
> > More ideas, would be nice if someone can try them out:
> > 1. We are allocating/freeing buffers for indirect descriptors.
> > Use some kind of pool instead?
> > And we could preformat part of the descriptor.
>
> We need some poolish mechanism for virtio_blk too; perhaps an allocation
> callback which both can use (virtio_blk to alloc from a pool, virtio_net
> to recycle?).
BTW for recycling, we need to be careful about NUMA effects:
probably store the cpu id and reallocate if we switch cpus ...
(or NUMA nodes - unfortunately not always described correctly).
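
Roughly what I have in mind, as a userspace sketch. struct recycled_buf and buf_get() are invented for illustration, and malloc() stands in for a node-local allocation (in the kernel that would presumably be something like kmalloc_node()).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cached buffer that remembers where it was allocated. */
struct recycled_buf {
	void *mem;
	size_t size;
	int node;	/* node (or cpu) the memory was allocated on */
};

/*
 * Make `b` usable for `size` bytes on `cur_node`.
 * Returns 1 if the cached buffer was reused, 0 if it was reallocated,
 * -1 on allocation failure.
 */
static int buf_get(struct recycled_buf *b, size_t size, int cur_node)
{
	if (b->mem && b->node == cur_node && b->size >= size)
		return 1;

	/* Wrong node or too small: drop it and allocate locally. */
	free(b->mem);
	b->mem = malloc(size);	/* stand-in for a node-local allocation */
	if (!b->mem) {
		b->size = 0;
		return -1;
	}
	b->size = size;
	b->node = cur_node;
	return 0;
}
```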
> Along similar lines to preformatting, we could actually try to prepend
> the skb_vnet_hdr to the vnet data, and use a single descriptor for the
> hdr and the first part of the packet.
>
> Though IIRC, qemu's virtio barfs if the first descriptor isn't just the
> hdr (barf...).
Maybe we can try fixing this before adding more flags,
then e.g. the publish used flag can be reused to also
tell us the layout is flexible. Or just add a feature flag for that.
> > 2. I didn't have time to work on virtio2 ideas presented
> > at the kvm forum yet, any takers?
>
> I didn't even attend.
Hmm, right. But what was presented there was discussed on list as well:
a single R/W descriptor ring with valid bit instead of 2 rings
+ a descriptor array.
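
For those who missed it, a toy model of the single-ring idea follows. This is only a sketch under my own simplifying assumptions, not the layout that was presented: the field names, DESC_F_VALID and RING_SIZE are made up, and a real implementation needs memory barriers where noted.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE	8
#define DESC_F_VALID	0x1	/* hypothetical flag: slot is owned by the device */

struct desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t id;
};

/* One ring written by both sides, instead of avail + used + descriptor array. */
struct ring {
	struct desc d[RING_SIZE];
	unsigned int prod;	/* driver's next slot */
	unsigned int cons;	/* device's next slot */
};

/* Driver side: publish a buffer. Fails if the slot is still valid (ring full). */
static bool ring_post(struct ring *r, uint64_t addr, uint32_t len, uint16_t id)
{
	struct desc *s = &r->d[r->prod % RING_SIZE];

	if (s->flags & DESC_F_VALID)
		return false;
	s->addr = addr;
	s->len = len;
	s->id = id;
	/* A real implementation needs a write barrier before setting the bit. */
	s->flags |= DESC_F_VALID;
	r->prod++;
	return true;
}

/* Device side: take the next valid descriptor and hand the slot back. */
static bool ring_consume(struct ring *r, uint16_t *id)
{
	struct desc *s = &r->d[r->cons % RING_SIZE];

	if (!(s->flags & DESC_F_VALID))
		return false;
	*id = s->id;
	s->flags &= (uint16_t)~DESC_F_VALID;
	r->cons++;
	return true;
}
```

Both sides only ever look at the one slot they expect to change next, which is where the hoped-for cache behaviour comes from.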
> But I think that virtio is moribund for the
> moment; there wasn't enough demand and it's clear that there are
> optimization unexplored in virtio1.
I agree absolutely that not all lessons have been learned;
playing with different ring layouts would make at least
an interesting paper IMO.
--
MST