public inbox for kvm@vger.kernel.org
From: Mark McLoughlin <markmc@redhat.com>
To: Avi Kivity <avi@redhat.com>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH 0/6] Kill off the virtio_net tx mitigation timer
Date: Mon, 03 Nov 2008 15:04:54 +0000
Message-ID: <1225724694.5904.63.camel@blaa>
In-Reply-To: <490EF141.8040005@redhat.com>

On Mon, 2008-11-03 at 14:40 +0200, Avi Kivity wrote:
> Mark McLoughlin wrote:
> > On Sun, 2008-11-02 at 11:48 +0200, Avi Kivity wrote:
> >   
> >> Mark McLoughlin wrote:
> >>> The main patch in this series is 5/6 - it just kills off the
> >>> virtio_net tx mitigation timer and does all the tx I/O in the
> >>> I/O thread.
> >>>
> >>>   
> >>>       
> >> What will it do to small packet, multi-flow loads (simulated by ping -f 
> >> -l 30 $external)?
> >>     
> >
> > It should improve the latency - the packets will be flushed more quickly
> > than the 150us timeout without blocking the guest.
> >
> >   
> 
> But it will increase overhead, since suddenly we aren't queueing 
> anymore.  One vmexit per small packet.

Yes, in theory, but the packet copies act to mitigate exits, since we
don't re-enable notifications until we're sure the ring is empty.

With copyless transmit, though, we'd have an unacceptable vmexit rate.
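
To illustrate the point about copies mitigating exits, here is a minimal
Python sketch of a toy model, not the actual virtio-net code. The function
name, the burst pattern and the per-packet costs are all assumptions made
up for the example. The guest kicks the host only when notifications are
enabled, and the host keeps them disabled until it has drained the ring.

```python
def count_kicks(arrival_times_us, per_packet_cost_us):
    """Count guest->host notifications (vmexits) in a toy tx ring model.

    Hypothetical model: the host disables notifications while it drains
    the ring, spends per_packet_cost_us on each packet (e.g. the copy),
    and re-enables notifications only once the ring is empty.
    """
    kicks = 0
    host_busy_until = 0  # time at which the host finishes draining
    for arrival in arrival_times_us:
        if arrival >= host_busy_until:
            # Ring is empty and notifications are enabled: guest must kick.
            kicks += 1
            host_busy_until = arrival + per_packet_cost_us
        else:
            # Host is still draining with notifications off: no exit.
            host_busy_until += per_packet_cost_us
    return kicks


# Assumed workload: 10 bursts of 4 packets, 1us apart, one burst per 100us.
bursts = [start + offset for start in range(0, 1000, 100) for offset in range(4)]

with_copy = count_kicks(bursts, per_packet_cost_us=5)  # slow copy per packet
copyless = count_kicks(bursts, per_packet_cost_us=0)   # nothing holds the host in the loop
```

In this model the slow copy yields one kick per burst, while the zero-cost
case yields one kick per packet.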

> >> Where does the benefit come from?
> >>     
> >
> > There are two things going on here, I think.
> >
> > First is that the timer affects latency, removing the timeout helps
> > that.
> >   
> 
> If the timer affects latency, then something is very wrong.  We're 
> lacking an adjustable window.
> 
> The way I see it, the notification window should be adjusted according 
> to the current workload.  If the link is idle, the window should be one 
> packet -- notify as soon as something is queued.  As the workload 
> increases, the window increases to (safety_factor * allowable_latency * 
> packet_rate).  The timer is set to allowable_latency to catch changes in 
> workload.
> 
> For example:
> 
> - allowable_latency 1ms (implies 1K vmexits/sec desired)
> - current packet_rate 20K packets/sec
> - safety_factor 0.8
> 
> So we request notifications every 0.8 * 20K/s * 1ms = 16 packets, and set 
> the timer to 1ms.  Usually we get a notification every 16 packets, just 
> before timer expiration.  If the workload increases, we get 
> notifications sooner, so we increase the window.  If the workload drops, 
> the timer fires and we decrease the window.
> 
> The timer should never fire on an all-out benchmark, or in a ping test.
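
The scheme quoted above can be sketched in a few lines of Python. This is
only an illustration under assumptions: the names notification_window and
AdaptiveWindow are hypothetical, the rate estimate is the simplest one that
fits the description, and rounding to the nearest packet is a choice made
here. The window follows the worked example (0.8 * 20K * 1ms = 16).

```python
def notification_window(packet_rate, allowable_latency_s=0.001, safety_factor=0.8):
    """Packets to queue before notifying; never less than one (idle link)."""
    return max(1, round(safety_factor * allowable_latency_s * packet_rate))


class AdaptiveWindow:
    """Hypothetical tracker that retunes the window from observed traffic."""

    def __init__(self, allowable_latency_s=0.001, safety_factor=0.8):
        self.allowable_latency_s = allowable_latency_s
        self.safety_factor = safety_factor
        self.window = 1  # idle link: notify as soon as something is queued

    def _retune(self, packets, elapsed_s):
        # Estimate the packet rate over the last interval (elapsed_s > 0).
        rate = packets / elapsed_s
        self.window = notification_window(
            rate, self.allowable_latency_s, self.safety_factor)

    def on_notification(self, elapsed_s):
        """The window filled after elapsed_s seconds, before the timer fired."""
        self._retune(self.window, elapsed_s)

    def on_timer(self, packets_queued):
        """The timer fired after allowable_latency_s with packets_queued seen."""
        self._retune(packets_queued, self.allowable_latency_s)
```

With a steady 20K packets/sec the window settles at 16 and fills in about
0.8ms, just before the 1ms timer. A faster fill grows the window, and a
timer expiry shrinks it.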

Yeah, I do like the sound of this.

However, since it requires a new guest feature and I don't expect it'll
improve the situation over the proposed patch until we have copyless
transmit, I think we should do this as part of the copyless effort.

One thing I'd worry about with this scheme is all-out receive: any delay
in returning a TCP ACK to the sending side might cause us to hit the TCP
window size.
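
A rough way to see the ACK concern: a TCP sender can have at most one
window of data in flight per round trip, so any delay added to the ACK
path lowers the throughput ceiling. The Python sketch below uses that
bound only. The function names and the example figures (64 KiB window,
100us base round trip, 1ms of added ACK delay) are assumptions, not
measurements from this thread.

```python
def max_tcp_throughput_bps(window_bytes, rtt_s):
    """Upper bound in bits/sec: at most one window in flight per round trip."""
    return window_bytes * 8 / rtt_s


def throughput_with_ack_delay(window_bytes, base_rtt_s, ack_delay_s):
    """Same bound, with the ACK held back by ack_delay_s on each round trip."""
    return max_tcp_throughput_bps(window_bytes, base_rtt_s + ack_delay_s)


# Assumed figures for illustration only.
window = 64 * 1024
undelayed = max_tcp_throughput_bps(window, 0.0001)
delayed = throughput_with_ack_delay(window, 0.0001, 0.001)
slowdown = undelayed / delayed  # roughly 11x lower ceiling in this example
```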

> > Second is that currently when we fill up the ring we block the guest
> > vcpu and flush. Thus, while we're copying an entire ring full of packets
> > the guest isn't making progress. Doing the copying in the I/O thread
> > helps there.
> >   
> 
> We're hurting our cache, and this won't work well with many nics.  At 
> the very least this should be done in a dedicated thread.

A thread per nic is doable, but it'd be especially tricky on the receive
side without more "short-cut the one producer, one consumer case" work.

Cheers,
Mark.


Thread overview: 27+ messages
2008-10-30 17:51 [PATCH 0/6] Kill off the virtio_net tx mitigation timer Mark McLoughlin
2008-10-30 17:51 ` [PATCH 1/6] kvm: qemu: virtio: remove unused variable Mark McLoughlin
2008-10-30 17:51   ` [PATCH 2/6] kvm: qemu: dup the qemu_eventfd() return Mark McLoughlin
2008-10-30 17:51     ` [PATCH 3/6] kvm: qemu: add qemu_eventfd_write() and qemu_eventfd_read() Mark McLoughlin
2008-10-30 17:51       ` [PATCH 4/6] kvm: qemu: aggregate reads from eventfd Mark McLoughlin
2008-10-30 17:51         ` [PATCH 5/6] kvm: qemu: virtio-net: handle all tx in I/O thread without timer Mark McLoughlin
2008-10-30 17:51           ` [PATCH 6/6] kvm: qemu: virtio-net: drop mutex during tx tapfd write Mark McLoughlin
2008-11-04 11:43             ` Avi Kivity
2008-10-30 19:24           ` [PATCH 5/6] kvm: qemu: virtio-net: handle all tx in I/O thread without timer Anthony Liguori
2008-10-31  9:16             ` Mark McLoughlin
2008-11-03 15:07               ` Mark McLoughlin
2008-11-02  9:56           ` Avi Kivity
2008-11-04 15:23           ` David S. Ahern
2008-11-06 17:02             ` Mark McLoughlin
2008-11-06 17:13               ` David S. Ahern
2008-11-06 17:43               ` Avi Kivity
2008-10-30 19:20 ` [PATCH 0/6] Kill off the virtio_net tx mitigation timer Anthony Liguori
2008-11-02  9:48 ` Avi Kivity
2008-11-03 12:23   ` Mark McLoughlin
2008-11-03 12:40     ` Avi Kivity
2008-11-03 15:04       ` Mark McLoughlin [this message]
2008-11-03 15:19         ` Avi Kivity
2008-11-06 16:46           ` Mark McLoughlin
2008-11-06 17:38             ` Avi Kivity
2008-11-06 17:45       ` Mark McLoughlin
2008-11-09 11:29         ` Avi Kivity
2008-11-02  9:57 ` Avi Kivity
