From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Rik van Riel <riel@redhat.com>, Andrew Jones <drjones@redhat.com>,
	Linux Virtualization <virtualization@lists.linux-foundation.org>,
	Luke Gorrie <luke@snabb.co>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [virtio-dev] Zerocopy VM-to-VM networking using virtio-net
Date: Mon, 27 Apr 2015 16:40:51 +0200
Message-ID: <20150427163920-mutt-send-email-mst@redhat.com>
In-Reply-To: <553E480B.1020502@siemens.com>

On Mon, Apr 27, 2015 at 04:30:35PM +0200, Jan Kiszka wrote:
> On 2015-04-27 at 15:01, Stefan Hajnoczi wrote:
> > On Mon, Apr 27, 2015 at 1:55 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >> On 2015-04-27 at 14:35, Jan Kiszka wrote:
> >>> On 2015-04-27 at 12:17, Stefan Hajnoczi wrote:
> >>>> On Sun, Apr 26, 2015 at 2:24 PM, Luke Gorrie <luke@snabb.co> wrote:
> >>>>> On 24 April 2015 at 15:22, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> >>>>>>
> >>>>>> The motivation for making VM-to-VM fast is that while software
> >>>>>> switches on the host are efficient today (thanks to vhost-user), there
> >>>>>> is no efficient solution if the software switch is a VM.
> >>>>>
> >>>>>
> >>>>> I see. This sounds like a noble goal indeed. I would love to run the
> >>>>> software switch as just another VM in the long term. It would make it much
> >>>>> easier for the various software switches to coexist in the world.
> >>>>>
> >>>>> The main technical risk I see in this proposal is that eliminating the
> >>>>> memory copies might not have the desired effect. I might be tempted to keep
> >>>>> the copies but prevent the kernel from having to inspect the vrings (more
> >>>>> like vhost-user). But that is just a hunch and I suppose the first step
> >>>>> would be a prototype to check the performance anyway.
> >>>>>
> >>>>> For what it is worth here is my view of networking performance on x86 in the
> >>>>> Haswell+ era:
> >>>>> https://groups.google.com/forum/#!topic/snabb-devel/aez4pEnd4ow
> >>>>
> >>>> Thanks.
> >>>>
> >>>> I've been thinking about how to eliminate the VM <-> host <-> VM
> >>>> switching and instead achieve just VM <-> VM.
> >>>>
> >>>> The holy grail of VM-to-VM networking is an exitless I/O path.  In
> >>>> other words, packets can be transferred between VMs without any
> >>>> vmexits (this requires a polling driver).
> >>>>
> >>>> Here is how it works.  QEMU gets "-device vhost-user" so that a VM can
> >>>> act as the vhost-user server:
> >>>>
> >>>> VM1 (virtio-net guest driver) <-> VM2 (vhost-user device)
> >>>>
> >>>> VM1 has a regular virtio-net PCI device.  VM2 has a vhost-user device
> >>>> and plays the host role instead of the normal virtio-net guest driver
> >>>> role.
> >>>>
> >>>> The ugly thing about this is that VM2 needs to map all of VM1's guest
> >>>> RAM so it can access the vrings and packet data.  The solution to this
> >>>> is something like the Shared Buffers BAR but this time it contains not
> >>>> just the packet data but also the vring, let's call it the Shared
> >>>> Virtqueues BAR.
> >>>>
> >>>> The Shared Virtqueues BAR eliminates the need for vhost-net on the
> >>>> host because VM1 and VM2 communicate directly using virtqueue notify
> >>>> or polling vring memory.  Virtqueue notify works by connecting an
> >>>> eventfd as ioeventfd in VM1 and irqfd in VM2.  And VM2 would also have
> >>>> an ioeventfd that is irqfd for VM1 to signal completions.
> >>>
> >>> We had such a discussion before:
> >>> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/123014/focus=279658
> >>>
> >>> Would be great to get this ball rolling again.
> >>>
> >>> Jan
> >>>
> >>
> >> But one challenge would remain even then (unless both sides only poll):
> >> exit-free inter-VM signaling, no? But that's a hardware issue first of all.
> > 
> > To start with ioeventfd<->irqfd can be used.  It incurs a light-weight
> > exit in VM1 and interrupt injection in VM2.
> > 
> > For networking the cost is mitigated by NAPI drivers which switch
> > between interrupts and polling.  During notification-heavy periods the
> > guests would use polling anyway.
> > 
> > A hardware solution would be some kind of inter-guest interrupt
> > injection.  I don't know VMX well enough to know whether that is
> > possible on Intel CPUs.
> 
> Today, we have posted interrupts to avoid the vm-exit on the target CPU,
> but there is nothing yet (to my best knowledge) to avoid the exit on the
> sender side (unless we ignore security). That's the same problem with
> intra-guest IPIs, BTW.
> 
> For throughput and given NAPI patterns, that's probably not an issue as
> you noted. It may be for latency, though, when almost every cycle counts.
> 
> Jan

If you are counting cycles, you likely can't afford the
interrupt latency under Linux, so you have to poll
memory.
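
To make "poll memory" concrete, here is a minimal sketch of the idea, not
code from any existing driver. It assumes a hypothetical one-slot mailbox
(struct mailbox, mailbox_publish, mailbox_poll are names made up for this
illustration) placed in memory that both sides can see. The consumer spins
on a sequence counter instead of waiting for an interrupt, so no exit or
interrupt injection sits on the fast path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-slot mailbox living in memory shared by both sides. */
struct mailbox {
    _Atomic uint32_t seq;  /* incremented by the producer after each write */
    uint32_t payload;      /* data published by the matching seq increment */
};

/* Hint to the CPU that we are spinning; compiles to nothing off x86. */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();
#endif
}

static inline void mailbox_init(struct mailbox *mb)
{
    atomic_init(&mb->seq, 0);
    mb->payload = 0;
}

/* Producer: write the payload, then publish it with a release increment. */
static inline void mailbox_publish(struct mailbox *mb, uint32_t value)
{
    mb->payload = value;
    atomic_fetch_add_explicit(&mb->seq, 1, memory_order_release);
}

/*
 * Consumer: spin for at most max_spins iterations waiting for seq to move
 * past *last_seq. Returns true and fills *out when new data was seen.
 * Assumption: the producer does not publish again until the consumer has
 * read the slot; a real ring would track ownership per descriptor.
 */
static inline bool mailbox_poll(struct mailbox *mb, uint32_t *last_seq,
                                uint32_t *out, uint64_t max_spins)
{
    assert(mb && last_seq && out);
    for (uint64_t i = 0; i < max_spins; i++) {
        uint32_t seq = atomic_load_explicit(&mb->seq, memory_order_acquire);
        if (seq != *last_seq) {
            *out = mb->payload;
            *last_seq = seq;
            return true;
        }
        cpu_relax();
    }
    return false;
}
```

The bounded spin count is a design choice for the sketch: a caller that
cannot dedicate a core to polling could fall back to a notification
(for example an eventfd) once mailbox_poll() returns false.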

> -- 
> Siemens AG, Corporate Technology, CT RTC ITP SES-DE
> Corporate Competence Center Embedded Linux
