From: Luke Gorrie
Date: Mon, 21 Oct 2013 12:29:02 +0200
To: Stefan Hajnoczi
Cc: snabb-devel@googlegroups.com, qemu-devel, Michael S. Tsirkin
Subject: Re: [Qemu-devel] snabbswitch integration with QEMU for userspace ethernet I/O
In-Reply-To: <20130528115843.GB15905@stefanha-thinkpad.redhat.com>

Hi all,

Back in May we talked about efficiently connecting a user-space
Ethernet switch to QEMU guests. Stefan Hajnoczi sketched the design of
a userspace version of vhost that uses a Unix socket for its control
interface. His design is in the mail quoted below.

I'd like to ask you: if this feature were properly implemented and
maintained, would you accept it into QEMU?

If so, I will work with a good QEMU hacker to develop it.

Also, have there been any new developments in this area (vhost-net and
userspace Ethernet I/O) that we should take into account?

On 28 May 2013 13:58, Stefan Hajnoczi <stefanha@redhat.com> wrote:
On Tue, May 28, 2013 at 12:10:50PM +0200, Luke Gorrie wrote:
> On 27 May 2013 11:34, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> > vhost_net is about connecting a virtio-net-speaking process to a
> > tun-like device.  The problem you are trying to solve is connecting a
> > virtio-net-speaking process to Snabb Switch.
> >
>
> Yep!
>
>
> > Either you need to replace vhost or you need a tun-like device
> > interface.
> >
> > Replacing vhost would mean that your switch implements virtio-net,
> > shares guest RAM with the guest, and shares the ioeventfd and irqfd
> > which are used to signal to and from the guest.
>
>
> This would be a great solution from my perspective. This is the design that
> I am now struggling to find a good implementation strategy for.

The switch needs 3 resources for direct virtio-net communication with
the guest:

1. Shared memory access to guest physical memory, for translating guest
   physical addresses to host userspace addresses.  vhost and data plane
   automatically get access to guest memory, and they learn about the
   memory layout using the MemoryListener interface in QEMU (see
   hw/virtio/vhost.c:vhost_region_add() and friends).

2. Virtqueue kick notifier (ioeventfd) so the switch knows when the
   guest signals the host.  See virtio_queue_get_host_notifier(vq).

3. Guest interrupt notifier (irqfd) so the switch can signal the guest.
   See virtio_queue_get_guest_notifier(vq).
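
To make those three resources concrete, here is a rough, untested C
sketch of a per-port loop on the switch side.  All names are
illustrative (not a real API), and the vring processing is elided:

#include <stdint.h>
#include <unistd.h>

/* Rough sketch: assumes the switch already received the two eventfds
 * from QEMU and mmap()ed guest memory at guest_mem. */
static void port_run(int ioeventfd, int irqfd, uint8_t *guest_mem)
{
    uint64_t n;

    /* (2) Block until the guest kicks a virtqueue; the 8-byte read
     * clears the eventfd counter (number of coalesced kicks). */
    while (read(ioeventfd, &n, sizeof(n)) == sizeof(n)) {
        /* (1) Walk the vring: descriptors carry guest physical
         * addresses, resolved by offsetting into guest_mem. */
        /* process_vring(guest_mem); -- elided */

        /* (3) Signal the guest that buffers were used: writing a
         * non-zero count to the irqfd injects the interrupt. */
        uint64_t one = 1;
        write(irqfd, &one, sizeof(one));
    }
}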

I don't have a detailed suggestion for how to interface the switch and
QEMU processes.  It may be necessary to communicate back and forth (to
handle the virtio device lifecycle), so a UNIX domain socket would be
appropriate for passing file descriptors (see the SCM_RIGHTS sketch
after the protocol example below).  Here is a rough idea:

$ switch --listen-path=/var/run/switch.sock
$ qemu --device virtio-net-pci,switch=/var/run/switch.sock

On QEMU startup:

(switch socket) add_port --id="qemu-$PID" --session-persistence

(Here --session-persistence means that the port will be automatically
destroyed if the switch socket session is terminated because the UNIX
domain socket is closed by QEMU.)
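
A hypothetical way to implement --session-persistence is to watch the
control connection for hangup, assuming one UNIX socket per session:

#include <poll.h>

/* Returns non-zero once QEMU's control connection has gone away. */
static int session_closed(int sock)
{
    struct pollfd pfd = { .fd = sock, .events = POLLIN };

    if (poll(&pfd, 1, 0) < 0)
        return 1;                       /* treat errors as hangup */
    return pfd.revents & (POLLHUP | POLLERR);
}

The switch's event loop would tear the port down once this returns
non-zero.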

On virtio device status transition to DRIVER_OK:

(switch socket) configure_port --id="qemu-$PID"
                --mem=/tmp/shm/qemu-$PID
                --ioeventfd=2
                --irqfd=3

On virtio device status transition from DRIVER_OK:

(switch socket) deconfigure_port --id="qemu-$PID"
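
For --ioeventfd and --irqfd to arrive as usable descriptor numbers, the
control messages would have to carry the descriptors as SCM_RIGHTS
ancillary data over the UNIX domain socket.  A minimal, untested
receive helper (illustrative only):

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Receive one file descriptor attached to a control message. */
static int recv_fd(int sock)
{
    char data;                          /* one data byte must ride along */
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;           /* forces correct alignment */
    } ctrl;
    struct iovec iov = { .iov_base = &data, .iov_len = 1 };
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = ctrl.buf, .msg_controllen = sizeof(ctrl.buf),
    };
    struct cmsghdr *cmsg;
    int fd;

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;                      /* error or EOF */
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;                      /* no descriptor attached */
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
    return fd;
}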

I skipped a bunch of things:

1. virtio-net has several virtqueues, so you need multiple ioeventfds.

2. QEMU needs to communicate memory mapping information; this gets
   especially interesting with memory hotplug.  Memory is more
   complicated than a single shmem blob (a region-table sketch follows
   this list).

3. Multiple NICs per guest should be supported.
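
On point 2, one simple-minded approach is a table of memory regions
that the switch rebuilds whenever QEMU sends an updated memory map; a
hypothetical translation helper:

#include <stddef.h>
#include <stdint.h>

struct mem_region {
    uint64_t gpa;       /* guest physical start address */
    uint64_t size;      /* length of the region in bytes */
    void    *hva;       /* where the switch mmap()ed it */
};

/* Translate a guest physical address to a host virtual address,
 * or NULL if no shared region backs it. */
static void *gpa_to_hva(const struct mem_region *tab, size_t n,
                        uint64_t gpa)
{
    for (size_t i = 0; i < n; i++) {
        if (gpa >= tab[i].gpa && gpa - tab[i].gpa < tab[i].size)
            return (uint8_t *)tab[i].hva + (gpa - tab[i].gpa);
    }
    return NULL;
}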

Stefan
