qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Antonios Motakis <a.motakis@virtualopensystems.com>
Cc: Luke Gorrie <lukego@gmail.com>,
	snabb-devel@googlegroups.com,
	Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>,
	qemu-devel qemu-devel <qemu-devel@nongnu.org>,
	VirtualOpenSystems Technical Team <tech@virtualopensystems.com>
Subject: Re: [Qemu-devel] [PATCH v6 0/8] Vhost and vhost-net support for userspace based backends
Date: Wed, 15 Jan 2014 10:54:43 +0200	[thread overview]
Message-ID: <20140115085443.GA1719@redhat.com> (raw)
In-Reply-To: <CAG8rG2wZBVHe70bpC3jOBfQeqb4a2xT-+okh2Rx_jTEcUg9-Pg@mail.gmail.com>

On Tue, Jan 14, 2014 at 07:13:29PM +0100, Antonios Motakis wrote:
> Hello,
> 
> 
> On Tue, Jan 14, 2014 at 12:14 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> 
>     On Mon, Jan 13, 2014 at 03:25:11PM +0100, Antonios Motakis wrote:
>     > In this patch series we would like to introduce our approach for putting
>     a
>     > virtio-net backend in an external userspace process. Our eventual target
>     is to
>     > run the network backend in the Snabbswitch ethernet switch, while
>     receiving
>     > traffic from a guest inside QEMU/KVM which runs an unmodified virtio-net
>     > implementation.
>     >
>     > For this, we are working into extending vhost to allow equivalent
>     functionality
>     > for userspace. Vhost already passes control of the data plane of
>     virtio-net to
>     > the host kernel; we want to realize a similar model, but for userspace.
>     >
>     > In this patch series the concept of a vhost-backend is introduced.
>     >
>     > We define two vhost backend types - vhost-kernel and vhost-user. The
>     former is
>     > the interface to the current kernel module implementation. Its control
>     plane is
>     > ioctl based. The data plane is the kernel directly accessing the QEMU
>     allocated,
>     > guest memory.
>     >
>     > In the new vhost-user backend, the control plane is based on
>     communication
>     > between QEMU and another userspace process using a unix domain socket.
>     This
>     > allows to implement a virtio backend for a guest running in QEMU, inside
>     the
>     > other userspace process.
>     >
>     > We change -mem-path to QemuOpts and add prealloc, share and unlink as
>     properties
>     > to it. HugeTLBFS requirements of -mem-path are relaxed, so any valid path
>     can
>     > be used now.
> 
>     Wait a second. This does not actually work well: if you mmap
>     a random file outside HugeTLBFS, kernel won't create huge pages
>     from this memory so performance of the system as a whole will suffer.
> 
>     You'll have fix the kernel MM before this scheme can fly.
> 
> 
> I'm not sure I completely understand this point. It is up to the user to choose
> not to use HugeTLBFS. Is there a particular problem with the kernel when not
> using it?
>  

Yes. Linux supports transparent huge pages.

A bunch of anonymous pages are recombined into a single huge page,
all this without the drawbacks of HugeTLBFS and with no need for
special priveledges.

This is activated by default for qemu, by this line:

	qemu_madvise(new_block->host, size, QEMU_MADV_HUGEPAGE);

this does not have effect for named pages which you get
if you mmap a file from a filesystem, so performance
goes down (unless you use HugeTLBFS but that needs special priveledges).

A different -mem-path having that effect is undesirable and unexpected.

> 
> 
>     > The new properties allow more fine grained control over the guest
>     > RAM backing store.
>     >
>     > The data path is realized by directly accessing the vrings and the buffer
>     data
>     > off the guest's memory.
>     >
>     > The current user of vhost-user is only vhost-net. We add new netdev
>     backend
>     > that is intended to initialize vhost-net with vhost-user backend.
>     >
>     > Example usage:
>     >
>     > qemu -m 1024 -mem-path /hugetlbfs,prealloc=on,share=on \
>     >      -netdev type=vhost-user,id=net0,path=/path/to/sock,poll_time=2500 \
>     >      -device virtio-net-pci,netdev=net0
>     >
>     > Changes from v5:
>     >  - Split -mem-path unlink option to a separate patch
>     >  - Fds are passed only in the ancillary data
>     >  - Stricter message size checks on receive/send
>     >  - Netdev vhost-user now includes path and poll_time options
>     >  - The connection probing interval is configurable
>     >
>     > Changes from v4:
>     >  - Use error_report for errors
>     >  - VhostUserMsg has new field `size` indicating the following payload
>     length.
>     >    Field `flags` now has version and reply bits. The structure is packed.
>     >  - Send data is of variable length (`size` field in message)
>     >  - Receive in 2 steps, header and payload
>     >  - Add new message type VHOST_USER_ECHO, to check connection status
>     >
>     > Changes from v3:
>     >  - Convert -mem-path to QemuOpts with prealloc, share and unlink
>     properties
>     >  - Set 1 sec timeout when read/write to the unix domain socket
>     >  - Fix file descriptor leak
>     >
>     > Changes from v2:
>     >  - Reconnect when the backend disappears
>     >
>     > Changes from v1:
>     >  - Implementation of vhost-user netdev backend
>     >  - Code improvements
>     >
>     > Antonios Motakis (8):
>     >   Convert -mem-path to QemuOpts and add prealloc and share properties
>     >   New -mem-path option - unlink.
>     >   Decouple vhost from kernel interface
>     >   Add vhost-user skeleton
>     >   Add domain socket communication for vhost-user backend
>     >   Add vhost-user calls implementation
>     >   Add new vhost-user netdev backend
>     >   Add vhost-user reconnection
>     >
>     >  exec.c                            |  57 +++-
>     >  hmp-commands.hx                   |   4 +-
>     >  hw/net/vhost_net.c                | 144 +++++++---
>     >  hw/net/virtio-net.c               |  42 ++-
>     >  hw/scsi/vhost-scsi.c              |  13 +-
>     >  hw/virtio/Makefile.objs           |   2 +-
>     >  hw/virtio/vhost-backend.c         | 556
>     ++++++++++++++++++++++++++++++++++++++
>     >  hw/virtio/vhost.c                 |  46 ++--
>     >  include/exec/cpu-all.h            |   3 -
>     >  include/hw/virtio/vhost-backend.h |  40 +++
>     >  include/hw/virtio/vhost.h         |   4 +-
>     >  include/net/vhost-user.h          |  17 ++
>     >  include/net/vhost_net.h           |  15 +-
>     >  net/Makefile.objs                 |   2 +-
>     >  net/clients.h                     |   3 +
>     >  net/hub.c                         |   1 +
>     >  net/net.c                         |   2 +
>     >  net/tap.c                         |  16 +-
>     >  net/vhost-user.c                  | 177 ++++++++++++
>     >  qapi-schema.json                  |  21 +-
>     >  qemu-options.hx                   |  24 +-
>     >  vl.c                              |  41 ++-
>     >  22 files changed, 1106 insertions(+), 124 deletions(-)
>     >  create mode 100644 hw/virtio/vhost-backend.c
>     >  create mode 100644 include/hw/virtio/vhost-backend.h
>     >  create mode 100644 include/net/vhost-user.h
>     >  create mode 100644 net/vhost-user.c
>     >
>     > --
>     > 1.8.3.2
>     >
> 
> 

  reply	other threads:[~2014-01-15  8:55 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-13 14:25 [Qemu-devel] [PATCH v6 0/8] Vhost and vhost-net support for userspace based backends Antonios Motakis
2014-01-13 14:25 ` [Qemu-devel] [PATCH v6 1/8] Convert -mem-path to QemuOpts and add prealloc and share properties Antonios Motakis
2014-01-13 14:25 ` [Qemu-devel] [PATCH v6 2/8] New -mem-path option - unlink Antonios Motakis
2014-01-14 11:16   ` Michael S. Tsirkin
2014-01-14 18:13     ` Antonios Motakis
2014-01-13 14:25 ` [Qemu-devel] [PATCH v6 3/8] Decouple vhost from kernel interface Antonios Motakis
2014-01-13 14:25 ` [Qemu-devel] [PATCH v6 4/8] Add vhost-user skeleton Antonios Motakis
2014-01-14 11:17   ` Michael S. Tsirkin
2014-01-13 14:25 ` [Qemu-devel] [PATCH v6 5/8] Add domain socket communication for vhost-user backend Antonios Motakis
2014-01-14 11:10   ` Michael S. Tsirkin
2014-01-14 18:14     ` Antonios Motakis
2014-01-15  9:13       ` Michael S. Tsirkin
2014-01-13 14:25 ` [Qemu-devel] [PATCH v6 6/8] Add vhost-user calls implementation Antonios Motakis
2014-01-14 11:21   ` Michael S. Tsirkin
2014-01-14 18:14     ` Antonios Motakis
2014-01-15  9:14       ` Michael S. Tsirkin
2014-01-15 10:08         ` Michael S. Tsirkin
2014-01-13 14:25 ` [Qemu-devel] [PATCH v6 7/8] Add new vhost-user netdev backend Antonios Motakis
2014-01-13 14:25 ` [Qemu-devel] [PATCH v6 8/8] Add vhost-user reconnection Antonios Motakis
2014-01-14 11:14 ` [Qemu-devel] [PATCH v6 0/8] Vhost and vhost-net support for userspace based backends Michael S. Tsirkin
2014-01-14 18:13   ` Antonios Motakis
2014-01-15  8:54     ` Michael S. Tsirkin [this message]
2014-01-14 11:33 ` Michael S. Tsirkin
2014-01-14 18:13   ` Antonios Motakis
2014-01-15  9:07     ` Michael S. Tsirkin
2014-01-15 12:50       ` Antonios Motakis
2014-01-15 14:49         ` Michael S. Tsirkin
2014-01-16 12:35           ` Antonios Motakis
2014-01-27 16:37           ` Antonios Motakis
2014-01-27 16:49             ` Michael S. Tsirkin
2014-01-29 12:04               ` Antonios Motakis
2014-01-29 14:10                 ` Michael S. Tsirkin
2014-01-21 13:32       ` Antonios Motakis
2014-01-15 10:05 ` Michael S. Tsirkin
2014-01-21 13:32   ` Antonios Motakis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140115085443.GA1719@redhat.com \
    --to=mst@redhat.com \
    --cc=a.motakis@virtualopensystems.com \
    --cc=lukego@gmail.com \
    --cc=n.nikolaev@virtualopensystems.com \
    --cc=qemu-devel@nongnu.org \
    --cc=snabb-devel@googlegroups.com \
    --cc=tech@virtualopensystems.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).