From: "Michael S. Tsirkin" <mst@redhat.com>
To: Antonios Motakis <a.motakis@virtualopensystems.com>
Cc: Luke Gorrie <lukego@gmail.com>,
	snabb-devel@googlegroups.com,
	Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>,
	qemu-devel qemu-devel <qemu-devel@nongnu.org>,
	VirtualOpenSystems Technical Team <tech@virtualopensystems.com>
Subject: Re: [Qemu-devel] [PATCH v6 0/8] Vhost and vhost-net support for userspace based backends
Date: Wed, 15 Jan 2014 10:54:43 +0200
Message-ID: <20140115085443.GA1719@redhat.com>
In-Reply-To: <CAG8rG2wZBVHe70bpC3jOBfQeqb4a2xT-+okh2Rx_jTEcUg9-Pg@mail.gmail.com>

On Tue, Jan 14, 2014 at 07:13:29PM +0100, Antonios Motakis wrote:
> Hello,
> 
> 
> On Tue, Jan 14, 2014 at 12:14 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> 
>     On Mon, Jan 13, 2014 at 03:25:11PM +0100, Antonios Motakis wrote:
>     > In this patch series we would like to introduce our approach for putting a
>     > virtio-net backend in an external userspace process. Our eventual target is to
>     > run the network backend in the Snabbswitch ethernet switch, while receiving
>     > traffic from a guest inside QEMU/KVM which runs an unmodified virtio-net
>     > implementation.
>     >
>     > For this, we are working into extending vhost to allow equivalent functionality
>     > for userspace. Vhost already passes control of the data plane of virtio-net to
>     > the host kernel; we want to realize a similar model, but for userspace.
>     >
>     > In this patch series the concept of a vhost-backend is introduced.
>     >
>     > We define two vhost backend types - vhost-kernel and vhost-user. The former is
>     > the interface to the current kernel module implementation. Its control plane is
>     > ioctl based. The data plane is the kernel directly accessing the QEMU allocated,
>     > guest memory.
>     >
>     > In the new vhost-user backend, the control plane is based on communication
>     > between QEMU and another userspace process using a unix domain socket. This
>     > allows to implement a virtio backend for a guest running in QEMU, inside the
>     > other userspace process.
>     >
>     > We change -mem-path to QemuOpts and add prealloc, share and unlink as properties
>     > to it. HugeTLBFS requirements of -mem-path are relaxed, so any valid path can
>     > be used now.
> 
>     Wait a second. This does not actually work well: if you mmap
>     a random file outside HugeTLBFS, kernel won't create huge pages
>     from this memory so performance of the system as a whole will suffer.
> 
>     You'll have to fix the kernel MM before this scheme can fly.
> 
> 
> I'm not sure I completely understand this point. It is up to the user to choose
> not to use HugeTLBFS. Is there a particular problem with the kernel when not
> using it?
>  

Yes. Linux supports transparent huge pages.

The kernel combines a run of anonymous pages into a single huge page,
all this without the drawbacks of HugeTLBFS and with no need for
special privileges.

QEMU enables this by default, with this line:

	qemu_madvise(new_block->host, size, QEMU_MADV_HUGEPAGE);

This has no effect on named (file-backed) pages, which are what you
get if you mmap a file from a filesystem, so performance
goes down (unless you use HugeTLBFS, but that needs special privileges).

Having a different -mem-path cause that is undesirable and unexpected.
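
To illustrate the anonymous-memory case, here is a minimal standalone
sketch. It is not QEMU code: the helper name alloc_anon_thp is made up
for this example, while mmap, madvise and MADV_HUGEPAGE are the standard
Linux interfaces. The hint is treated as best-effort, since madvise can
fail, for instance on a kernel built without transparent huge page support.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not part of QEMU): map `size` bytes of anonymous
 * memory and hint that the kernel may back it with transparent huge pages.
 *
 * Returns the mapping, or NULL if mmap failed. *hint_ok is set to 1 if the
 * kernel accepted the MADV_HUGEPAGE hint and to 0 otherwise; a rejected
 * hint is not an error, the memory is still usable with regular pages.
 */
static void *alloc_anon_thp(size_t size, int *hint_ok)
{
    assert(size > 0);

    /* Anonymous mapping: no file behind it, so THP can apply. */
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return NULL;
    }

#ifdef MADV_HUGEPAGE
    *hint_ok = (madvise(p, size, MADV_HUGEPAGE) == 0);
#else
    /* Headers without MADV_HUGEPAGE: nothing to request. */
    *hint_ok = 0;
#endif
    return p;
}
```

A caller would do something like `int ok; void *ram = alloc_anon_thp(64u << 20, &ok);`.
Whether huge pages actually end up being used then depends on the
system's THP settings. The same hint on a mapping of a regular file is
the case described above, where it does not help.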

> 
> 
>     > The new properties allow more fine grained control over the guest
>     > RAM backing store.
>     >
>     > The data path is realized by directly accessing the vrings and the buffer data
>     > off the guest's memory.
>     >
>     > The current user of vhost-user is only vhost-net. We add new netdev backend
>     > that is intended to initialize vhost-net with vhost-user backend.
>     >
>     > Example usage:
>     >
>     > qemu -m 1024 -mem-path /hugetlbfs,prealloc=on,share=on \
>     >      -netdev type=vhost-user,id=net0,path=/path/to/sock,poll_time=2500 \
>     >      -device virtio-net-pci,netdev=net0
>     >
>     > Changes from v5:
>     >  - Split -mem-path unlink option to a separate patch
>     >  - Fds are passed only in the ancillary data
>     >  - Stricter message size checks on receive/send
>     >  - Netdev vhost-user now includes path and poll_time options
>     >  - The connection probing interval is configurable
>     >
>     > Changes from v4:
>     >  - Use error_report for errors
>     >  - VhostUserMsg has new field `size` indicating the following payload length.
>     >    Field `flags` now has version and reply bits. The structure is packed.
>     >  - Send data is of variable length (`size` field in message)
>     >  - Receive in 2 steps, header and payload
>     >  - Add new message type VHOST_USER_ECHO, to check connection status
>     >
>     > Changes from v3:
>     >  - Convert -mem-path to QemuOpts with prealloc, share and unlink properties
>     >  - Set 1 sec timeout when read/write to the unix domain socket
>     >  - Fix file descriptor leak
>     >
>     > Changes from v2:
>     >  - Reconnect when the backend disappears
>     >
>     > Changes from v1:
>     >  - Implementation of vhost-user netdev backend
>     >  - Code improvements
>     >
>     > Antonios Motakis (8):
>     >   Convert -mem-path to QemuOpts and add prealloc and share properties
>     >   New -mem-path option - unlink.
>     >   Decouple vhost from kernel interface
>     >   Add vhost-user skeleton
>     >   Add domain socket communication for vhost-user backend
>     >   Add vhost-user calls implementation
>     >   Add new vhost-user netdev backend
>     >   Add vhost-user reconnection
>     >
>     >  exec.c                            |  57 +++-
>     >  hmp-commands.hx                   |   4 +-
>     >  hw/net/vhost_net.c                | 144 +++++++---
>     >  hw/net/virtio-net.c               |  42 ++-
>     >  hw/scsi/vhost-scsi.c              |  13 +-
>     >  hw/virtio/Makefile.objs           |   2 +-
>     >  hw/virtio/vhost-backend.c         | 556 ++++++++++++++++++++++++++++++++++++++
>     >  hw/virtio/vhost.c                 |  46 ++--
>     >  include/exec/cpu-all.h            |   3 -
>     >  include/hw/virtio/vhost-backend.h |  40 +++
>     >  include/hw/virtio/vhost.h         |   4 +-
>     >  include/net/vhost-user.h          |  17 ++
>     >  include/net/vhost_net.h           |  15 +-
>     >  net/Makefile.objs                 |   2 +-
>     >  net/clients.h                     |   3 +
>     >  net/hub.c                         |   1 +
>     >  net/net.c                         |   2 +
>     >  net/tap.c                         |  16 +-
>     >  net/vhost-user.c                  | 177 ++++++++++++
>     >  qapi-schema.json                  |  21 +-
>     >  qemu-options.hx                   |  24 +-
>     >  vl.c                              |  41 ++-
>     >  22 files changed, 1106 insertions(+), 124 deletions(-)
>     >  create mode 100644 hw/virtio/vhost-backend.c
>     >  create mode 100644 include/hw/virtio/vhost-backend.h
>     >  create mode 100644 include/net/vhost-user.h
>     >  create mode 100644 net/vhost-user.c
>     >
>     > --
>     > 1.8.3.2
>     >
> 
> 
