qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Anthony Liguori <anthony@codemonkey.ws>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: [Qemu-devel] Re: [PATCH-RFC 13/13] virtio-net: connect to vhost net backend
Date: Mon, 25 Jan 2010 13:58:08 -0600	[thread overview]
Message-ID: <4B5DF7D0.7050609@codemonkey.ws> (raw)
In-Reply-To: <20100111172326.GB12084@redhat.com>

On 01/11/2010 11:23 AM, Michael S. Tsirkin wrote:
> start/stop backend on driver start/stop
>
> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> ---
>   hw/virtio-net.c |   40 ++++++++++++++++++++++++++++++++++++++++
>   1 files changed, 40 insertions(+), 0 deletions(-)
>
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index c2a389f..99169e1 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -17,6 +17,7 @@
>   #include "net/tap.h"
>   #include "qemu-timer.h"
>   #include "virtio-net.h"
> +#include "vhost_net.h"
>
>   #define VIRTIO_NET_VM_VERSION    11
>
> @@ -47,6 +48,7 @@ typedef struct VirtIONet
>       uint8_t nomulti;
>       uint8_t nouni;
>       uint8_t nobcast;
> +    uint8_t vhost_started;
>       struct {
>           int in_use;
>           int first_multi;
> @@ -114,6 +116,10 @@ static void virtio_net_reset(VirtIODevice *vdev)
>       n->nomulti = 0;
>       n->nouni = 0;
>       n->nobcast = 0;
> +    if (n->vhost_started) {
> +        vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev);
> +        n->vhost_started = 0;
> +    }
>
>       /* Flush any MAC and VLAN filter table state */
>       n->mac_table.in_use = 0;
> @@ -820,6 +826,36 @@ static NetClientInfo net_virtio_info = {
>       .link_status_changed = virtio_net_set_link_status,
>   };
>
> +static void virtio_net_set_status(struct VirtIODevice *vdev)
> +{
> +    VirtIONet *n = to_virtio_net(vdev);
> +    if (!n->nic->nc.peer) {
> +        return;
> +    }
> +    if (n->nic->nc.peer->info->type != NET_CLIENT_TYPE_TAP) {
> +        return;
> +    }
> +
> +    if (!tap_get_vhost_net(n->nic->nc.peer)) {
> +        return;
> +    }
> +    if (!!n->vhost_started == !!(vdev->status&  VIRTIO_CONFIG_S_DRIVER_OK)) {
> +        return;
> +    }
> +    if (vdev->status&  VIRTIO_CONFIG_S_DRIVER_OK) {
> +        int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), vdev);
> +        if (r<  0) {
> +            fprintf(stderr, "unable to start vhost net: "
> +                    "falling back on userspace virtio\n");
> +        } else {
> +            n->vhost_started = 1;
> +        }
> +    } else {
> +        vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev);
> +        n->vhost_started = 0;
> +    }
> +}
> +
>    

This function does a number of bad things.  It makes virtio-net have 
specific knowledge of backends (like tap) and then has virtio-net pass 
device specific state (vdev) to a network backend.

Ultimately, the following things need to happen:

1) when a driver is ready to begin operating:
   a) virtio-net needs to tell vhost the location of the ring in 
physical memory
   b) virtio-net needs to tell vhost about any notification it receives 
(allowing kvm to shortcut userspace)
   c) virtio-net needs to tell vhost about which irq is being used 
(allowing kvm to shortcut userspace)

What this suggests is that we need an API for the network backends to do 
the following:

  1) probe whether ring passthrough is supported
  2) set the *virtual* address of the ring elements
  3) hand the backend some sort of notification object for sending and 
receiving notifications
  4) stop ring passthrough

vhost should not need any direct knowledge of the device.  This 
interface should be enough to communicating the required data.  I think 
the only bit that is a little fuzzy and perhaps non-obvious for the 
current patches is the notification object.  I would expect it look 
something like:

typedef struct IOEvent {
   int type;
   void (*notify)(IOEvent *);
   void (*on_notify)(IOEvent *, void (*cb)(IOEvent *, void *));
} IOEvent;

And then we would have

typedef struct KVMIOEvent {
   IOEvent event = {.type = KVM_IO_EVENT};
   int fd;
} KVMIOEvent;

if (kvm_enabled()) in virtio-net, we would allocate a KVMIOEvent for the 
PIO notification and hand that to vhost.  vhost would check event.type 
and if it's KVM_IOEVENT, downcast and use that to get at the file 
descriptor.

The KVMIOEvent should be created, not in the set status callback, but in 
the PCI map callback.  And in the PCI map callback, cpu_single_env 
should be passed to a kvm specific function to create this KVMIOEvent 
object that contains the created eventfd() that's handed to kvm via ioctl.

It doesn't have to be exactly like this, but the point of all of this is 
that these KVM specific mechanism (which are really implementation 
details) should not be pervasive throughout the QEMU interfaces.  There 
should also be strong separation between the vhost-net code and the 
virtio-net device.

Regards,

Anthony Liguori


>   VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
>   {
>       VirtIONet *n;
> @@ -835,6 +871,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
>       n->vdev.set_features = virtio_net_set_features;
>       n->vdev.bad_features = virtio_net_bad_features;
>       n->vdev.reset = virtio_net_reset;
> +    n->vdev.set_status = virtio_net_set_status;
>       n->rx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_rx);
>       n->tx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_tx);
>       n->ctrl_vq = virtio_add_queue(&n->vdev, 64, virtio_net_handle_ctrl);
> @@ -864,6 +901,9 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
>   void virtio_net_exit(VirtIODevice *vdev)
>   {
>       VirtIONet *n = DO_UPCAST(VirtIONet, vdev, vdev);
> +    if (n->vhost_started) {
> +        vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev);
> +    }
>
>       qemu_purge_queued_packets(&n->nic->nc);
>
>    

  reply	other threads:[~2010-01-25 19:58 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cover.1263230079.git.mst@redhat.com>
2010-01-11 17:16 ` [Qemu-devel] [PATCH-RFC 01/13] virtio: export virtqueue structure Michael S. Tsirkin
2010-01-12 22:32   ` [Qemu-devel] " Anthony Liguori
2010-01-25 19:10     ` Michael S. Tsirkin
2010-01-25 19:23       ` Anthony Liguori
2010-01-11 17:17 ` [Qemu-devel] [PATCH-RFC 02/13] kvm: add API to set ioeventfd Michael S. Tsirkin
2010-01-12 22:35   ` [Qemu-devel] " Anthony Liguori
2010-01-25 19:20     ` Michael S. Tsirkin
2010-01-25 19:28       ` Anthony Liguori
2010-01-25 19:28         ` Michael S. Tsirkin
2010-01-25 19:44           ` Anthony Liguori
2010-01-11 17:17 ` [Qemu-devel] [PATCH-RFC 03/13] virtio: add iofd/irqfd support Michael S. Tsirkin
2010-01-12 22:36   ` [Qemu-devel] " Anthony Liguori
2010-01-13 10:50     ` Michael S. Tsirkin
2010-01-11 17:17 ` [Qemu-devel] [PATCH-RFC 04/13] virtio-pci: fill in irqfd/queufd support Michael S. Tsirkin
2010-01-11 17:19 ` [Qemu-devel] [PATCH-RFC 05/13] syborg_virtio: add irqfd/eventfd support Michael S. Tsirkin
2010-01-11 17:20 ` [Qemu-devel] [PATCH-RFC 06/13] s390-virtio: fill in irqfd support Michael S. Tsirkin
2010-01-11 17:20 ` [Qemu-devel] [PATCH-RFC 07/13] virtio: move typedef to qemu-common Michael S. Tsirkin
2010-01-11 17:20 ` [Qemu-devel] [PATCH-RFC 08/13] net/tap: add interface to get device fd Michael S. Tsirkin
2010-01-11 17:22 ` [Qemu-devel] [PATCH-RFC 09/13] tap: add vhost/vhostfd options Michael S. Tsirkin
2010-01-12 22:39   ` [Qemu-devel] " Anthony Liguori
2010-01-13 10:59     ` Michael S. Tsirkin
2010-01-12 22:42   ` Anthony Liguori
2010-01-11 17:22 ` [Qemu-devel] [PATCH-RFC 10/13] tap: add API to retrieve vhost net header Michael S. Tsirkin
2010-01-11 17:23 ` [Qemu-devel] [PATCH-RFC 12/13] virtio: add status change callback Michael S. Tsirkin
2010-01-11 17:23 ` [Qemu-devel] [PATCH-RFC 13/13] virtio-net: connect to vhost net backend Michael S. Tsirkin
2010-01-25 19:58   ` Anthony Liguori [this message]
2010-01-25 20:27     ` [Qemu-devel] " Michael S. Tsirkin
2010-01-25 21:00       ` Anthony Liguori
2010-01-25 21:01         ` Michael S. Tsirkin
2010-01-25 21:05         ` Michael S. Tsirkin
2010-01-25 21:11           ` Anthony Liguori
2010-02-24  3:14         ` Paul Brook
2010-02-24  5:29           ` Michael S. Tsirkin
2010-02-24 11:30             ` Paul Brook
2010-02-24 11:46               ` Michael S. Tsirkin
2010-02-24 12:26                 ` Paul Brook
2010-02-24 12:40                   ` Michael S. Tsirkin
2010-02-24 15:16                 ` Anthony Liguori
2010-02-24 14:51           ` Anthony Liguori
2010-01-11 17:23 ` [Qemu-devel] [PATCH-RFC 11/13] vhost net support Michael S. Tsirkin
2010-01-12 22:45   ` [Qemu-devel] " Anthony Liguori

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B5DF7D0.7050609@codemonkey.ws \
    --to=anthony@codemonkey.ws \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).