virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* vhost questions.
@ 2013-03-13  6:49 Rusty Russell
  2013-03-13 15:59 ` Michael S. Tsirkin
  0 siblings, 1 reply; 7+ messages in thread
From: Rusty Russell @ 2013-03-13  6:49 UTC (permalink / raw)
  To: mst; +Cc: virtualization

OK, I've been trying to read the vhost and vhost net code, and I'm
struggling with basic questions.

1) Why do we allow userspace to change an already-set-up vhost device?
   Why not have:
        open()
        ioctl(VHOST_GET_FEATURES)
        ioctl(VHOST_SET_VRING) x n (sets num, addresses, kick/call/err fds)
        ioctl(VHOST_NET_SETUP)
        ...
        close()

   You're not allowed to call things twice: to reset, you close and
   reopen.  That would save a lot of code which I'm not sure is correct
   anyway.

2) Why do we implement write logging?  Shouldn't qemu just switch to
   emulation if it wants to track writes?

3) Why do we have our own workqueue implementation?

vhost/net.c questions:

1) Why do we wake the net tx worker every time the reference count of
   the number of outstanding DMAs is a multiple of 16?

2) Why is vhost_ubuf_ref kmalloced, rather than a structure member?

3) Why don't we simply use a slab allocator to allocate a structure for
   outstanding xmit dmas?  Wouldn't that be far simpler?  Would it be
   slower?

4) Why are the following fields in struct vhost_virtqueue, not struct
   vhost_net?

	/* hdr is used to store the virtio header.
	 * Since each iovec has >= 1 byte length, we never need more than
	 * header length entries to store the header. */
	struct iovec hdr[sizeof(struct virtio_net_hdr_mrg_rxbuf)];
	size_t vhost_hlen;
	size_t sock_hlen;
	struct vring_used_elem *heads;

        ...

	/* vhost zerocopy support fields below: */
	/* last used idx for outstanding DMA zerocopy buffers */
	int upend_idx;
	/* first used idx for DMA done zerocopy buffers */
	int done_idx;
	/* an array of userspace buffers info */
	struct ubuf_info *ubuf_info;
	/* Reference counting for outstanding ubufs.
	 * Protected by vq mutex. Writers must also take device mutex. */
	struct vhost_ubuf_ref *ubufs;

5) Why does vhost_net use the sendmsg interface for tun (and macvlan?)?
   It's not usable from userspace (it looks like if you can sendmsg to a
   tun device you can make it call an arbitrary pointer though).

   It would be better for us to create the skb, and have a special
   interface to inject it directly.  This way we can handle the
   UIO_MAXIOV limit ourselves (eg. by not doing zero copy on such stupid
   cases).

6) By calling tun_get_socket() and macvtap_get_socket() we are forcing
   those modules to load.  We should use symbol_get() to avoid this.

In short, this code is a mess.  Assuming we can't just break the API,
we've got a long process ahead of us to get it into shape :(

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-03-18  8:06 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-13  6:49 vhost questions Rusty Russell
2013-03-13 15:59 ` Michael S. Tsirkin
2013-03-14  6:38   ` Rusty Russell
2013-03-14  9:33     ` Michael S. Tsirkin
2013-03-18  0:50       ` Rusty Russell
2013-03-18  8:06         ` Michael S. Tsirkin
2013-03-17 11:09     ` Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).