From: "Michael S. Tsirkin" <mst@redhat.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: xiaohui.xin@intel.com, netdev@vger.kernel.org,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org, mingo@elte.hu,
	davem@davemloft.net, jdike@linux.intel.com
Subject: Re: [RFC][PATCH v3 1/3] A device for zero-copy based on KVM virtio-net.
Date: Wed, 14 Apr 2010 19:16:10 +0300
Message-ID: <20100414161610.GA10897@redhat.com>
In-Reply-To: <201004141757.54829.arnd@arndb.de>

On Wed, Apr 14, 2010 at 05:57:54PM +0200, Arnd Bergmann wrote:
> On Wednesday 14 April 2010, Michael S. Tsirkin wrote:
> > On Wed, Apr 14, 2010 at 04:55:21PM +0200, Arnd Bergmann wrote:
> > > On Friday 09 April 2010, xiaohui.xin@intel.com wrote:
> > > > From: Xin Xiaohui <xiaohui.xin@intel.com>
> >
> > > It seems that you are duplicating a lot of functionality that
> > > is already in macvtap. I've asked about this before but then
> > > didn't look at your newer versions. Can you explain the value
> > > of introducing another interface to user land?
> > 
> > Hmm, I have not noticed a lot of duplication.
> 
> The code is indeed quite distinct, but the idea of adding another
> character device to pass into vhost for direct device access is not.

All backends besides tap seem to do this, btw :)

> > BTW macvtap also duplicates tun code, it might be
> > a good idea for tun to export some functionality.
> 
> Yes, that's something I plan to look into.
> 
> > > I'm still planning to add zero-copy support to macvtap,
> > > hopefully reusing parts of your code, but do you think there
> > > is value in having both?
> > 
> > If macvtap got zero-copy tx and rx, maybe not. But
> > it's not immediately obvious whether zero-copy support
> > for macvtap would work, especially for zero-copy rx.
> > The approach with mpassthru is much simpler in that
> > it takes complete control of the device.
> 
> As far as I can tell, the most significant limitation of mpassthru
> is that there can only ever be a single guest on a physical NIC.
> 
> Given that limitation, I believe we can do the same on macvtap,
> and simply disable zero-copy RX when you want to use more than one
> guest, or both guest and host on the same NIC.
> 
> The logical next step here would be to allow VMDq and similar
> technologies to separate out the RX traffic in the hardware.
> We don't have a configuration interface for that yet, but
> since this is logically the same as macvlan, I think we should
> use the same interfaces for both, essentially treating VMDq
> as a hardware acceleration for macvlan. We can probably handle
> it in similar ways to how we handle hardware support for vlan.
> 
> At that stage, macvtap would be the logical interface for
> connecting a VMDq (hardware macvlan) device to a guest!

I won't object to that but ... code walks.

> > > > +static ssize_t mp_chr_aio_write(struct kiocb *iocb, const struct iovec *iov,
> > > > +				unsigned long count, loff_t pos)
> > > > +{
> > > > +	struct file *file = iocb->ki_filp;
> > > > +	struct mp_struct *mp = mp_get(file->private_data);
> > > > +	struct sock *sk = mp->socket.sk;
> > > > +	struct sk_buff *skb;
> > > > +	int len, err;
> > > > +	ssize_t result;
> > > 
> > > Can you explain what this function is even there for? AFAICT, vhost-net
> > > doesn't call it, the interface is incompatible with the existing
> > > tap interface, and you don't provide a read function.
> > 
> > qemu needs the ability to inject raw packets into device
> > from userspace, bypassing vhost/virtio (for live migration).
> 
> Ok, but since there is only a write callback and no read, it won't
> actually be able to do this with the current code, right?

I think it'll work as is: with vhost, qemu only ever writes to
the device and never reads from it. We'll also never need GSO etc.,
which is a large part of what tap does (and macvtap will
have to do).
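
To illustrate the write-only usage described above, here is a minimal
userspace sketch, not code from the patch. It assumes the device accepts
one complete frame per write call; the helper name inject_frame and the
header/payload split are hypothetical and chosen only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: hand one frame (header + payload) to a
 * write-only packet device in a single writev() call.
 * Returns 0 on success, -1 on error or on a short write.
 */
static int inject_frame(int fd, const void *hdr, size_t hdr_len,
			const void *payload, size_t payload_len)
{
	struct iovec iov[2];
	ssize_t n;

	iov[0].iov_base = (void *)hdr;
	iov[0].iov_len = hdr_len;
	iov[1].iov_base = (void *)payload;
	iov[1].iov_len = payload_len;

	/* Retry only if a signal interrupted the call. */
	do {
		n = writev(fd, iov, 2);
	} while (n < 0 && errno == EINTR);

	if (n < 0)
		return -1;

	/* Assumption: a frame is accepted whole, so a short write is an error. */
	return (size_t)n == hdr_len + payload_len ? 0 : -1;
}
```

Nothing in this path reads from the descriptor, which is the point made
above: injection needs only a write callback.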

> Moreover, it seems weird to have a new type of interface here that
> duplicates tap/macvtap with less functionality. Coming back
> to your original comment, this means that while mpassthru is currently
> not duplicating the actual code from macvtap, it would need to do
> exactly that to get the qemu interface right!
> 
> 	Arnd

I don't think so, see above. Anyway, both can reuse tun.c :)

-- 
MST