public inbox for kvm@vger.kernel.org
From: Anthony Liguori <anthony@codemonkey.ws>
To: Avi Kivity <avi@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>, Arnd Bergmann <arnd@arndb.de>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Rusty Russell <rusty@rustcorp.com.au>,
	kvm@vger.kernel.org
Subject: Re: copyless virtio net thoughts?
Date: Thu, 05 Feb 2009 08:25:14 -0600
Message-ID: <498AF6CA.50101@codemonkey.ws>
In-Reply-To: <498ADD73.3060906@redhat.com>

Avi Kivity wrote:
> Chris Wright wrote:
>> There's been a number of different discussions re: getting copyless 
>> virtio
>> net (esp. for KVM).  This is just a poke in that general direction to
>> stir the discussion.  I'm interested to hear current thoughts
>
> I believe that copyless networking is absolutely essential.
>
> For transmit, copyless is needed to properly support sendfile() type 
> workloads - http/ftp/nfs serving.  These are usually high-bandwidth, 
> cache-cold workloads where a copy is most expensive.
>
> For receive, the guest will almost always do an additional copy, but 
> it will most likely do the copy from another cpu.  Xen netchannel2 
> mitigates this somewhat by having the guest request the hypervisor to 
> perform the copy when the rx interrupt is processed, but this may 
> still be too early (the packet may be destined to a process that is on 
> another vcpu), and the extra hypercall is expensive.
>
> In my opinion, it would be ideal to linux-aio enable taps and packet 
> sockets.  io_submit() allows submitting multiple buffers in one 
> syscall and supports scatter/gather.  io_getevents() supports 
> dequeuing multiple packet completions in one syscall.

splice() has some nice properties too.  It decouples the notion of 
moving packets around from actually copying them.  It also fits well 
into a more performant model of inter-guest I/O.  You can't publish 
multiple buffers with splice(), but I don't think we can do that today, 
practically speaking, because of mergeable RX buffers.  You would have to 
extend the linux-aio interface so that you can hand it a bunch of buffers 
and have it tell you where the packet boundaries are.
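
To make the packet-boundary point concrete, here is a minimal sketch in C.
It assumes a batch of equally sized receive buffers is posted in one go and
that a packet may spill across consecutive buffers, as with mergeable RX
buffers.  The names rx_span and assign_buffers and the fixed buffer size
are hypothetical; none of this is part of linux-aio, splice() or virtio.
It only shows the kind of per-packet information a completion would have
to carry back.

```c
#include <stddef.h>

/* Hypothetical completion record: which posted buffers one packet occupies. */
struct rx_span {
    size_t first_buf;  /* index of the first buffer holding the packet */
    size_t num_bufs;   /* number of consecutive buffers the packet spans */
    size_t len;        /* packet length in bytes */
};

/*
 * Lay packets of the given lengths into nbufs posted buffers of buf_size
 * bytes each.  Every packet starts in a fresh buffer and may spill into
 * the following ones.  Returns the number of packets that fit; entries
 * spans[0..ret) describe where each packet begins and ends.
 */
size_t assign_buffers(const size_t *pkt_len, size_t npkts,
                      size_t nbufs, size_t buf_size,
                      struct rx_span *spans)
{
    size_t next = 0;   /* next unused buffer */
    size_t done = 0;   /* packets placed so far */

    if (buf_size == 0)
        return 0;

    for (size_t i = 0; i < npkts; i++) {
        /* Buffers needed, rounded up; an empty packet still takes one. */
        size_t need = pkt_len[i] / buf_size + (pkt_len[i] % buf_size != 0);
        if (need == 0)
            need = 1;

        if (need > nbufs - next)
            break;     /* not enough posted buffers left for this packet */

        spans[done].first_buf = next;
        spans[done].num_bufs = need;
        spans[done].len = pkt_len[i];
        next += need;
        done++;
    }
    return done;
}
```

With 4096-byte buffers, a 9000-byte packet occupies three buffers, so the
caller cannot tell where the next packet starts unless something like
num_bufs is reported alongside each completion.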

Regards,

Anthony Liguori


