* Re: copyless virtio net thoughts?
From: Herbert Xu @ 2009-02-06  5:40 UTC
To: Avi Kivity; +Cc: Chris Wright, Arnd Bergmann, Rusty Russell, kvm, netdev

On Thu, Feb 05, 2009 at 02:37:07PM +0200, Avi Kivity wrote:
>
> I believe that copyless networking is absolutely essential.

I used to think it was important, but I'm now of the opinion
that it's quite useless for virtualisation as it stands.

> For transmit, copyless is needed to properly support sendfile() type
> workloads - http/ftp/nfs serving. These are usually high-bandwidth,
> cache-cold workloads where a copy is most expensive.

This is totally true for baremetal, but useless for virtualisation
right now because the block layer is not zero-copy. That is, the
data is going to be cache hot anyway so zero-copy networking doesn't
buy you much at all.

Please also recall that for the time being, block speeds are
way slower than network speeds. So the really interesting case
is actually network-to-network transfers. Again due to the
RX copy this is going to be cache hot.

> For receive, the guest will almost always do an additional copy, but it
> will most likely do the copy from another cpu. Xen netchannel2

That's what we should strive to avoid. The best scenario with
modern 10GbE NICs is to stay on one CPU if at all possible. The
NIC will pick a CPU when it delivers the packet into one of the
RX queues and we should stick with it for as long as possible.

So what I'd like to see next in virtualised networking is virtual
multiqueue support in guest drivers. No I'm not talking about
making one or more of the physical RX/TX queues available to the
guest (aka passthrough), but actually turning something like the
virtio-net interface into a multiqueue interface.

This is the best way to get cache locality and minimise CPU waste.

So I'm certainly not rushing out to do any zero-copy virtual
networking. However, I would like to start working on a virtual
multiqueue NIC interface.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
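
The point about staying on the CPU the NIC picked can be illustrated with a small sketch. It assumes the RX queue is chosen by hashing the flow 4-tuple, in the spirit of receive-side scaling; `pick_rx_queue` is a hypothetical name, CRC32 is a stand-in for the Toeplitz hash that real hardware typically uses, and nothing here is virtio-net's actual interface.

```python
import zlib


def pick_rx_queue(src_ip, src_port, dst_ip, dst_port, num_queues):
    """Map a flow 4-tuple to an RX queue index (hypothetical helper).

    Every packet of the same flow hashes to the same queue, so if each
    queue is serviced by one CPU, the flow stays on that CPU.
    CRC32 is only a stand-in for a real NIC hash function.
    """
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_queues


# Two packets of the same flow always land on the same queue.
first = pick_rx_queue("10.0.0.1", 40000, "10.0.0.2", 80, num_queues=4)
second = pick_rx_queue("10.0.0.1", 40000, "10.0.0.2", 80, num_queues=4)
print(first == second)
```

If a virtual multiqueue interface exposed the same queue index to the guest, the guest could service queue i on a fixed vCPU, which is the cache-locality argument made above. That mapping is an assumption of this sketch, not something the message specifies.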
* Re: copyless virtio net thoughts?
From: Avi Kivity @ 2009-02-06  8:46 UTC
To: Herbert Xu; +Cc: Chris Wright, Arnd Bergmann, Rusty Russell, kvm, netdev

Herbert Xu wrote:
> On Thu, Feb 05, 2009 at 02:37:07PM +0200, Avi Kivity wrote:
>
>> I believe that copyless networking is absolutely essential.
>
> I used to think it was important, but I'm now of the opinion
> that it's quite useless for virtualisation as it stands.
>
>> For transmit, copyless is needed to properly support sendfile() type
>> workloads - http/ftp/nfs serving. These are usually high-bandwidth,
>> cache-cold workloads where a copy is most expensive.
>
> This is totally true for baremetal, but useless for virtualisation
> right now because the block layer is not zero-copy. That is, the
> data is going to be cache hot anyway so zero-copy networking doesn't
> buy you much at all.

The guest's block layer is copyless. The host block layer is -><- this
far from being copyless -- all we need is preadv()/pwritev() or to
replace our thread pool implementation in qemu with linux-aio.
Everything else is copyless.

Since we are actively working on this, expect this limitation to
disappear soon.

(even if it doesn't, the effect of block layer copies is multiplied by
the cache miss percentage which can be quite low for many workloads; but
again, we're not building on that)

> Please also recall that for the time being, block speeds are
> way slower than network speeds. So the really interesting case
> is actually network-to-network transfers. Again due to the
> RX copy this is going to be cache hot.

Block speeds are not way slower. We're at 4Gb/sec for Fibre and 10Gb/s
for networking. With dual channels or a decent cache hit rate they're
evenly matched.

>> For receive, the guest will almost always do an additional copy, but it
>> will most likely do the copy from another cpu. Xen netchannel2
>
> That's what we should strive to avoid. The best scenario with
> modern 10GbE NICs is to stay on one CPU if at all possible. The
> NIC will pick a CPU when it delivers the packet into one of the
> RX queues and we should stick with it for as long as possible.
>
> So what I'd like to see next in virtualised networking is virtual
> multiqueue support in guest drivers. No I'm not talking about
> making one or more of the physical RX/TX queues available to the
> guest (aka passthrough), but actually turning something like the
> virtio-net interface into a multiqueue interface.

I support this, but it should be in addition to copylessness, not on its
own.

- many guests will not support multiqueue

- for some threaded workloads, you cannot predict where the final read()
  will come from; this renders multiqueue ineffective for keeping cache
  locality

- usually you want virtio to transfer large amounts of data; but if you
  want your copies to be cache-hot, you need to limit transfers to half
  the cache size (a quarter if hyperthreading); this limits virtio
  effectiveness

-- 
I have a truly marvellous patch that fixes the bug which this signature
is too narrow to contain.
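
The preadv()/pwritev() calls mentioned above take a list of buffers and a file offset, so data can move between a file and several separate memory regions in one call without first being gathered into a bounce buffer. A minimal sketch, assuming a platform where Python exposes these calls as `os.preadv` and `os.pwritev`; `scatter_gather_roundtrip` is a hypothetical helper and this is not qemu's code.

```python
import os
import tempfile


def scatter_gather_roundtrip(chunks, offset=0):
    """Write several separate buffers with one pwritev() call, then read
    them back into separate buffers with one preadv() call.

    Returns (bytes_written, bytes_read, list_of_buffers_read).
    """
    fd, path = tempfile.mkstemp()
    try:
        # Gather: each chunk is written in order, starting at `offset`.
        written = os.pwritev(fd, chunks, offset)
        # Scatter: the data is read back into pre-sized target buffers.
        targets = [bytearray(len(c)) for c in chunks]
        read = os.preadv(fd, targets, offset)
        return written, read, [bytes(t) for t in targets]
    finally:
        os.close(fd)
        os.unlink(path)


written, read, parts = scatter_gather_roundtrip([b"virtio", b"-", b"net"], offset=4096)
print(written, read, parts)
```

The buffers stand in for the scattered guest memory pages a request would describe; in this sketch they are ordinary Python byte buffers.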
* Re: copyless virtio net thoughts?
From: Herbert Xu @ 2009-02-06  9:19 UTC
To: Avi Kivity; +Cc: Chris Wright, Arnd Bergmann, Rusty Russell, kvm, netdev

On Fri, Feb 06, 2009 at 10:46:37AM +0200, Avi Kivity wrote:
>
> The guest's block layer is copyless. The host block layer is -><- this
> far from being copyless -- all we need is preadv()/pwritev() or to
> replace our thread pool implementation in qemu with linux-aio.
> Everything else is copyless.
>
> Since we are actively working on this, expect this limitation to
> disappear soon.

Great, when that happens I'll promise to revisit zero-copy transmit :)

> I support this, but it should be in addition to copylessness, not on its
> own.

I was talking about it in the context of zero-copy receive, where
you mentioned that the virtio/kvm copy may not occur on the CPU of
the guest's copy.

My point is that using multiqueue you can avoid this change of CPU.

But yeah I think zero-copy receive is much more useful than
zero-copy transmit at the moment. Although I'd prefer to wait for
you guys to finish the block layer work before contemplating
pushing the copy on receive into the guest :)

> - many guests will not support multiqueue

Well, these guests will suck both on baremetal and in virtualisation,
big deal :) Multiqueue at 10GbE speeds and above is simply not an
optional feature.

> - for some threaded workloads, you cannot predict where the final read()
> will come from; this renders multiqueue ineffective for keeping cache
> locality
>
> - usually you want virtio to transfer large amounts of data; but if you
> want your copies to be cache-hot, you need to limit transfers to half
> the cache size (a quarter if hyperthreading); this limits virtio
> effectiveness

Agreed on both counts.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
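
The transfer-size limit agreed on above is simple arithmetic: half the cache size, or a quarter with hyperthreading. A minimal sketch of that rule; `max_cache_hot_transfer` is a hypothetical helper and the 4 MiB cache is an assumed example figure, not a number from the thread.

```python
def max_cache_hot_transfer(cache_bytes, hyperthreading=False):
    """Largest transfer that keeps the copy cache-hot under the rule
    quoted above: cache/2, or cache/4 when hyperthreading is in use."""
    divisor = 4 if hyperthreading else 2
    return cache_bytes // divisor


cache = 4 * 1024 * 1024  # assumed 4 MiB cache, for illustration only
print(max_cache_hot_transfer(cache))
print(max_cache_hot_transfer(cache, hyperthreading=True))
```

For the assumed 4 MiB cache this gives 2 MiB per transfer, or 1 MiB with hyperthreading, which shows how quickly the limit cuts into the large transfers virtio would otherwise prefer.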
* Re: copyless virtio net thoughts?
From: Avi Kivity @ 2009-02-06 14:55 UTC
To: Herbert Xu; +Cc: Chris Wright, Arnd Bergmann, Rusty Russell, kvm, netdev

Herbert Xu wrote:
> On Fri, Feb 06, 2009 at 10:46:37AM +0200, Avi Kivity wrote:
>
>> The guest's block layer is copyless. The host block layer is -><- this
>> far from being copyless -- all we need is preadv()/pwritev() or to
>> replace our thread pool implementation in qemu with linux-aio.
>> Everything else is copyless.
>>
>> Since we are actively working on this, expect this limitation to
>> disappear soon.
>
> Great, when that happens I'll promise to revisit zero-copy transmit :)

I was hoping to get some concurrency here, but okay.

>> I support this, but it should be in addition to copylessness, not on its
>> own.
>
> I was talking about it in the context of zero-copy receive, where
> you mentioned that the virtio/kvm copy may not occur on the CPU of
> the guest's copy.
>
> My point is that using multiqueue you can avoid this change of CPU.
>
> But yeah I think zero-copy receive is much more useful than
> zero-copy transmit at the moment. Although I'd prefer to wait for
> you guys to finish the block layer work before contemplating
> pushing the copy on receive into the guest :)

We'll get the block layer done soon, so it won't be a barrier.

>> - many guests will not support multiqueue
>
> Well, these guests will suck both on baremetal and in virtualisation,
> big deal :) Multiqueue at 10GbE speeds and above is simply not an
> optional feature.

Each guest may only use a part of the 10Gb/s bandwidth, if you have 10
guests each using 1Gb/s, then we should be able to support this without
multiqueue in the guests.

-- 
I have a truly marvellous patch that fixes the bug which this signature
is too narrow to contain.
* Re: copyless virtio net thoughts?
From: Arnd Bergmann @ 2009-02-07 11:56 UTC
To: Avi Kivity; +Cc: Herbert Xu, Chris Wright, Rusty Russell, kvm, netdev

On Friday 06 February 2009, Avi Kivity wrote:
> > Well, these guests will suck both on baremetal and in virtualisation,
> > big deal :) Multiqueue at 10GbE speeds and above is simply not an
> > optional feature.
>
> Each guest may only use a part of the 10Gb/s bandwidth, if you have 10
> guests each using 1Gb/s, then we should be able to support this without
> multiqueue in the guests.

I would expect that even people with 10 simultaneous guests would like
to be able to saturate the link when only one or two of them are doing
much traffic on the interface.

Having the load spread evenly over all guests sounds like a much rarer
use case.

	Arnd <><
* Re: copyless virtio net thoughts?
From: David Miller @ 2009-02-08  3:01 UTC
To: arnd; +Cc: avi, herbert, chrisw, rusty, kvm, netdev

From: Arnd Bergmann <arnd@arndb.de>
Date: Sat, 7 Feb 2009 12:56:06 +0100

> Having the load spread evenly over all guests sounds like a much rarer
> use case.

Totally agreed.
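
The two scenarios in this exchange (ten evenly loaded guests versus one or two busy guests) can be compared with back-of-envelope arithmetic. In this sketch the 10 Gb/s link is from the thread, but the 2.5 Gb/s that a single queue on one CPU can sustain is a purely hypothetical figure chosen for illustration, and both helper names are made up.

```python
import math


def per_guest_rate_gbps(link_gbps, active_guests):
    """Rate each active guest must reach for the guests to saturate the link."""
    return link_gbps / active_guests


def queues_needed(target_gbps, per_queue_gbps):
    """Queues a guest needs if one queue sustains `per_queue_gbps` (assumed)."""
    return math.ceil(target_gbps / per_queue_gbps)


LINK_GBPS = 10.0       # from the thread
PER_QUEUE_GBPS = 2.5   # hypothetical single-queue capacity

for active in (10, 2, 1):
    rate = per_guest_rate_gbps(LINK_GBPS, active)
    print(active, rate, queues_needed(rate, PER_QUEUE_GBPS))
```

Under these assumptions a single queue per guest is enough only when the load is spread across all ten guests; with one or two active guests, each needs several queues to fill the link.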