* AlacrityVM benchmark numbers updated
@ 2009-08-26 1:01 Gregory Haskins
2009-08-26 10:16 ` Avi Kivity
0 siblings, 1 reply; 5+ messages in thread
From: Gregory Haskins @ 2009-08-26 1:01 UTC
To: alacrityvm-devel
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Michael S. Tsirkin, netdev
We are pleased to announce the availability of the latest networking
benchmark numbers for AlacrityVM. We've made several tweaks to the
original v0.1 release to improve performance. The most notable is a
switch from get_user_pages to switch_mm+copy_[to/from]_user thanks to a
review suggestion from Michael Tsirkin (as well as his patch to
implement it).
This change alone freed up an additional 1.2Gbps, an improvement of just
over 25% from v0.1. Throughput was 4560Mbps before the change and 5708Mbps
after (for 1500 MTU over 10GE). This moves us ever closer to the goal of
native performance under virtualization.
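As a quick check of the arithmetic, here is a minimal sketch that reads the
two throughput figures as Mbps (the only reading consistent with a 10GE link
and a delta of roughly 1.2Gbps). The helper names gain_mbps() and
gain_percent() are made up for illustration and are not part of AlacrityVM.

```c
#include <assert.h>

/* Absolute gain, in Mbps, between two throughput measurements. */
static double gain_mbps(double before_mbps, double after_mbps)
{
    return after_mbps - before_mbps;
}

/* Relative gain as a percentage of the earlier measurement. */
static double gain_percent(double before_mbps, double after_mbps)
{
    assert(before_mbps > 0.0);  /* a zero baseline has no meaningful ratio */
    return 100.0 * (after_mbps - before_mbps) / before_mbps;
}
```

With before = 4560 and after = 5708, the absolute gain is 1148 Mbps (about
1.2Gbps) and the relative gain is a little over 25%.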
These changes will be incorporated into the upcoming v0.2 release. For
more details, please visit our project wiki:
http://developer.novell.com/wiki/index.php/AlacrityVM
Kind Regards,
-Greg
* Re: AlacrityVM benchmark numbers updated
2009-08-26 1:01 AlacrityVM benchmark numbers updated Gregory Haskins
@ 2009-08-26 10:16 ` Avi Kivity
2009-08-26 18:42 ` Gregory Haskins
0 siblings, 1 reply; 5+ messages in thread
From: Avi Kivity @ 2009-08-26 10:16 UTC
To: Gregory Haskins
Cc: alacrityvm-devel, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, Michael S. Tsirkin, netdev
On 08/26/2009 04:01 AM, Gregory Haskins wrote:
> We are pleased to announce the availability of the latest networking
> benchmark numbers for AlacrityVM. We've made several tweaks to the
> original v0.1 release to improve performance. The most notable is a
> switch from get_user_pages to switch_mm+copy_[to/from]_user thanks to a
> review suggestion from Michael Tsirkin (as well as his patch to
> implement it).
>
> This change alone accounted for freeing up an additional 1.2Gbps, which
> is over 25% improvement from v0.1. The previous numbers were 4560Mbps
> before the change, and 5708Mbps after (for 1500mtu over 10GE). This
> moves us ever closer to the goal of native performance under virtualization.
>
Interesting, it's good to see that copy_*_user() works so well. Note
that there's a possible optimization that goes in the opposite direction
- keep using get_user_pages(), but use the dma engine API to perform the
actual copy. I expect that it will only be a win when using tso to
transfer full pages. Large pages may also help.
Copyless tx also wants get_user_pages(). It makes sense to check if
switch_mm() + get_user_pages_fast() gives better performance than
get_user_pages().
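One way to picture the page-granularity point is to count how many pages a
buffer touches, since page pinning works per page. The sketch below is
userspace arithmetic only, not kernel code; the 4 KiB page size, the macro
SKETCH_PAGE_SIZE and the helper pages_spanned() are assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB base pages */

/* Number of pages touched by the byte range [addr, addr + len). */
static size_t pages_spanned(uintptr_t addr, size_t len)
{
    assert(len > 0);  /* an empty range touches no pages */
    uintptr_t first = addr / SKETCH_PAGE_SIZE;
    uintptr_t last = (addr + len - 1) / SKETCH_PAGE_SIZE;
    return (size_t)(last - first + 1);
}
```

Under these assumptions a 1500-byte frame touches one or two pages depending
on its alignment, while a 64 KiB buffer (a plausible large-send size, also an
assumption) touches 16 or 17, so any per-page cost is spread over far more
bytes in the second case.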
--
error compiling committee.c: too many arguments to function
* Re: AlacrityVM benchmark numbers updated
2009-08-26 10:16 ` Avi Kivity
@ 2009-08-26 18:42 ` Gregory Haskins
2009-08-26 19:23 ` Avi Kivity
0 siblings, 1 reply; 5+ messages in thread
From: Gregory Haskins @ 2009-08-26 18:42 UTC
To: Avi Kivity
Cc: alacrityvm-devel, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, Michael S. Tsirkin, netdev
Avi Kivity wrote:
> On 08/26/2009 04:01 AM, Gregory Haskins wrote:
>> We are pleased to announce the availability of the latest networking
>> benchmark numbers for AlacrityVM. We've made several tweaks to the
>> original v0.1 release to improve performance. The most notable is a
>> switch from get_user_pages to switch_mm+copy_[to/from]_user thanks to a
>> review suggestion from Michael Tsirkin (as well as his patch to
>> implement it).
>>
>> This change alone accounted for freeing up an additional 1.2Gbps, which
>> is over 25% improvement from v0.1. The previous numbers were 4560Mbps
>> before the change, and 5708Mbps after (for 1500mtu over 10GE). This
>> moves us ever closer to the goal of native performance under
>> virtualization.
>>
>
> Interesting, it's good to see that copy_*_user() works so well. Note
> that there's a possible optimization that goes in the opposite direction
> - keep using get_user_pages(), but use the dma engine API to perform the
> actual copy. I expect that it will only be a win when using tso to
> transfer full pages. Large pages may also help.
>
> Copyless tx also wants get_user_pages(). It makes sense to check if
> switch_mm() + get_user_pages_fast() gives better performance than
> get_user_pages().
Actually, I have already looked at this and it does indeed seem better to
use switch_mm+gupf() over gup() by quite a large margin. You could then
couple that with your DMA-engine idea to potentially gain even more
benefits (though probably not for networking since most NICs have their
own DMA engine anyway).
Kind Regards,
-Greg
* Re: AlacrityVM benchmark numbers updated
2009-08-26 18:42 ` Gregory Haskins
@ 2009-08-26 19:23 ` Avi Kivity
2009-08-26 20:05 ` Gregory Haskins
0 siblings, 1 reply; 5+ messages in thread
From: Avi Kivity @ 2009-08-26 19:23 UTC
To: Gregory Haskins
Cc: alacrityvm-devel, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, Michael S. Tsirkin, netdev
On 08/26/2009 09:42 PM, Gregory Haskins wrote:
> Actually, I have already looked at this and it does indeed seem better to
> use switch_mm+gupf() over gup() by quite a large margin. You could then
> couple that with your DMA-engine idea to potentially gain even more
> benefits (though probably not for networking since most NICs have their
> own DMA engine anyway).
>
>
For tx, we'll just go copyless once we plumb the destructors properly.
But for rx on a shared interface it is impossible to avoid the copy.
You can only choose if you want it done by the cpu or a local dma engine.
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
* Re: AlacrityVM benchmark numbers updated
2009-08-26 19:23 ` Avi Kivity
@ 2009-08-26 20:05 ` Gregory Haskins
0 siblings, 0 replies; 5+ messages in thread
From: Gregory Haskins @ 2009-08-26 20:05 UTC
To: Avi Kivity
Cc: alacrityvm-devel, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, Michael S. Tsirkin, netdev
Avi Kivity wrote:
> On 08/26/2009 09:42 PM, Gregory Haskins wrote:
>> Actually, I have already looked at this and it does indeed seem better to
>> use switch_mm+gupf() over gup() by quite a large margin. You could then
>> couple that with your DMA-engine idea to potentially gain even more
>> benefits (though probably not for networking since most NICs have their
>> own DMA engine anyway).
>>
>>
>
> For tx, we'll just go copyless once we plumb the destructors properly.
> But for rx on a shared interface it is impossible to avoid the copy.
> You can only choose if you want it done by the cpu or a local dma engine.
>
>
Yep, agree on both counts.
-Greg