From: David Ahern <dsahern@gmail.com>
To: Avi Kivity <avi@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>,
Alex Williamson <alex.williamson@redhat.com>,
KVM mailing list <kvm@vger.kernel.org>
Subject: Re: performance of virtual functions compared to virtio
Date: Wed, 27 Apr 2011 15:13:23 -0600
Message-ID: <4DB886F3.10303@gmail.com>
In-Reply-To: <4DB67FFF.8010909@redhat.com>
On 04/26/11 02:19, Avi Kivity wrote:
> On 04/25/2011 08:49 PM, David Ahern wrote:
>> >
>> > There are several copies.
>> >
>> > qemu's virtio-net implementation incurs a copy on tx and on rx when
>> > calling the kernel; in addition there is also an internal copy:
>> >
>> > /* copy in packet. ugh */
>> > len = iov_from_buf(sg, elem.in_num,
>> >                    buf + offset, size - offset);
>> >
>> > In principle vhost-net can avoid the tx copy, but I think now we have 1
>> > copy on rx and tx each.
>>
>> So there is a copy internal to qemu, then from qemu to the host tap
>> device and then tap device to a physical NIC if the packet is leaving
>> the host?
>
> There is no internal copy on tx, just rx.
>
> So:
>
> virtio-net: 1 internal rx, 1 kernel/user rx, 1 kernel/user tx
> vhost-net: 1 internal rx, 1 internal tx
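
(Trying to make that "internal" rx copy concrete for myself: below is a
rough sketch of what an iov_from_buf()-style scatter copy does -- take a
contiguous buffer and spread it across the guest-supplied sg list. The
helper name and layout are mine, not qemu's actual code.)

    /* Hypothetical sketch of an iov_from_buf()-style copy: scatter a
     * contiguous buffer into a guest-supplied iovec/sg list.  This is
     * where the extra rx copy would happen. */
    #include <stddef.h>
    #include <string.h>
    #include <sys/uio.h>

    size_t copy_buf_to_iov(const struct iovec *iov, int iovcnt,
                           const void *buf, size_t len)
    {
        size_t done = 0;

        for (int i = 0; i < iovcnt && done < len; i++) {
            size_t n = iov[i].iov_len;

            if (n > len - done)
                n = len - done;
            memcpy(iov[i].iov_base, (const char *)buf + done, n); /* the copy */
            done += n;
        }
        return done;
    }
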
Does the following depict where copies are made for virtio-net?
Packet Sends:
.==========================================.
|                   Host                   |
|                                          |
|   .-------------------------------.      |
|   |        qemu-kvm process       |      |
|   |                               |      |
|   |  .-------------------------.  |      |
|   |  |        Guest OS         |  |      |
|   |  |        ---------        |  |      |
|   |  |       ( netperf )       |  |      |
|   |  |        ---------        |  |      |
|   |  |  user                   |  |      |
|   |  |-------------------------|  |      |
|   |  |  kernel                 |  |      |
|   |  |        .-----------.    |  |      |
|   |  |        | TCP stack |    copy data from userspace to guest-kernel skb
|   |  |        '-----------'    |  |      |
|   |  |              |          |  |      |
|   |  |         .--------.      |  |      |
|   |  |         | virtio |      passes skb pointers to virtio device
|   |  |         | (eth0) |      |  |      |
|   |  '---------'--------'------'  |      |
|   |                 |             |      |
|   |           .------------.      |      |
|   |           | virtio-net |      converts buffer addresses from
|   |           |   device   |      guest virtual to process (qemu) virtual?
|   |           '------------'      |      |
|   |                 |             |      |
|   '-------------------------------'      |
|                     |                    |
|  userspace          |                    |
|------------------------------------------|
|  kernel             |                    |
|                  .------.                |
|                  | tap0 |     data copied from userspace
|                  '------'     to host kernel skbs
|                     |                    |
|                  .------.                |
|                  |  br  |                |
|                  '------'                |
|                     |                    |
|                  .------.                |
|                  | eth0 |     skbs sent to device for xmit
'=========================================='
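
(The tx-side kernel/user copy, as I understand it, is qemu writing the
frame to the tap fd, at which point the host kernel copies the data into
an skb. A simplified, hypothetical sketch of just that step -- not
qemu's actual tap code:)

    /* Hypothetical sketch: open a tap device and push one frame into it.
     * The writev() is the point where the host kernel copies the data
     * from these userspace buffers into a freshly allocated skb. */
    #include <fcntl.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/uio.h>
    #include <unistd.h>
    #include <linux/if.h>
    #include <linux/if_tun.h>

    int open_tap(const char *name)
    {
        struct ifreq ifr;
        int fd = open("/dev/net/tun", O_RDWR);

        if (fd < 0)
            return -1;
        memset(&ifr, 0, sizeof(ifr));
        ifr.ifr_flags = IFF_TAP | IFF_NO_PI;     /* raw frames, no extra header */
        strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);
        if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
            close(fd);
            return -1;
        }
        return fd;
    }

    /* iov describes the frame as already mapped from guest memory */
    ssize_t send_frame(int tapfd, const struct iovec *iov, int iovcnt)
    {
        return writev(tapfd, iov, iovcnt);       /* copy into a host skb here */
    }
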
Packet Receives:
.==========================================.
|                   Host                   |
|                                          |
|   .-------------------------------.      |
|   |        qemu-kvm process       |      |
|   |                               |      |
|   |  .-------------------------.  |      |
|   |  |        Guest OS         |  |      |
|   |  |        ---------        |  |      |
|   |  |       ( netperf )       |  |      |
|   |  |        ---------        |  |      |
|   |  |  user                   |  |      |
|   |  |-------------------------|  |      |
|   |  |  kernel                 |  data copied from skb to userspace buf
|   |  |        .-----------.    |  |      |
|   |  |        | TCP stack |    skb attached to socket
|   |  |        '-----------'    |  |      |
|   |  |              |          |  |      |
|   |  |         .--------.      |  |      |
|   |  |         | virtio |      put skb onto net queue
|   |  |         | (eth0) |      |  |      |
|   |  '---------'--------'------'  |      |
|   |                 |    copy here into device's mapped skb?
|   |                 |    this is the extra "internal" copy?
|   |           .------------.      |      |
|   |           | virtio-net |      data copied from host
|   |           |   device   |      kernel to qemu process
|   |           '------------'      |      |
|   |                 |             |      |
|   '-------------------------------'      |
|                     |                    |
|  userspace          |                    |
|------------------------------------------|
|  kernel             |                    |
|                  .------.                |
|                  | tap0 |     skbs attached to tap device
|                  '------'                |
|                     |                    |
|                  .------.                |
|                  |  br  |                |
|                  '------'                |
|                     |                    |
|                  .------.                |
|                  | eth0 |     device writes data into mapped skbs
'=========================================='
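
(And the rx side with userspace virtio-net, as I understand it, has two
copies: read() on the tap fd pulls the packet from the host kernel into
a temporary buffer in qemu, and that buffer is then scattered into the
guest's rx buffers -- the extra "internal" copy. Another hypothetical
sketch, reusing the copy_buf_to_iov() helper sketched above:)

    /* Hypothetical sketch of the two rx-side copies:
     *   1) read() copies the packet from the host kernel into 'bounce'
     *   2) copy_buf_to_iov() scatters 'bounce' into the guest's rx
     *      descriptors -- the extra "internal" copy */
    #include <stddef.h>
    #include <sys/uio.h>
    #include <unistd.h>

    size_t copy_buf_to_iov(const struct iovec *iov, int iovcnt,
                           const void *buf, size_t len);   /* from sketch above */

    ssize_t receive_frame(int tapfd, const struct iovec *guest_rx, int iovcnt)
    {
        static unsigned char bounce[65536];           /* copy #1 lands here */
        ssize_t n = read(tapfd, bounce, sizeof(bounce));

        if (n <= 0)
            return n;
        return copy_buf_to_iov(guest_rx, iovcnt, bounce, (size_t)n);  /* copy #2 */
    }
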
David
>
>> Is that what the zero-copy patch set is attempting - bypassing the
>> transmit copy to the macvtap device?
>
> Yes.
>
>> >
>> > If a host interface is dedicated to backing a vhost-net interface (say
>> > if you have an SR/IOV card) then you can in principle avoid the rx copy
>> > as well.
>> >
>> > An alternative to avoiding the copies is to use a dma engine, like I
>> > mentioned.
>> >
>>
>> How does the DMA engine differ from the zero-copy patch set?
>
> The DMA engine does not avoid the copy, it merely uses a device other
> than the cpu to perform it. It offloads the cpu but still loads the
> interconnect. True zero-copy avoids both the cpu load and the
> interconnect load.
>