From mboxrd@z Thu Jan  1 00:00:00 1970
From: David Ahern
Subject: Re: performance of virtual functions compared to virtio
Date: Wed, 27 Apr 2011 15:13:23 -0600
Message-ID: <4DB886F3.10303@gmail.com>
References: <4DAF8EF0.8010203@gmail.com>
 <1303353349.3110.181.camel@x201>
 <4DAFE5BE.1070506@redhat.com>
 <4DB02C9F.2050901@redhat.com>
 <4DB5B436.4060000@gmail.com>
 <4DB67FFF.8010909@redhat.com>
In-Reply-To: <4DB67FFF.8010909@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
To: Avi Kivity
Cc: Stefan Hajnoczi, Alex Williamson, KVM mailing list
Sender: kvm-owner@vger.kernel.org

On 04/26/11 02:19, Avi Kivity wrote:
> On 04/25/2011 08:49 PM, David Ahern wrote:
>> >
>> > There are several copies.
>> >
>> > qemu's virtio-net implementation incurs a copy on tx and on rx when
>> > calling the kernel; in addition there is also an internal copy:
>> >
>> >     /* copy in packet.  ugh */
>> >     len = iov_from_buf(sg, elem.in_num,
>> >                        buf + offset, size - offset);
>> >
>> > In principle vhost-net can avoid the tx copy, but I think now we
>> > have 1 copy on rx and tx each.
>>
>> So there is a copy internal to qemu, then from qemu to the host tap
>> device and then tap device to a physical NIC if the packet is leaving
>> the host?
>
> There is no internal copy on tx, just rx.
>
> So:
>
>   virtio-net: 1 internal rx, 1 kernel/user rx, 1 kernel/user tx
>   vhost-net:  1 internal rx, 1 internal tx

Does the following depict where copies are done for virtio-net?

Packet Sends:

.==========================================.
| Host                                     |
|                                          |
|   .-------------------------------.      |
|   | qemu-kvm process              |      |
|   |                               |      |
|   |  .-------------------------.  |      |
|   |  | Guest OS                |  |      |
|   |  |         ---------       |  |      |
|   |  |        ( netperf )      |  |      |
|   |  |         ---------       |  |      |
|   |  |            user         |  |      |
|   |  |-------------------------|  |      |
|   |  |           kernel        |  |      |
|   |  |       .-----------.     |  |      |
|   |  |       | TCP stack |  copy data from uspace to VM-based skb
|   |  |       '-----------'     |  |      |
|   |  |             |           |  |      |
|   |  |         .--------.      |  |      |
|   |  |         | virtio |  passes skb pointers to virtio device
|   |  |         | (eth0) |      |  |      |
|   |  '---------'--------'------'  |      |
|   |                |              |      |
|   |          .------------.       |      |
|   |          | virtio-net |  convert buffer addresses from
|   |          | device     |  guest virtual to process (qemu)?
|   |          '------------'       |      |
|   |                |              |      |
|   '-------------------------------'      |
|                    |                     |
| userspace          |                     |
|------------------------------------------|
| kernel             |                     |
|                 .------.                 |
|                 | tap0 |  data copied from userspace
|                 '------'  to host kernel skbs
|                    |                     |
|                 .------.                 |
|                 |  br  |                 |
|                 '------'                 |
|                    |                     |
|                 .------.                 |
|                 | eth0 |  skbs sent to device for xmit
'=========================================='

Packet Receives:

.==========================================.
| Host                                     |
|                                          |
|   .-------------------------------.      |
|   | qemu-kvm process              |      |
|   |                               |      |
|   |  .-------------------------.  |      |
|   |  | Guest OS                |  |      |
|   |  |         ---------       |  |      |
|   |  |        ( netperf )      |  |      |
|   |  |         ---------       |  |      |
|   |  |            user         |  |      |
|   |  |-------------------------|  |      |
|   |  |           kernel  data copied from skb to userspace buf
|   |  |       .-----------.     |  |      |
|   |  |       | TCP stack |  skb attached to socket
|   |  |       '-----------'     |  |      |
|   |  |             |           |  |      |
|   |  |         .--------.      |  |      |
|   |  |         | virtio |  put skb onto net queue
|   |  |         | (eth0) |      |  |      |
|   |  '---------'--------'------'  |      |
|   |                |  copy here into device's mapped skb?
|   |                |  this is the extra "internal" copy?
|   |          .------------.       |      |
|   |          | virtio-net |  data copied from host
|   |          | device     |  kernel to qemu process
|   |          '------------'       |      |
|   |                |              |      |
|   '-------------------------------'      |
|                    |                     |
| userspace          |                     |
|------------------------------------------|
| kernel             |                     |
|                 .------.                 |
|                 | tap0 |  skbs attached to tap device
|                 '------'                 |
|                    |                     |
|                 .------.                 |
|                 |  br  |                 |
|                 '------'                 |
|                    |                     |
|                 .------.                 |
|                 | eth0 |  device writes data into mapped skbs
'=========================================='

David

>
>> Is that what the zero-copy patch set is attempting - bypassing the
>> transmit copy to the macvtap device?
>
> Yes.
>
>> >
>> > If a host interface is dedicated to backing a vhost-net interface (say
>> > if you have an SR/IOV card) then you can in principle avoid the rx
>> > copy as well.
>> >
>> > An alternative to avoiding the copies is to use a dma engine, like I
>> > mentioned.
>> >
>>
>> How does the DMA engine differ from the zero-copy patch set?
>
> The DMA engine does not avoid the copy, it merely uses a device other
> than the cpu to perform it.  It offloads the cpu but still loads the
> interconnect.  True zero-copy avoids both the cpu load and the
> interconnect load.
>
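
For reference, the "internal" rx copy quoted above (the iov_from_buf() call) is a copy from one flat buffer into a scatter-gather list. The following is only a minimal sketch of that kind of copy, assuming POSIX struct iovec from <sys/uio.h>. The name copy_buf_to_iov is hypothetical and this is not qemu's actual implementation; it just shows where the per-packet memcpy cost comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>   /* struct iovec */

/*
 * Hypothetical helper (NOT qemu's iov_from_buf): copy up to `len` bytes
 * from the flat buffer `buf` into the scatter-gather list `iov`, filling
 * the elements in order.  Returns the number of bytes actually copied,
 * which is less than `len` when the list is too small to hold the packet.
 */
static size_t copy_buf_to_iov(const struct iovec *iov, unsigned int iov_cnt,
                              const void *buf, size_t len)
{
    const unsigned char *src = buf;
    size_t done = 0;
    unsigned int i;

    assert(iov != NULL || iov_cnt == 0);

    for (i = 0; i < iov_cnt && done < len; i++) {
        size_t chunk = len - done;

        if (chunk > iov[i].iov_len)
            chunk = iov[i].iov_len;
        if (chunk == 0)
            continue;           /* skip empty elements */

        /* This memcpy is the extra per-packet copy under discussion. */
        memcpy(iov[i].iov_base, src + done, chunk);
        done += chunk;
    }
    return done;
}
```

A zero-copy path would hand the original pages to the consumer instead of running this loop, and a DMA engine would run the same copy on a device other than the cpu.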