From: "Blue Swirl"
Date: Thu, 27 Nov 2008 21:14:45 +0200
Subject: Re: [Qemu-devel] [RFC 1/2] pci-dma-api-v1
In-Reply-To: <20081127123538.GC10348@random.random>
To: qemu-devel@nongnu.org

On 11/27/08, Andrea Arcangeli wrote:
> Hello everyone,
>
> One major limitation for KVM today is the lack of a proper way to write
> drivers so that the host OS can use direct DMA to the guest physical
> memory and avoid any intermediate copy. The only API provided to drivers
> seems to be cpu_physical_memory_rw, and that forces all drivers to
> bounce, trash CPU caches and be memory bound. This new DMA API instead
> lets drivers use a pci_dma_sg method for SG I/O that translates the
> guest physical addresses to host virtual addresses and calls two
> operations: a submit method and a complete method. pci_dma_sg may have
> to bounce buffer internally, and to limit the maximum bounce size it may
> have to submit the I/O in pieces with multiple submit calls. The patch
> adapts the ide.c HD driver to use this. Once the cdrom is converted too,
> dma_buf_rw can be eliminated. As you can see, the new ide_dma_submit and
> ide_dma_complete code is much more readable than the previous rearming
> callback.
>
> This is only tested with KVM so far, but qemu builds; in general there's
> nothing kvm-specific here (with the exception of a single kvm_enabled),
> so it should all work well for both.
>
> All we care about is the performance of the direct path, so I tried to
> avoid dynamic allocations there to avoid entering glibc. The current
> logic doesn't satisfy me yet, but it should at least be faster than
> calling malloc (I'm still working on it to avoid memory waste and to
> detect when more than one iov should be cached). In case of
> instabilities I recommend first setting MAX_IOVEC_IOVCNT to 0 to disable
> that logic ;). I also recommend testing with DEBUG_BOUNCE and with a 512
> max bounce buffer. It's running stable in all modes so far. However, if
> ide.c ends up calling aio_cancel things will likely fall apart, but this
> is all because of bdrv_aio_readv/writev, and the astonishing lack of
> aio_readv/writev in glibc!
>
> Once we finish fixing storage performance with a real
> bdrv_aio_readv/writev (now a blocker issue), a pci_dma_single can be
> added for zero copy networking (one NIC per VM, or VMDq, IOV etc.).
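Purely to illustrate the shape of the submit/complete flow described above,
here is a minimal standalone sketch. This is not the actual patch: names
like dma_sg_submit, DMASubmit and guest_phys_to_host are made up, and guest
RAM is modelled as one flat host buffer.

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

#define GUEST_RAM_SIZE (1 << 20)
static uint8_t guest_ram[GUEST_RAM_SIZE];

typedef struct {
    uint64_t guest_addr;        /* guest physical address */
    size_t   len;
} DMASegment;

/* The driver supplies a submit callback that does the actual I/O on
 * host-virtual iovecs, and a complete callback run afterwards. */
typedef int  (*DMASubmit)(struct iovec *iov, int iovcnt, void *opaque);
typedef void (*DMAComplete)(int ret, void *opaque);

/* Translate one guest physical address to a host virtual pointer.
 * A real implementation would consult the memory map and fall back to
 * a bounce buffer for regions that cannot be mapped directly. */
static void *guest_phys_to_host(uint64_t gpa, size_t len)
{
    if (gpa + len > GUEST_RAM_SIZE) {
        return NULL;            /* would bounce here */
    }
    return guest_ram + gpa;
}

/* Map each segment, hand the resulting iovec to the submit callback,
 * then signal completion.  A real helper might split the request into
 * several submit calls to cap the bounce buffer size. */
static void dma_sg_submit(DMASegment *sg, int nseg,
                          DMASubmit submit, DMAComplete complete,
                          void *opaque)
{
    struct iovec iov[16];
    int i, ret;

    if (nseg > 16) {
        complete(-1, opaque);
        return;
    }
    for (i = 0; i < nseg; i++) {
        iov[i].iov_base = guest_phys_to_host(sg[i].guest_addr, sg[i].len);
        iov[i].iov_len  = sg[i].len;
        if (!iov[i].iov_base) {
            complete(-1, opaque);
            return;
        }
    }
    ret = submit(iov, nseg, opaque);
    complete(ret, opaque);
}

/* Toy "device model": write the mapped guest memory to stdout. */
static int demo_submit(struct iovec *iov, int iovcnt, void *opaque)
{
    (void)opaque;
    return (int)writev(STDOUT_FILENO, iov, iovcnt);
}

static void demo_complete(int ret, void *opaque)
{
    (void)opaque;
    fprintf(stderr, "dma complete, ret=%d\n", ret);
}

int main(void)
{
    DMASegment sg[2] = { { 0, 6 }, { 64, 6 } };

    memcpy(guest_ram + 0,  "hello ", 6);
    memcpy(guest_ram + 64, "world\n", 6);
    dma_sg_submit(sg, 2, demo_submit, demo_complete, NULL);
    return 0;
}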
The DMA API should allow for that (zero-copy networking) too.

The previous similar attempt by Anthony at generic DMA using vectored I/O
was abandoned because the malloc/free overhead was greater than the
performance gain. Have you made any performance measurements? How does
this version compare to the previous ones?

I think the pci_ prefix can be removed, since there is little that is
PCI-specific.

For the Sparc32 IOMMU (and probably other IOMMUs), it should be possible
to register a function that is used in place of cpu_physical_memory_rw,
c_p_m_can_dma etc. The goal is that it should be possible to stack the
DMA resolvers (think of devices behind a number of buses).
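Very roughly, I imagine the stacking looking something like this. All the
names (DMAResolver, dma_resolve, ...) are invented here; it is only a
sketch of the shape, not existing code.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef struct DMAResolver DMAResolver;
struct DMAResolver {
    /* Translate addr at this layer and return the address to hand to
     * the parent (an IOMMU would walk its page tables here). */
    uint64_t (*translate)(DMAResolver *r, uint64_t addr);
    DMAResolver *parent;        /* next layer up, NULL = guest physical */
    void *opaque;
};

/* Walk the chain, e.g. device -> PCI bus -> IOMMU -> guest RAM. */
static uint64_t dma_resolve(DMAResolver *r, uint64_t addr)
{
    while (r) {
        addr = r->translate(r, addr);
        r = r->parent;
    }
    return addr;                /* now a guest physical address */
}

/* Example layer: applies a fixed offset, standing in for a real
 * translation such as the Sparc32 IOMMU page table walk. */
static uint64_t offset_translate(DMAResolver *r, uint64_t addr)
{
    return addr + *(uint64_t *)r->opaque;
}

int main(void)
{
    uint64_t bus_off = 0, iommu_off = 0x40000000;
    DMAResolver iommu = { offset_translate, NULL,   &iommu_off };
    DMAResolver bus   = { offset_translate, &iommu, &bus_off };

    printf("device address 0x1000 -> guest physical 0x%" PRIx64 "\n",
           dma_resolve(&bus, 0x1000));
    return 0;
}

The idea would be that each bus or IOMMU layer registers such a resolver,
so cpu_physical_memory_rw is only reached once the address has been
resolved all the way down to guest physical memory.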