Date: Sun, 30 Nov 2008 18:29:24 +0100
From: Andrea Arcangeli
Subject: Re: [Qemu-devel] [RFC 1/2] pci-dma-api-v1
Message-ID: <20081130172924.GB32172@random.random>
References: <20081127123538.GC10348@random.random> <49319C86.8050408@redhat.com>
In-Reply-To: <49319C86.8050408@redhat.com>
To: qemu-devel@nongnu.org

On Sat, Nov 29, 2008 at 09:48:22PM +0200, Avi Kivity wrote:
> Since Andrea's patches contain emulation for aio readv/writev, the
> performance degradation will not occur (though we will not see the
> benefit either).

It will especially not occur if you #define DEBUG_BOUNCE. The way I
emulated bdrv_aio_readv/writev doesn't involve a bounce; it submits the
I/O serially, so it truly runs zerocopy when DEBUG_BOUNCE is not
defined ;). If you enable DEBUG_BOUNCE, the bounce layer is forced on:
the iovec is linearized and the DMA command executed on the hardware is
allowed to be as large as MAX_DMA_BOUNCE_BUFFER, like before. So until
we have a real bdrv_aio_readv/writev, #defining DEBUG_BOUNCE is a must
with cache=off.

> I doubt you can measure malloc overhead with anything less than a
> thousand disks (even there, other overheads are likely to drown that
> malloc).

I also eliminated any sign of malloc in the direct path with a small
cache layer. It caches as many pci dma sg params (each with the max
iovcnt seen so far embedded into it) as the maximum number of
simultaneous in-flight I/O operations seen over the whole runtime,
capped at 10: only if more than 10 sg dma I/O operations are in flight
at once does malloc have to run. Only params whose iov arrays have an
iovcnt < 2048 are cached, so the worst-case RAM waste is limited, and
it auto-tunes at runtime to remain near zero for single-disk setups
etc.
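
To illustrate the idea, here is a rough sketch of that kind of param
cache (made-up names, not the code from the patch): a small fixed pool
of entries, each remembering the largest iovcnt it has been sized for,
with anything beyond 10 in-flight requests falling back to plain
malloc, and oversized iov arrays (iovcnt >= 2048) not kept around.

/* Hypothetical sketch only -- not the pci-dma-api code itself. */
#include <stdlib.h>
#include <sys/uio.h>

#define MAX_CACHED_PARAMS   10
#define MAX_CACHED_IOVCNT   2048

typedef struct DMAParam {
    struct iovec *iov;   /* grows to the max iovcnt seen so far */
    int max_iovcnt;      /* current capacity of the iov array */
    int in_use;
} DMAParam;

static DMAParam param_cache[MAX_CACHED_PARAMS];

static DMAParam *dma_param_get(int iovcnt)
{
    int i;

    for (i = 0; i < MAX_CACHED_PARAMS; i++) {
        DMAParam *p = &param_cache[i];
        if (p->in_use)
            continue;
        if (p->max_iovcnt < iovcnt) {
            /* grow the cached iov array to the new maximum seen */
            struct iovec *iov = realloc(p->iov, iovcnt * sizeof(*iov));
            if (!iov)
                return NULL;
            p->iov = iov;
            p->max_iovcnt = iovcnt;
        }
        p->in_use = 1;
        return p;
    }
    /* more than MAX_CACHED_PARAMS requests in flight: caller mallocs
       a transient param instead */
    return NULL;
}

static void dma_param_put(DMAParam *p)
{
    if (p->max_iovcnt >= MAX_CACHED_IOVCNT) {
        /* don't keep huge iov arrays cached; bounds worst-case waste */
        free(p->iov);
        p->iov = NULL;
        p->max_iovcnt = 0;
    }
    p->in_use = 0;
}

After a short warm-up the pool settles on the steady-state number of
in-flight requests and their largest iovcnt, so the direct path stops
touching malloc entirely in the common single-disk case.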