From: Anthony Liguori
Date: Mon, 01 Dec 2008 10:37:58 -0600
Subject: Re: [Qemu-devel] [RFC 1/1] pci-dma-api-v2
Message-ID: <493412E6.9020308@codemonkey.ws>
In-Reply-To: <4933B13C.6070205@redhat.com>
References: <20081127123538.GC10348@random.random>
 <20081128015602.GA31011@random.random>
 <20081128185001.GD31011@random.random>
 <20081130174133.GC32172@random.random>
 <493318B0.7020506@codemonkey.ws>
 <4933B13C.6070205@redhat.com>
To: qemu-devel@nongnu.org
Cc: Blue Swirl, Andrea Arcangeli

Avi Kivity wrote:
> Anthony Liguori wrote:
>>
>> I think you need something more sophisticated that can split up a
>> scatter/gather list into a set of partial bounce buffers and partial
>> direct copies. Just checking the vector once is a bit hackish.
>>
>
> That code path would never be used or tested. As it is, bouncing will
> only be invoked by dma to or from mmio, and I expect most guests never
> to invoke it. Why would we complicate the code even more?

I'm not sure it's strictly required, but I think it would be a
simplification.

>>
>> What should be fixed? Are these emulated functions wrong?
>>
>> There's a lack of symmetry here. We should have a bdrv_readv and
>> bdrv_aio_readv. bdrv_read and bdrv_aio_read should disappear. We
>> can maintain wrappers that create a compatible interface for older
>> code, but just adding a new API is wrong.
>>
>
> It was understood a real aio readv/writev was being worked on, so the
> emulation could be a temporary step.

Yes, but that's orthogonal to what I'm saying here. I'm saying that
instead of adding an optional ->aio_readv() member, we should eliminate
the ->aio_read() member and replace it with ->aio_readv(). There are
only three or four places in the code that implement aio, so it's not a
big change, and it avoids introducing a new API.

I also think we should have complementary synchronous vector functions,
although I'd like to see the synchronous API disappear completely.
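To sketch what I mean (simplified, invented definitions -- the real
BlockDriver table has more members, and win32 would need a home-grown
iovec since <sys/uio.h> is POSIX-only): drivers only ever implement the
vectored op, and the old single-buffer call becomes a wrapper:

  #include <stdint.h>
  #include <stddef.h>
  #include <sys/uio.h>            /* struct iovec (POSIX) */

  typedef struct BlockDriverAIOCB BlockDriverAIOCB;
  typedef void BlockDriverCompletionFunc(void *opaque, int ret);
  typedef struct BlockDriverState BlockDriverState;

  /* the driver table only carries the vectored op */
  typedef struct BlockDriver {
      BlockDriverAIOCB *(*bdrv_aio_readv)(BlockDriverState *bs,
                                          int64_t sector_num,
                                          struct iovec *iov, int iovcnt,
                                          size_t len,
                                          BlockDriverCompletionFunc *cb,
                                          void *opaque);
  } BlockDriver;

  struct BlockDriverState {
      BlockDriver *drv;           /* assumed layout, for the example */
  };

  /* old callers keep working through a one-element vector; no new
   * API is introduced, the single-buffer form is just a wrapper
   * (512-byte sectors, as in the block layer) */
  BlockDriverAIOCB *bdrv_aio_read(BlockDriverState *bs, int64_t sector_num,
                                  uint8_t *buf, int nb_sectors,
                                  BlockDriverCompletionFunc *cb, void *opaque)
  {
      struct iovec iov = {
          .iov_base = buf,
          .iov_len  = (size_t)nb_sectors * 512,
      };

      /* assumes the driver consumes iov before returning; a real
       * implementation would have to keep the vector alive until
       * the request completes */
      return bs->drv->bdrv_aio_readv(bs, sector_num, &iov, 1,
                                     iov.iov_len, cb, opaque);
  }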
>>> +BlockDriverAIOCB *bdrv_aio_readv(BlockDriverState *bs, int64_t sector_num,
>>> +    struct iovec *iov, int iovcnt, size_t len,
>>> +    BlockDriverCompletionFunc *cb, void *opaque)
>>> +{
>>>
>>
>> struct iovec does not exist on Windows. I also don't think you've
>> got the abstraction right. Reading and writing may require actions
>> to happen. I don't think you can just introduce something as simple
>> as an iovec here. I think we need something more active.
>>
>
> Can you elaborate? Actions happen in the completion callback. This
> is just a straightforward extension of the block API; I don't think a
> dma api should materially change the block api.

If we're not going to try to fold in the IOMMU/PCI bus API at this
pass, then this is less important. But to implement a proper PCI bus
API, I think there have to be function pointers associated with each
element of the scatter/gather list that control how data is copied in
and out of each element (see the sketch at the end of this mail).

>>
>> I think you missed the mark here. This API needs to go through the
>> PCIBus and be pluggable at that level. There can be a default
>> implementation, but it may differ for each PCI bus.
>>
>
> I think this can serve as the default implementation. Perhaps when a
> concrete user of pluggable dma emerges we can understand the
> requirements better.

I think if we drop the pci_ prefixes and just treat it as a basic DMA
API, as Blue Swirl suggested, then this is closer to what we need.

Regards,

Anthony Liguori
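One possible shape for the "more active" element described above -- a
sketch only, with invented names (DMAElement, DMAVector, DMAOps and the
hooks are not existing QEMU interfaces): each scatter/gather element
carries its own copy hooks, and the bus supplies the mapping, so direct
copies, bounce buffers and IOMMU translation can all hide behind the
same interface.

  #include <stdint.h>
  #include <stddef.h>
  #include <string.h>

  typedef uint64_t dma_addr_t;

  typedef struct DMAElement DMAElement;

  struct DMAElement {
      dma_addr_t addr;        /* bus address of this element */
      size_t     len;
      void      *host;        /* direct mapping, when one exists */
      /* per-element hooks decide how bytes move: plain copy,
       * bounce buffer, MMIO dispatch, ... */
      void (*copy_in)(DMAElement *e, size_t off,
                      const void *buf, size_t len);
      void (*copy_out)(DMAElement *e, size_t off,
                       void *buf, size_t len);
      void  *opaque;          /* bus/IOMMU private state */
  };

  typedef struct DMAVector {
      DMAElement *elem;
      int         nelem;
  } DMAVector;

  /* what a bus would register: a default PCI implementation fills
   * the vector with direct-copy hooks, while an IOMMU-aware bus
   * installs translating or bouncing ones instead */
  typedef struct DMAOps {
      int  (*map)(void *bus, DMAVector *dv, dma_addr_t addr,
                  size_t len, int is_write);
      void (*unmap)(void *bus, DMAVector *dv);
  } DMAOps;

  /* the direct-copy hook a simple bus might install */
  static void dma_copy_in_direct(DMAElement *e, size_t off,
                                 const void *buf, size_t len)
  {
      memcpy((uint8_t *)e->host + off, buf, len);
  }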