Date: Sun, 14 Dec 2008 17:47:52 +0100
From: Andrea Arcangeli
To: Anthony Liguori
Cc: chrisw@redhat.com, avi@redhat.com, Gerd Hoffmann, kvm@vger.kernel.org, qemu-devel@nongnu.org
Subject: [Qemu-devel] Re: [PATCH 2 of 5] add can_dma/post_dma for direct IO
Message-ID: <20081214164751.GF30537@random.random>
In-Reply-To: <4944251D.8080109@codemonkey.ws>
References: <4942B841.6010900@codemonkey.ws> <20081213143944.GD30537@random.random> <4943E6F9.1050001@codemonkey.ws> <20081213165306.GE30537@random.random> <4944251D.8080109@codemonkey.ws>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org

On Sat, Dec 13, 2008 at 03:11:57PM -0600, Anthony Liguori wrote:
> cause an overflow. You will naturally validate this in the map() function
> because you cannot map something that is greater than can fit in a void *.

When you told me to pass ram_addr_t instead of size_t in my patch, I didn't understand it to be just for validating that callers would comply with the clear dma interface.

With my patch I was going to truly support dma operations larger than 4G on a 32bit host with a 64bit guest, but only with mmio regions as the destination, and with an overhead capped at the max-bounce-size of 1M.

To me map/unmap looks backwards. There's absolutely no point in pretending that RAM isn't always mapped. Furthermore, bouncing inside that layer (at least with the api you're proposing, which can't handle partial I/O and restart) is an obviously broken design. Once memory hotplug emerges, we just have to take a read-write lock before invoking can-dma/post-dma and so on. There's no reason to ever call anything after a read dma has completed.

Once my stuff works, my next step would be to get rid entirely of the per-page array that translates a ram_addr_t to a virtual address and replace it with an rbtree of linear ranges. The iovec would then need to be passed down to exec.c so that it can be filled with direct dma even if the whole range isn't linear. On average a single lookup of the tree would give us the information immediately.

I'm ok with supporting a not entirely flat ram space, but pretending to support an API that requires mangling host ptes (and sptes in the kvm case) every time there's a dma is entirely overkill and backwards. It also prevents you from bouncing sanely when you go over mmio regions, and from doing dma to >4G of mmio space on a 32bit build with a 64bit guest.

The whole concept of having to map something is flawed: there's nothing to map. At most you have to take a read lock to prevent a future memory hotplug from changing the memory layout from under you, but the concept of mapping has nothing to do with that. RAM is always mapped, and mmio has to be emulated anyway, so it's worthless to map it.
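
To make the range-lookup idea above concrete, here is a rough sketch. It is not code from the patch series. The names struct ram_range, range_lookup and fill_iovec are hypothetical, a sorted array with binary search stands in for the rbtree, and ram_addr_t is assumed to be 64 bits wide even on a 32bit host. The sketch only shows how a guest range that is not linear in host memory could be turned into an iovec of direct pointers, stopping at the first address that is not RAM (for example mmio, which would still have to be emulated or bounced).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Assumption: guest ram addresses are 64 bits wide regardless of the host. */
typedef uint64_t ram_addr_t;

/* Hypothetical: one linear guest-RAM range and where it lives in the host. */
struct ram_range {
    ram_addr_t start;   /* first guest address covered by the range */
    ram_addr_t len;     /* length in bytes; fits in the host address space */
    uint8_t *host;      /* host virtual address that backs 'start' */
};

/*
 * Find the range containing 'addr'. 'r' must be sorted by start and must
 * not overlap. Returns NULL when 'addr' is not backed by RAM.
 */
static const struct ram_range *range_lookup(const struct ram_range *r,
                                            size_t n, ram_addr_t addr)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (addr < r[mid].start)
            hi = mid;
        else if (addr - r[mid].start >= r[mid].len)
            lo = mid + 1;
        else
            return &r[mid];
    }
    return NULL;
}

/*
 * Fill 'iov' with direct host pointers for guest range [addr, addr + len).
 * Stops at the first non-RAM address or when 'max_iov' entries are used.
 * Returns the number of bytes covered, so the caller can restart from
 * addr + returned value (partial I/O).
 */
static ram_addr_t fill_iovec(const struct ram_range *r, size_t n,
                             ram_addr_t addr, ram_addr_t len,
                             struct iovec *iov, int max_iov, int *iovcnt)
{
    ram_addr_t done = 0;

    *iovcnt = 0;
    while (done < len && *iovcnt < max_iov) {
        const struct ram_range *rr = range_lookup(r, n, addr + done);
        ram_addr_t off, chunk;

        if (!rr)
            break;                      /* hole or mmio: caller handles it */
        off = addr + done - rr->start;
        chunk = rr->len - off;          /* bytes left in this linear range */
        if (chunk > len - done)
            chunk = len - done;
        iov[*iovcnt].iov_base = rr->host + off;
        iov[*iovcnt].iov_len = (size_t)chunk;
        (*iovcnt)++;
        done += chunk;
    }
    return done;
}
```

A caller would pass the resulting iovec straight to readv()/writev() or an equivalent async interface, and if fill_iovec() returns less than the requested length it would handle the remainder (bounce or emulate) and restart from there.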