From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <494565A1.6030306@redhat.com>
Date: Sun, 14 Dec 2008 21:59:29 +0200
From: Avi Kivity
MIME-Version: 1.0
References: <4942B841.6010900@codemonkey.ws> <20081213143944.GD30537@random.random> <4943E6F9.1050001@codemonkey.ws> <20081213165306.GE30537@random.random> <4944251D.8080109@codemonkey.ws> <20081214164751.GF30537@random.random> <49453BF2.9070304@redhat.com> <20081214171558.GH30537@random.random>
In-Reply-To: <20081214171558.GH30537@random.random>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: [PATCH 2 of 5] add can_dma/post_dma for direct IO
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: Andrea Arcangeli
Cc: chrisw@redhat.com, kvm@vger.kernel.org, Gerd Hoffmann, qemu-devel@nongnu.org

Andrea Arcangeli wrote:
> On Sun, Dec 14, 2008 at 07:01:38PM +0200, Avi Kivity wrote:
>
>> Actually, with Xen, RAM may be unmapped due to Xen limitations when qemu
>> runs in dom0 mode. But I think map/unmap makes sense even disregarding
>>
>
> I realize xen 32bit has issues...
> Qemu/KVM 32bit also has the same
> issues, but there's no point in 2009 (that's when this stuff could go
> productive) in trying to run guests with >2G of ram on a 32bit
> host. The issue emerges (I guess with xen too) when trying to run those
> obsolete hardware configurations. Even the atom and extremely low
> power athlon have 64bit capability, and on embedded hardware that runs
> a real 32bit cpu I can't see how somebody would want to run a >2G guest.
>

kvm and Xen actually have different issues for 32-bit. For kvm, supporting >2G
on a 32-bit host is possible but messy and pointless, so we chose not to do
it. For Xen, this is a critical performance issue, as 64-bit userspace in pv
guests is quite slow, so dom0 runs as a 32-bit guest. Newer Xen shouldn't have
this problem, though; it runs qemu in kernel mode in a dedicated 64-bit
domain.

>> Xen: if we add memory hotunplug, we need to make sure we don't unplug
>> memory that has pending dma operations on it. map/unmap gives us the
>> opportunity to refcount memory slots.
>>
>
> So memory hotunplug here is considered differently than the real
> memory hotplug emulation that simulates removing a dimm on the
> hardware. This is just the xen trick to handle a >4G guest on a 32bit
> address space? Well, that's just the thing I'm not interested in
> supporting. When 64bit wasn't mainstream it made some sense; these days
> it's good enough if we can boot any guest OS (including 64bit ones) on
> a 32bit build, but trying to run guest OSes with >2G of ram doesn't
> look useful.
>

Leaving Xen aside, memory hotunplug requires that we track when memory is in
use and when it isn't.

>> We can't get all dma to stop during hotunplug, since net rx operations are
>> long-running (infinite if there is no activity on the link).
>>
>> IMO, we do want map/unmap, but this would be just a rename of can_dma and
>> friends, and wouldn't have at this time any additional functionality.
>> Bouncing has to happen where we have the ability to schedule the actual
>> operation, and that's clearly not map/unmap.
>>
>
> It would be a bit more than a rename; also keep in mind that in the
> longer term, as said, we need to build the iovec in the exec.c path.
> It's not enough to return a void *: I'd like to support a non-1:1 flat
> space to avoid wasting host virtual address space on guest memory
> holes. But that's about it; guest memory has to be always mapped, just
> not with a 1:1 mapping, and surely not with a per-page array that
> translates each page's physical address to a host virtual address, but
> with ranges. So this map thing that returns a 'void *' won't be there
> for long even if I rename.
>

Even if it returns an iovec, that doesn't change how it works. I like the
symmetry of map()/unmap() and the lock/unlock semantics (like
kmap_atomic/kunmap_atomic and a myriad of other get/put pairs).

[There's actually a language that supports this idiom, but that's a
different flamewar.]

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.