Date: Mon, 19 Jan 2009 19:28:15 +0200
From: Avi Kivity
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 1/5] Add target memory mapping API
Message-ID: <4974B82F.9020805@redhat.com>
In-Reply-To: <18804.44271.868488.32192@mariner.uk.xensource.com>
References: <1232308399-21679-1-git-send-email-avi@redhat.com>
 <1232308399-21679-2-git-send-email-avi@redhat.com>
 <18804.34053.211615.181730@mariner.uk.xensource.com>
 <4974943B.4020507@redhat.com>
 <18804.44271.868488.32192@mariner.uk.xensource.com>

Ian Jackson wrote:
>> Correct. If you need to perform read-modify-write, you need to use
>> cpu_physical_memory_rw(), twice. If we ever want to support RMW, we'll
>> need to add another value for is_write. I don't think we have
>> interesting devices at this point which require efficient RMW.
>>
>
> Efficient read-modify-write may be very hard for some setups to
> achieve. It can't be done with the bounce buffer implementation.
> I think one good rule of thumb would be to make sure that the interface
> as specified can be implemented in terms of cpu_physical_memory_rw.
>

What is the motivation for efficient rmw?
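
To spell out what "twice" means above: the fallback is just a bounce
through a temporary buffer. A rough, untested sketch (the helper name and
the modify() callback are invented for illustration; only
cpu_physical_memory_rw() and qemu_malloc()/qemu_free() are existing qemu
functions):

    /* Untested illustration, not part of the patch: emulate a
     * read-modify-write by bouncing through a temporary buffer with the
     * existing API.  modify() stands in for whatever the device wants to
     * do with the data. */
    static void rmw_via_rw(target_phys_addr_t addr, int len,
                           void (*modify)(uint8_t *buf, int len, void *opaque),
                           void *opaque)
    {
        uint8_t *buf = qemu_malloc(len);

        cpu_physical_memory_rw(addr, buf, len, 0);   /* read */
        modify(buf, len, opaque);                    /* modify */
        cpu_physical_memory_rw(addr, buf, len, 1);   /* write back */

        qemu_free(buf);
    }

Not fast, but as noted above I don't think we have devices that need
efficient RMW.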
>> Alternatively, only use this interface with devices where this doesn't
>> matter. Given that bouncing happens for mmio only, this would be all
>> devices which you'd want to use this interface with anyway.
>>
>
> That would be one alternative but isn't it the case that (for example)
> with a partial DMA completion, the guest can assume that the
> supposedly-untouched parts of the DMA target memory actually remain
> untouched rather than (say) zeroed?
>

For block devices, I don't think it can. In any case, this will only
occur with mmio. I don't think the guest can assume much in such cases.

In fact, we could even say that the virtual hardware doesn't support
dma-to-mmio at all and MCE the guest. I'm sure no x86 guest would even
notice. Don't know about non-x86.

> In a system where we're trying to do zero copy, we may issue the map
> request for a large transfer, before we know how much the host kernel
> will actually provide.
>

Won't it be at least 1GB? Partition your requests to that size.

>> (I'm assuming that you'll implement the fastpath by directly mapping
>> guest memory, not bouncing).
>>
>
> Yes. We can do that in Xen too but it's less of a priority for us
> given that we expect people who really care about performance to
> install PV drivers in the guest.
>

I'm all in favor of accommodating Xen, but as long as you're out-of-tree
you need to conform to qemu, not the other way around.

>> A variant of this API (posted by Andrea) hid all of the scheduling away
>> within the implementation.
>>
>
> I remember seeing this before but I don't think your previous one
> provided a callback for map completion? I thought it just blocked
> the caller until the map could complete. That's obviously not ideal.
>

It didn't block, it scheduled.

>>> This function should return a separate handle as well as the physical
>>> memory pointer. That will make it much easier to provide an
>>> implementation which permits multiple bounce buffers or multiple
>>> mappings simultaneously.
>>>
>> The downside to a separate handle is that device emulation code will now
>> need to maintain the handle in addition to the virtual address.
>> Since the addresses will typically be maintained in an iovec, this means
>> another array to be allocated and resized.
>>
>
> Err, no, I don't really see that. In my proposal the `handle' is
> actually allocated by the caller. The implementation provides the
> private data and that can be empty. There is no additional memory
> allocation.
>

You need to store multiple handles (one per sg element), so you need to
allocate a variable-size vector for it. Preallocation may be possible but
perhaps wasteful.

>> The design goals here were to keep things as simple as possible for the
>> fast path. Since the API fits all high-bandwidth devices that I know
>> of, I don't think it's a good tradeoff to make the API more complex in
>> order to be applicable to some corner cases.
>>
>
> I think my question about partial DMA writes is very relevant. If we
> don't care about that, nor about the corresponding notification for
> reads, then the API can be a lot simpler.

I don't see a concrete reason to care about it.

--
error compiling committee.c: too many arguments to function