From: Avi Kivity
Date: Tue, 20 Jan 2009 19:23:14 +0200
Subject: Re: [Qemu-devel] [PATCH 1/5] Add target memory mapping API
To: qemu-devel@nongnu.org
Message-ID: <49760882.6010309@redhat.com>
In-Reply-To: <18805.57449.348449.492647@mariner.uk.xensource.com>

Ian Jackson wrote:
> I think the key points in Avi's message are this:
>
> Avi Kivity writes:
>> You don't know afterwards either. Maybe read() is specced as you
>> say, but practical implementations will return the minimum bytes
>> read, not exact.
>
> And this:
>
>> I really doubt that any guest will be affected by this. It's a
>> tradeoff between decent performance and needlessly accurate
>> emulation. I don't see how we can choose the latter.
>
> I don't think this is the right way to analyse this situation. We
> are trying to define a general-purpose DMA API for _all_ emulated
> devices, not just the IDE emulation and block devices that you seem
> to be considering.

No. There already exists a general API: cpu_physical_memory_rw(). We
are trying to define an API that will allow the high-throughput
devices (IDE, SCSI, virtio-blk, virtio-net) to be implemented
efficiently.

If device X does not work well with the API then, unless it is
important for some reason, it shouldn't use it. If it is important,
we can adapt the API at that point.
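To make that concrete, here is a minimal sketch of how a block device
might use such an API. The cpu_physical_memory_map/unmap signatures are
assumed to have roughly the shape proposed in 1/5 (map returns a host
pointer and a possibly shortened length; unmap takes the length that
was actually accessed); the helper name is made up, and error handling
and scatter-gather are elided:

/* Sketch only: assumes QEMU's memory declarations are in scope and
 * that the map/unmap API looks roughly like the one proposed in this
 * series.  A real device would handle scatter-gather lists and retry
 * shortened mappings instead of dropping straight to the bounce path. */

#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

static void dma_read_into_guest(int fd, off_t offset,
                                target_phys_addr_t guest_addr,
                                target_phys_addr_t len)
{
    target_phys_addr_t maplen = len;
    void *host = cpu_physical_memory_map(guest_addr, &maplen, 1 /* is_write */);

    if (host && maplen == len) {
        /* Fast path: read from the image straight into guest RAM. */
        ssize_t done = pread(fd, host, len, offset);
        /* Only the first 'done' bytes are known to be valid; report
         * that as the access length so only that much is marked dirty. */
        cpu_physical_memory_unmap(host, maplen, 1, done > 0 ? done : 0);
        return;
    }
    if (host) {
        cpu_physical_memory_unmap(host, maplen, 1, 0);
    }

    /* Slow path: bounce through a temporary buffer, which is what
     * cpu_physical_memory_rw() already amounts to today. */
    uint8_t *bounce = malloc(len);
    ssize_t done = pread(fd, bounce, len, offset);
    if (done > 0) {
        cpu_physical_memory_rw(guest_addr, bounce, done, 1 /* write to guest */);
    }
    free(bounce);
}

The part relevant to this thread is the last argument to the unmap:
the device can report how many bytes it actually transferred, but it
cannot promise anything stronger about what the host kernel did beyond
that count.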
> If there is ever any hardware which behaves `properly' with partial
> DMA, and any host kernel device which can tell us what succeeded and
> what failed, then it is necessary for the DMA API we are now
> inventing to allow that device to be properly emulated.
>
> Even if we can't come up with an example of such a device right now,
> I would suggest that it's very likely that we will encounter one
> eventually. But actually I can think of one straight away: a SCSI
> tapestreamer. Tapestreamers often give partial transfers at the end
> of tapefiles; hosts (i.e., qemu guests) talking to the SCSI
> controller do not expect the controller to DMA beyond the successful
> SCSI transfer length; and the (qemu host's) kernel's read() call will
> not overwrite beyond the successful transfer length either.

That will work out fine, as the DMA will be to kernel memory and
read() will copy just the interesting parts.

> If it is difficult for a block device to provide the faithful
> behaviour then it might be acceptable for the block device to always
> indicate to the DMA API that the entire transfer had taken place,
> even though actually some of it had failed.
>
> But personally I think you're mistaken about the behaviour of the
> (qemu host's) kernel's {aio_,p,}read(2).

I'm pretty sure reads to software RAIDs will be submitted in parallel.
If those reads are O_DIRECT, then it's impossible to maintain DMA
ordering.

>>> In the initial implementation in Xen, we will almost certainly
>>> simply emulate everything with cpu_physical_memory_rw. So it will
>>> happen all the time.
>>
>> Try it out. I'm sure it will work just fine (if incredibly slowly,
>> unless you provide multiple bounce buffers).
>
> It will certainly work except when (a) there are partial
> (interrupted) transfers and (b) the host relies on the partial DMA
> not overwriting more data than it successfully transferred. So what
> that means is that if this introduces bugs, they will be very
> difficult to find in testing. I don't think testing is the answer
> here.

The only workaround I can think of is not to DMA. But that will be
horribly slow.

--
error compiling committee.c: too many arguments to function
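For comparison, the cpu_physical_memory_rw()-only emulation that Xen
would start with amounts to something like the following sketch (the
helper name and chunk size are invented here, not anything in the
tree). It also shows why a transfer that fails part-way through has
already copied the earlier chunks into guest memory, which is exactly
the partial-DMA behaviour being debated above:

/* Illustrative only: emulate a DMA read purely with the existing
 * bounce API.  The helper name and chunk size are made up. */

#include <stdint.h>
#include <unistd.h>

#define BOUNCE_LEN 4096

static size_t dma_read_bounced(int fd, off_t offset,
                               target_phys_addr_t guest_addr, size_t len)
{
    uint8_t buf[BOUNCE_LEN];
    size_t done = 0;

    while (done < len) {
        size_t chunk = len - done > BOUNCE_LEN ? BOUNCE_LEN : len - done;
        ssize_t r = pread(fd, buf, chunk, offset + done);

        if (r <= 0) {
            /* Chunks before 'done' have already been copied into guest
             * RAM; bytes from 'done' onwards were never touched.  That
             * is the strongest guarantee a bounced emulation can give
             * about a partial transfer. */
            break;
        }
        cpu_physical_memory_rw(guest_addr + done, buf, r, 1 /* write to guest */);
        done += r;
    }
    return done;
}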