From: Avi Kivity
Date: Mon, 19 Jan 2009 20:43:06 +0200
Subject: Re: [Qemu-devel] [PATCH 1/5] Add target memory mapping API
To: qemu-devel@nongnu.org
Message-ID: <4974C9BA.1050803@redhat.com>
In-Reply-To: <20090119182502.GA2080@shareable.org>
References: <1232308399-21679-1-git-send-email-avi@redhat.com> <1232308399-21679-2-git-send-email-avi@redhat.com> <18804.34053.211615.181730@mariner.uk.xensource.com> <4974943B.4020507@redhat.com> <18804.44271.868488.32192@mariner.uk.xensource.com> <4974B82F.9020805@redhat.com> <20090119182502.GA2080@shareable.org>

Jamie Lokier wrote:
> Avi Kivity wrote:
>> In fact, we could even say that the virtual hardware doesn't support
>> dma-to-mmio at all and MCE the guest. I'm sure no x86 guest would even
>> notice. Don't know about non-x86.
>
> Guest userspace does:
>
> 1. mmap() framebuffer device.
> 2. read() from file opened with O_DIRECT.
>
> Both are allowed by non-root processes on Linux.
>
> (I imagine this might be more common in some obscure DOS programs though).
>
> Think also variation with reading from a video capture device into
> video memory. I've seen that done on x86, never seen it (yet)
> on non-x86 :-)
>
> However, that is known to break on some PCI bridges.
>
> I'm not sure if it's reasonable to abort emulation with an MCE in this
> case.

Framebuffers are mapped as RAM, so we won't bounce this case. Try harder :)
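
(For reference, the guest-side pattern being described is roughly the
sketch below; the device node, file name and size are made up for
illustration. The destination of the read() is the mmap()ed framebuffer,
which qemu maps as ordinary RAM, so the transfer never takes the bounce
path.)

#define _GNU_SOURCE              /* for O_DIRECT on Linux */
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t len = 4 << 20;        /* pretend the framebuffer is 4 MiB */

    /* Map the framebuffer into the process; error handling omitted. */
    int fb = open("/dev/fb0", O_RDWR);
    void *fbmem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fb, 0);

    /* O_DIRECT bypasses the page cache, so the block device DMAs the file
       data straight into whatever backs the destination buffer - here,
       video memory. */
    int fd = open("frame.raw", O_RDONLY | O_DIRECT);
    ssize_t n = read(fd, fbmem, len);
    (void)n;

    close(fd);
    munmap(fbmem, len);
    close(fb);
    return 0;
}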

>>> I think my question about partial DMA writes is very relevant. If we
>>> don't care about that, nor about the corresponding notification for
>>> reads, then the API can be a lot simpler.
>>
>> I don't see a concrete reason to care about it.
>
> Writing zeros or junk after a partial DMA is quite different to real
> hardware behaviour. Virtually all devices with a "DMA count"
> register are certain to have not written to a later address when DMA stops.

The devices we're talking about here don't have a DMA count register.
They are passed scatter-gather lists, and I don't think they make
guarantees about the order in which they're accessed.

> QEMU tries to do a fairly good job at emulating devices with many of
> their quirks. It would be odd if the high-performance API got in the
> way of high-quality device emulation, when that's wanted.
>
> Potential example: If a graphics card or video capture card, or USB
> webcam etc. (more likely!) is doing a large streaming DMA into a
> guest's userspace process when that process calls read() (in the
> guest OS), and the DMA is stopped for any reason, such as triggered by
> a guest OS SIGINT or simply the data having ended, the guest's
> userspace can reasonably assume data after the count returned by
> read() is untouched.

This DMA will be into RAM, not mmio.

> Just as importantly, the guest OS in that example can assume that the
> later pages are not dirtied, therefore not swap them, or return them
> to its pre-zero pool or whatever. This is a legitimate guest OS
> optimisation for streaming-DMA-with-unknown-length devices. This can
> happen without a userspace process too.
>
> I'm guessing truncated DMAs using this API are always going to dirty
> only an initial part of the buffer, not arbitrary regions. (In rare
> cases where this isn't true, don't use the API).
>
> So wouldn't it be trivial to pass "amount written" to the unmap
> function - to be used in the bounce buffer case?

We don't have a reliable amount to pass.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
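
(For concreteness, what Jamie is suggesting amounts to an extra "how much
was actually accessed" argument on the unmap side, roughly as sketched
below. The names and types are guesses for illustration, not the
signatures from the posted patch; in the bounce-buffer case such an
argument would bound how much of the bounce buffer needs to be copied back
into guest memory and marked dirty.)

/* Illustrative sketch only, not the interface from the posted patch.
   target_phys_addr_t is qemu's guest physical address type. */
void *cpu_physical_memory_map(target_phys_addr_t addr,
                              target_phys_addr_t *plen,
                              int is_write);

/* access_len: how many bytes the device actually touched.  In the bounce
   case, only that much would need to be written back to guest RAM. */
void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
                               int is_write,
                               target_phys_addr_t access_len);

/* Hypothetical device callback that fills the mapped buffer and reports
   how much it produced. */
target_phys_addr_t device_produce_data(void *buf, target_phys_addr_t len);

/* Hypothetical device-emulation usage under those assumptions: */
static void dma_write_sketch(target_phys_addr_t addr, target_phys_addr_t len)
{
    target_phys_addr_t plen = len;
    target_phys_addr_t done;
    void *buf = cpu_physical_memory_map(addr, &plen, 1 /* is_write */);

    if (!buf) {
        return;                 /* e.g. a bounce buffer could not be had */
    }
    done = device_produce_data(buf, plen);
    cpu_physical_memory_unmap(buf, plen, 1, done);
}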