From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:44512) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SUoG6-0002Un-OD for qemu-devel@nongnu.org; Wed, 16 May 2012 20:07:59 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SUoG5-0003El-0w for qemu-devel@nongnu.org; Wed, 16 May 2012 20:07:58 -0400
Received: from gate.crashing.org ([63.228.1.57]:53562) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SUoG4-0003EQ-O5 for qemu-devel@nongnu.org; Wed, 16 May 2012 20:07:56 -0400
Message-ID: <1337213257.30558.22.camel@pasglop>
From: Benjamin Herrenschmidt
Date: Thu, 17 May 2012 10:07:37 +1000
In-Reply-To: <4FB4028F.7070003@codemonkey.ws>
References: <1336625347-10169-1-git-send-email-benh@kernel.crashing.org> <1336625347-10169-14-git-send-email-benh@kernel.crashing.org> <4FB1A8BF.7060503@codemonkey.ws> <20120515014449.GF30229@truffala.fritz.box> <1337142938.6727.122.camel@pasglop> <4FB4028F.7070003@codemonkey.ws>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0
Subject: Re: [Qemu-devel] [PATCH 13/13] iommu: Add a memory barrier to DMA RW function
To: Anthony Liguori
Cc: "Michael S. Tsirkin" , qemu-devel@nongnu.org, David Gibson

So followup ....

For those on the list: Anthony and I had a chat, and we agree that a better approach is to have all cpu_physical_memory_* accesses ordered in program order from the perspective of the VCPUs. Devices that have performance-critical accesses and want to do home-made ordering can use map/unmap.

Looking at the code, however, there seems to be a lot of duplication. cpu_physical_memory_rw() is an obvious place to add a barrier, but what about all of the ldl_*, ldq_*, etc.? In fact there are about 45 different ways code can dig into guest memory. Should they all be made ordered?
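To make the "barrier in the common accessor" idea concrete, here is a minimal sketch in C. It is not QEMU code: `guest_ram`, `ordered_phys_rw()` and its signature are hypothetical stand-ins, and the C11 `atomic_thread_fence()` call only stands in for whatever barrier primitive QEMU itself would use.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for guest RAM; not QEMU's real memory model. */
static uint8_t guest_ram[4096];

/*
 * Hypothetical ordered accessor, loosely modelled on the idea of putting
 * a barrier into cpu_physical_memory_rw(). Returns 0 on success, or -1
 * if the requested range falls outside the toy RAM.
 */
static int ordered_phys_rw(size_t addr, uint8_t *buf, size_t len, int is_write)
{
    if (addr > sizeof(guest_ram) || len > sizeof(guest_ram) - addr) {
        return -1;
    }

    /*
     * Full fence before touching guest memory, so that accesses issued
     * earlier by this thread are not reordered after the copy. This is an
     * assumption about where the barrier would go, not the actual patch.
     */
    atomic_thread_fence(memory_order_seq_cst);

    if (is_write) {
        memcpy(guest_ram + addr, buf, len);
    } else {
        memcpy(buf, guest_ram + addr, len);
    }
    return 0;
}
```

With this shape, every caller of the accessor gets ordering for free, while a device that wants to manage its own ordering would bypass it, in the same way the text suggests using map/unmap.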

At this point, it might be easier to just stick a barrier in qemu_get_ram_ptr(), which seems to be called by everybody. However, that means that things like cpu_physical_memory_rw() will end up hitting the barrier for every page. It's safe, but it might be a performance hit (measurable? Probably not, but I can give it a try).

Or we can just sprinkle the barrier everywhere; mostly it's going to be in exec.c, in all the "ram" cases of ld*_* and st*_*.

Also, should I make the barrier conditional on kvm_enabled()? It's pointless in full emulation and might actually be a performance hit on something already quite slow...

Cheers,
Ben.
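
The trade-off between the two placements can be sketched with a toy model that simply counts barriers. Everything here is an assumption for illustration: `toy_kvm_enabled` stands in for kvm_enabled(), `TOY_PAGE_SIZE` is an arbitrary 4 KiB page, and the two counting functions model "barrier in the per-page lookup" versus "barrier once per access"; none of this is QEMU's actual code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u

/* Hypothetical stand-in for kvm_enabled(). */
static bool toy_kvm_enabled = true;

/* Instrumentation only: how many barriers were actually issued. */
static unsigned barrier_count;

/* Issue a full fence only when the (assumed) KVM flag is set. */
static void maybe_barrier(void)
{
    if (!toy_kvm_enabled) {
        return; /* assumption: full emulation does not need the barrier */
    }
    atomic_thread_fence(memory_order_seq_cst);
    barrier_count++;
}

/* Number of pages touched by the byte range [addr, addr + len). */
static size_t pages_spanned(size_t addr, size_t len)
{
    if (len == 0) {
        return 0;
    }
    return (addr + len - 1) / TOY_PAGE_SIZE - addr / TOY_PAGE_SIZE + 1;
}

/* Placement A: barrier inside the per-page lookup, so one per page. */
static unsigned barriers_per_page(size_t addr, size_t len)
{
    size_t n = pages_spanned(addr, len);

    barrier_count = 0;
    for (size_t i = 0; i < n; i++) {
        maybe_barrier();
    }
    return barrier_count;
}

/* Placement B: barrier once at the top of the access function. */
static unsigned barriers_per_access(size_t addr, size_t len)
{
    (void)addr;
    (void)len;
    barrier_count = 0;
    maybe_barrier();
    return barrier_count;
}
```

In this model a three-page copy costs three barriers with placement A and one with placement B, and neither placement issues any barrier once the flag is cleared. Whether the extra barriers are measurable in practice is exactly the open question raised above.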