From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Anthony Liguori
Cc: "Michael S. Tsirkin", qemu-devel@nongnu.org, David Gibson
Subject: [Qemu-devel] [RFC/PATCH] Add a memory barrier to guest memory access functions
Date: Thu, 17 May 2012 10:52:08 +1000
Message-ID: <1337215928.30558.28.camel@pasglop>
In-Reply-To: <1337214293.30558.25.camel@pasglop>
References: <1336625347-10169-1-git-send-email-benh@kernel.crashing.org>
 <1336625347-10169-14-git-send-email-benh@kernel.crashing.org>
 <4FB1A8BF.7060503@codemonkey.ws>
 <20120515014449.GF30229@truffala.fritz.box>
 <1337142938.6727.122.camel@pasglop>
 <4FB4028F.7070003@codemonkey.ws>
 <1337213257.30558.22.camel@pasglop>
 <1337214293.30558.25.camel@pasglop>

Emulated devices can run simultaneously with the guest, so we need to
be careful about the ordering of the loads and stores they do to guest
system memory: the guest operating system must observe them in the
right order.

To avoid unnecessary overhead on i386, we define a new barrier,
dma_mb(), which is a full barrier on powerpc and only a compiler
barrier on i386 and x86_64 (see the comment added in the code).

This barrier is then added to qemu_get_ram_ptr() and its variants,
which is easier than sprinkling it into all the functions that provide
guest access, since those are all more or less open coded.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

Discussion:

So the other option is to do it only in cpu_physical_memory_rw() and
leave it to the callers of the ld*/st* accessors to use explicit
barriers. For example, virtio already does that explicitly in a few
places.
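To make that concrete, this is roughly what a device model would have
to do by hand if we went that route (purely illustrative, not part of
the patch; ToyDevState and its fields are made-up names):

/*
 * Illustration only: caller-side ordering, i.e. what device models
 * would need if the barrier were left out of the common accessors.
 * ToyDevState and its fields are invented for this sketch.
 */
#include "qemu-common.h"
#include "qemu-barrier.h"
#include "cpu-common.h"

typedef struct ToyDevState {
    target_phys_addr_t resp_addr;    /* completion record in guest RAM */
    target_phys_addr_t status_addr;  /* status word the guest polls    */
} ToyDevState;

static void toy_dev_complete(ToyDevState *s, uint32_t len)
{
    stl_phys(s->resp_addr, len);    /* 1: write the completion record       */
    smp_wmb();                      /* 2: order it before the status update */
    stl_phys(s->status_addr, 1);    /* 3: only then tell the guest          */
}

With the patch below, both stl_phys() calls to RAM-backed addresses go
through qemu_get_ram_ptr() and thus pick up dma_mb(), so the explicit
smp_wmb() above should become redundant.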
 exec.c         |   11 +++++++++++
 qemu-barrier.h |   29 ++++++++++++++++++++++++++---
 2 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/exec.c b/exec.c
index 40cf52d..fc857b6 100644
--- a/exec.c
+++ b/exec.c
@@ -25,6 +25,7 @@
 #endif
 
 #include "qemu-common.h"
+#include "qemu-barrier.h"
 #include "cpu.h"
 #include "tcg.h"
 #include "hw/hw.h"
@@ -2794,6 +2795,9 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
 {
     RAMBlock *block;
 
+    /* We ensure ordering for all DMA transactions */
+    dma_mb();
+
     QLIST_FOREACH(block, &ram_list.blocks, next) {
         if (addr - block->offset < block->length) {
             /* Move this entry to to start of the list.  */
@@ -2830,6 +2834,9 @@ void *qemu_safe_ram_ptr(ram_addr_t addr)
 {
     RAMBlock *block;
 
+    /* We ensure ordering for all DMA transactions */
+    dma_mb();
+
     QLIST_FOREACH(block, &ram_list.blocks, next) {
         if (addr - block->offset < block->length) {
             if (xen_enabled()) {
@@ -2861,6 +2868,10 @@ void *qemu_ram_ptr_length(ram_addr_t addr, ram_addr_t *size)
     if (*size == 0) {
         return NULL;
     }
+
+    /* We ensure ordering for all DMA transactions */
+    dma_mb();
+
     if (xen_enabled()) {
         return xen_map_cache(addr, *size, 1);
     } else {
diff --git a/qemu-barrier.h b/qemu-barrier.h
index 7e11197..8c62683 100644
--- a/qemu-barrier.h
+++ b/qemu-barrier.h
@@ -23,7 +23,21 @@
 #define smp_mb() __sync_synchronize()
 #else
 #define smp_mb() asm volatile("lock; addl $0,0(%%esp) " ::: "memory")
-#endif
+#endif
+
+/*
+ * The DMA barrier is used to order accesses from qemu devices to
+ * guest memory, in order to make them appear to the guest in
+ * program order.
+ *
+ * We assume that we never use non-temporal accesses for such
+ * DMA and so don't need anything other than a compiler barrier.
+ *
+ * If some devices use weakly ordered SSE load/store instructions
+ * then those devices will be responsible for using the appropriate
+ * barriers as well.
+ */
+#define dma_mb()    barrier()
 
 #elif defined(__x86_64__)
 
@@ -31,6 +45,9 @@
 #define smp_rmb() barrier()
 #define smp_mb() asm volatile("mfence" ::: "memory")
 
+/* Same comment as i386 */
+#define dma_mb()    barrier()
+
 #elif defined(_ARCH_PPC)
 
 /*
@@ -46,8 +63,13 @@
 #define smp_rmb() asm volatile("sync" ::: "memory")
 #endif
 
-#define smp_mb() asm volatile("sync" ::: "memory")
+#define smp_mb()    asm volatile("sync" ::: "memory")
+/*
+ * We use a full barrier for DMA, which encompasses the full
+ * requirements of the PCI ordering model.
+ */
+#define dma_mb()    smp_mb()
 
 #else
 
 /*
@@ -57,8 +79,9 @@
  * be overkill for wmb() and rmb().
  */
 #define smp_wmb() __sync_synchronize()
-#define smp_mb() __sync_synchronize()
+#define smp_mb()  __sync_synchronize()
 #define smp_rmb() __sync_synchronize()
+#define dma_mb()  __sync_synchronize()
 
 #endif
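For reference, the kind of reordering this guards against can be
reproduced outside of qemu with a small standalone test like the one
below (purely illustrative, not part of the patch; it uses
__sync_synchronize() directly, which is what qemu-barrier.h already
falls back on for smp_mb() on several hosts):

/*
 * barrier-demo.c: standalone sketch (not qemu code) of the ordering
 * problem dma_mb() addresses.  A "device" thread keeps writing a data
 * word and then a sequence number; the "guest" thread reads the
 * sequence number and then the data, which must never lag behind.
 *
 * Remove the barrier in device_thread() and run on a weakly ordered
 * host (e.g. POWER) and the check can fail; on x86 the hardware keeps
 * stores ordered, which is why dma_mb() is a compiler barrier there.
 *
 * Build with: gcc -O2 -pthread barrier-demo.c
 */
#include <pthread.h>
#include <stdio.h>
#include <stdint.h>

#define ROUNDS 10000000

static volatile uint32_t data;
static volatile uint32_t seq;

static void *device_thread(void *arg)
{
    uint32_t i;

    (void)arg;
    for (i = 1; i <= ROUNDS; i++) {
        data = i;               /* 1: write the "DMA" payload            */
        __sync_synchronize();   /* 2: full barrier, as dma_mb() is on ppc */
        seq = i;                /* 3: then publish the sequence number   */
    }
    return NULL;
}

static void *guest_thread(void *arg)
{
    uint32_t last = 0, s, d;
    long failures = 0;

    (void)arg;
    while (last < ROUNDS) {
        s = seq;                /* guest sees the notification...        */
        if (s == last) {
            continue;
        }
        __sync_synchronize();   /* ...and orders the data read after it  */
        d = data;
        if (d < s) {            /* data must be at least as new as seq   */
            failures++;
        }
        last = s;
    }
    printf("%ld ordering violations in %d rounds\n", failures, ROUNDS);
    return NULL;
}

int main(void)
{
    pthread_t dev, guest;

    pthread_create(&dev, NULL, device_thread, NULL);
    pthread_create(&guest, NULL, guest_thread, NULL);
    pthread_join(dev, NULL);
    pthread_join(guest, NULL);
    return 0;
}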