* [Qemu-devel] [RFC v2 0/6] memory: make dirty_memory[] accesses atomic
@ 2014-12-02 11:23 Stefan Hajnoczi
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 1/6] bitmap: add atomic set functions Stefan Hajnoczi
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2014-12-02 11:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Dr. David Alan Gilbert, Stefan Hajnoczi,
	Peter Maydell

v2:
 * Make bitmap_set_atomic() and bitmap_test_and_clear_atomic() faster by
   replacing the inner loop with cheaper operations.  I did use smp_wmb() in
   bitmap_set_atomic() so that the function is always a write barrier, no
   matter which code path is taken. [Paolo]

The dirty_memory[] bitmap is used for live migration, TCG self-modifying code
detection, and VGA emulation.  Up until now the bitmap was always accessed
under the QEMU global mutex.  This series makes all dirty_memory[] accesses
atomic to prepare the way for threads writing to guest memory without holding
the global mutex.

In particular, this series converts non-atomic dirty_memory[] accesses to
atomic_or, atomic_xchg, and atomic_fetch_and so that race conditions are
avoided when two threads manipulate the bitmap at the same time.
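
To make the race concrete (purely illustrative; the names below are made up):

    /* Two threads dirtying pages in the same bitmap word can lose an
     * update with a plain read-modify-write:
     *
     *   Thread A                   Thread B
     *   old = bm[i];
     *                              old = bm[i];
     *   bm[i] = old | mask_a;
     *                              bm[i] = old | mask_b;  <- mask_a lost
     *
     * With the atomic helpers the OR is indivisible, so both masks end
     * up in the word regardless of how the threads interleave:
     */
    atomic_or(&bm[i], mask_a);    /* thread A */
    atomic_or(&bm[i], mask_b);    /* thread B */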

There are two pieces remaining before the dirty_memory[] bitmap is truly
thread-safe:

1. Convert all cpu_physical_memory_*_dirty() callers to use the API atomically.
   There are TCG callers that do things along the lines of:

     if (!cpu_physical_memory_get_dirty(addr)) {
         cpu_physical_memory_set_dirty(addr);  /* not atomic! */
     }

   At first I considered ignoring TCG completely since it is currently not
   multi-threaded, but we might as well tackle this so that
   virtio-blk/virtio-scsi dataplane can be used with TCG eventually (they still
   need ioeventfd/irqfd emulation before they can support TCG).

2. Use array RCU to safely resize dirty_memory[] for memory hotplug.  Currently
   ram_block_add() grows the bitmap in a way that is not thread-safe.

   Paolo has a QEMU userspace RCU implementation which I'd like to bring in for
   this.
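
   For reference, the resize could eventually look something like the sketch
   below.  The function name and the RCU calls (atomic_rcu_set(),
   synchronize_rcu()) are only my guesses at the API that series will
   provide; none of this is implemented here:

     /* Sketch only: grow one client's bitmap under RCU.  All names are
      * placeholders until the RCU series lands.
      */
     static void dirty_memory_grow(unsigned client, long old_nbits,
                                   long new_nbits)
     {
         unsigned long *old_map = ram_list.dirty_memory[client];
         unsigned long *new_map = bitmap_new(new_nbits);   /* zero-filled */

         bitmap_copy(new_map, old_map, old_nbits);

         /* Publish the new bitmap.  Readers load the pointer once inside
          * rcu_read_lock()/rcu_read_unlock() and see either the old or
          * the new array, never a partially updated one.
          */
         atomic_rcu_set(&ram_list.dirty_memory[client], new_map);

         /* Reclaim only after all pre-existing readers have finished */
         synchronize_rcu();
         g_free(old_map);
     }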

Stefan Hajnoczi (6):
  bitmap: add atomic set functions
  bitmap: add atomic test and clear
  memory: use atomic ops for setting dirty memory bits
  migration: move dirty bitmap sync to ram_addr.h
  memory: replace cpu_physical_memory_reset_dirty() with test-and-clear
  memory: make cpu_physical_memory_sync_dirty_bitmap() fully atomic

 arch_init.c             | 46 ++----------------------------
 cputlb.c                |  4 +--
 exec.c                  | 23 +++++++++++----
 include/exec/ram_addr.h | 74 ++++++++++++++++++++++++++++++++++++-------------
 include/qemu/bitmap.h   |  4 +++
 include/qemu/bitops.h   | 14 ++++++++++
 memory.c                | 11 +++-----
 util/bitmap.c           | 73 ++++++++++++++++++++++++++++++++++++++++++++++++
 8 files changed, 170 insertions(+), 79 deletions(-)

-- 
2.1.0

* [Qemu-devel] [RFC v2 1/6] bitmap: add atomic set functions
  2014-12-02 11:23 [Qemu-devel] [RFC v2 0/6] memory: make dirty_memory[] accesses atomic Stefan Hajnoczi
@ 2014-12-02 11:23 ` Stefan Hajnoczi
  2014-12-02 12:28   ` Paolo Bonzini
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 2/6] bitmap: add atomic test and clear Stefan Hajnoczi
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2014-12-02 11:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Dr. David Alan Gilbert, Stefan Hajnoczi,
	Peter Maydell

Use atomic_or() for atomic bitmaps where several threads may set bits at
the same time.  This avoids the race condition between threads loading
an element, bitwise ORing, and then storing the element.

When setting all bits in a word we can avoid atomic ops and instead just
use an smp_wmb() write barrier at the end.

Most bitmap users don't need atomicity so introduce new functions.
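
Intended usage (names are illustrative only):

    /* any number of threads may do these concurrently on the same map */
    set_bit_atomic(page_nr, dirty_bitmap);
    bitmap_set_atomic(dirty_bitmap, first_page, npages);

    /* single-threaded users keep the cheaper non-atomic variants */
    set_bit(page_nr, scratch_bitmap);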

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/qemu/bitmap.h |  2 ++
 include/qemu/bitops.h | 14 ++++++++++++++
 util/bitmap.c         | 36 ++++++++++++++++++++++++++++++++++++
 3 files changed, 52 insertions(+)

diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index f0273c9..3e0a4f3 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -39,6 +39,7 @@
  * bitmap_empty(src, nbits)			Are all bits zero in *src?
  * bitmap_full(src, nbits)			Are all bits set in *src?
  * bitmap_set(dst, pos, nbits)			Set specified bit area
+ * bitmap_set_atomic(dst, pos, nbits)   Set specified bit area with atomic ops
  * bitmap_clear(dst, pos, nbits)		Clear specified bit area
  * bitmap_find_next_zero_area(buf, len, pos, n, mask)	Find bit free area
  */
@@ -226,6 +227,7 @@ static inline int bitmap_intersects(const unsigned long *src1,
 }
 
 void bitmap_set(unsigned long *map, long i, long len);
+void bitmap_set_atomic(unsigned long *map, long i, long len);
 void bitmap_clear(unsigned long *map, long start, long nr);
 unsigned long bitmap_find_next_zero_area(unsigned long *map,
                                          unsigned long size,
diff --git a/include/qemu/bitops.h b/include/qemu/bitops.h
index 181bd46..eda4132 100644
--- a/include/qemu/bitops.h
+++ b/include/qemu/bitops.h
@@ -16,6 +16,7 @@
 #include <assert.h>
 
 #include "host-utils.h"
+#include "atomic.h"
 
 #define BITS_PER_BYTE           CHAR_BIT
 #define BITS_PER_LONG           (sizeof (unsigned long) * BITS_PER_BYTE)
@@ -39,6 +40,19 @@ static inline void set_bit(long nr, unsigned long *addr)
 }
 
 /**
+ * set_bit_atomic - Set a bit in memory atomically
+ * @nr: the bit to set
+ * @addr: the address to start counting from
+ */
+static inline void set_bit_atomic(long nr, unsigned long *addr)
+{
+    unsigned long mask = BIT_MASK(nr);
+    unsigned long *p = addr + BIT_WORD(nr);
+
+    atomic_or(p, mask);
+}
+
+/**
  * clear_bit - Clears a bit in memory
  * @nr: Bit to clear
  * @addr: Address to start counting from
diff --git a/util/bitmap.c b/util/bitmap.c
index 9c6bb52..40db0d9 100644
--- a/util/bitmap.c
+++ b/util/bitmap.c
@@ -11,6 +11,7 @@
 
 #include "qemu/bitops.h"
 #include "qemu/bitmap.h"
+#include "qemu/atomic.h"
 
 /*
  * bitmaps provide an array of bits, implemented using an an
@@ -177,6 +178,41 @@ void bitmap_set(unsigned long *map, long start, long nr)
     }
 }
 
+void bitmap_set_atomic(unsigned long *map, long start, long nr)
+{
+    unsigned long *p = map + BIT_WORD(start);
+    const long size = start + nr;
+    int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
+    unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
+
+    /* First word */
+    if (nr - bits_to_set > 0) {
+        atomic_or(p, mask_to_set);
+        nr -= bits_to_set;
+        bits_to_set = BITS_PER_LONG;
+        p++;
+    }
+
+    /* Full words */
+    while (nr - bits_to_set >= 0) {
+        *p = ~0UL;
+        nr -= bits_to_set;
+        p++;
+    }
+
+    /* Last word */
+    if (nr) {
+        mask_to_set &= BITMAP_LAST_WORD_MASK(size);
+        atomic_or(p, mask_to_set);
+    } else {
+        /* If we avoided the full barrier in atomic_or() we should at least
+         * issue a write barrier so that other threads using barriers see
+         * up-to-date bitmap contents.
+         */
+        smp_wmb();
+    }
+}
+
 void bitmap_clear(unsigned long *map, long start, long nr)
 {
     unsigned long *p = map + BIT_WORD(start);
-- 
2.1.0

* [Qemu-devel] [RFC v2 2/6] bitmap: add atomic test and clear
  2014-12-02 11:23 [Qemu-devel] [RFC v2 0/6] memory: make dirty_memory[] accesses atomic Stefan Hajnoczi
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 1/6] bitmap: add atomic set functions Stefan Hajnoczi
@ 2014-12-02 11:23 ` Stefan Hajnoczi
  2014-12-02 12:16   ` Paolo Bonzini
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 3/6] memory: use atomic ops for setting dirty memory bits Stefan Hajnoczi
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2014-12-02 11:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Dr. David Alan Gilbert, Stefan Hajnoczi,
	Peter Maydell

The new bitmap_test_and_clear_atomic() function clears a range and
returns whether or not the bits were set.
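
A typical caller looks something like this (illustrative, not code from this
series):

    /* e.g. a display refresh: repaint a region only if any page in it
     * was dirtied since the last call
     */
    if (bitmap_test_and_clear_atomic(vga_dirty_bitmap, first_page, npages)) {
        update_display_region(first_page, npages);
    }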

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/qemu/bitmap.h |  2 ++
 util/bitmap.c         | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index 3e0a4f3..86dd9cd 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -41,6 +41,7 @@
  * bitmap_set(dst, pos, nbits)			Set specified bit area
  * bitmap_set_atomic(dst, pos, nbits)   Set specified bit area with atomic ops
  * bitmap_clear(dst, pos, nbits)		Clear specified bit area
+ * bitmap_test_and_clear_atomic(dst, pos, nbits)    Test and clear area
  * bitmap_find_next_zero_area(buf, len, pos, n, mask)	Find bit free area
  */
 
@@ -229,6 +230,7 @@ static inline int bitmap_intersects(const unsigned long *src1,
 void bitmap_set(unsigned long *map, long i, long len);
 void bitmap_set_atomic(unsigned long *map, long i, long len);
 void bitmap_clear(unsigned long *map, long start, long nr);
+bool bitmap_test_and_clear_atomic(unsigned long *map, long start, long nr);
 unsigned long bitmap_find_next_zero_area(unsigned long *map,
                                          unsigned long size,
                                          unsigned long start,
diff --git a/util/bitmap.c b/util/bitmap.c
index 40db0d9..d297066 100644
--- a/util/bitmap.c
+++ b/util/bitmap.c
@@ -233,6 +233,43 @@ void bitmap_clear(unsigned long *map, long start, long nr)
     }
 }
 
+bool bitmap_test_and_clear_atomic(unsigned long *map, long start, long nr)
+{
+    unsigned long *p = map + BIT_WORD(start);
+    const long size = start + nr;
+    int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
+    unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
+    unsigned long dirty = 0;
+    unsigned long old_bits;
+
+    /* First word */
+    if (nr - bits_to_clear > 0) {
+        old_bits = atomic_fetch_and(p, ~mask_to_clear);
+        dirty |= old_bits & mask_to_clear;
+        nr -= bits_to_clear;
+        bits_to_clear = BITS_PER_LONG;
+        mask_to_clear = ~0UL;
+        p++;
+    }
+
+    /* Full words */
+    while (nr - bits_to_clear >= 0) {
+        old_bits = atomic_xchg(p, 0);
+        dirty |= old_bits;
+        nr -= bits_to_clear;
+        p++;
+    }
+
+    /* Last word */
+    if (nr) {
+        mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
+        old_bits = atomic_fetch_and(p, ~mask_to_clear);
+        dirty |= old_bits & mask_to_clear;
+    }
+
+    return dirty;
+}
+
 #define ALIGN_MASK(x,mask)      (((x)+(mask))&~(mask))
 
 /**
-- 
2.1.0

* [Qemu-devel] [RFC v2 3/6] memory: use atomic ops for setting dirty memory bits
  2014-12-02 11:23 [Qemu-devel] [RFC v2 0/6] memory: make dirty_memory[] accesses atomic Stefan Hajnoczi
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 1/6] bitmap: add atomic set functions Stefan Hajnoczi
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 2/6] bitmap: add atomic test and clear Stefan Hajnoczi
@ 2014-12-02 11:23 ` Stefan Hajnoczi
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 4/6] migration: move dirty bitmap sync to ram_addr.h Stefan Hajnoczi
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2014-12-02 11:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Dr. David Alan Gilbert, Stefan Hajnoczi,
	Peter Maydell

Use set_bit_atomic() and bitmap_set_atomic() so that multiple threads
can dirty memory without race conditions.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
I had to get creative to stay under 80 characters per line.  I'm open to
suggestions if you prefer me to format it another way.
---
 include/exec/ram_addr.h | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 8fc75cd..ba90daa 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -93,30 +93,32 @@ static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
                                                       unsigned client)
 {
     assert(client < DIRTY_MEMORY_NUM);
-    set_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
+    set_bit_atomic(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
 }
 
 static inline void cpu_physical_memory_set_dirty_range_nocode(ram_addr_t start,
                                                               ram_addr_t length)
 {
     unsigned long end, page;
+    unsigned long **dirty_memory = ram_list.dirty_memory;
 
     end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
     page = start >> TARGET_PAGE_BITS;
-    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION], page, end - page);
-    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_VGA], page, end - page);
+    bitmap_set_atomic(dirty_memory[DIRTY_MEMORY_MIGRATION], page, end - page);
+    bitmap_set_atomic(dirty_memory[DIRTY_MEMORY_VGA], page, end - page);
 }
 
 static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
                                                        ram_addr_t length)
 {
     unsigned long end, page;
+    unsigned long **dirty_memory = ram_list.dirty_memory;
 
     end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
     page = start >> TARGET_PAGE_BITS;
-    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION], page, end - page);
-    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_VGA], page, end - page);
-    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_CODE], page, end - page);
+    bitmap_set_atomic(dirty_memory[DIRTY_MEMORY_MIGRATION], page, end - page);
+    bitmap_set_atomic(dirty_memory[DIRTY_MEMORY_VGA], page, end - page);
+    bitmap_set_atomic(dirty_memory[DIRTY_MEMORY_CODE], page, end - page);
     xen_modified_memory(start, length);
 }
 
@@ -142,10 +144,11 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
         for (k = 0; k < nr; k++) {
             if (bitmap[k]) {
                 unsigned long temp = leul_to_cpu(bitmap[k]);
+                unsigned long **d = ram_list.dirty_memory;
 
-                ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION][page + k] |= temp;
-                ram_list.dirty_memory[DIRTY_MEMORY_VGA][page + k] |= temp;
-                ram_list.dirty_memory[DIRTY_MEMORY_CODE][page + k] |= temp;
+                atomic_or(&d[DIRTY_MEMORY_MIGRATION][page + k], temp);
+                atomic_or(&d[DIRTY_MEMORY_VGA][page + k], temp);
+                atomic_or(&d[DIRTY_MEMORY_CODE][page + k], temp);
             }
         }
         xen_modified_memory(start, pages);
-- 
2.1.0

* [Qemu-devel] [RFC v2 4/6] migration: move dirty bitmap sync to ram_addr.h
  2014-12-02 11:23 [Qemu-devel] [RFC v2 0/6] memory: make dirty_memory[] accesses atomic Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 3/6] memory: use atomic ops for setting dirty memory bits Stefan Hajnoczi
@ 2014-12-02 11:23 ` Stefan Hajnoczi
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 5/6] memory: replace cpu_physical_memory_reset_dirty() with test-and-clear Stefan Hajnoczi
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 6/6] memory: make cpu_physical_memory_sync_dirty_bitmap() fully atomic Stefan Hajnoczi
  5 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2014-12-02 11:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Dr. David Alan Gilbert, Stefan Hajnoczi,
	Peter Maydell

The dirty memory bitmap is managed by ram_addr.h and copied to
migration_bitmap[] periodically during live migration.

Move the code to sync the bitmap to ram_addr.h where related code lives.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 arch_init.c             | 46 ++--------------------------------------------
 include/exec/ram_addr.h | 44 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+), 44 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 7680d28..79c7784 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -436,52 +436,10 @@ ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
     return (next - base) << TARGET_PAGE_BITS;
 }
 
-static inline bool migration_bitmap_set_dirty(ram_addr_t addr)
-{
-    bool ret;
-    int nr = addr >> TARGET_PAGE_BITS;
-
-    ret = test_and_set_bit(nr, migration_bitmap);
-
-    if (!ret) {
-        migration_dirty_pages++;
-    }
-    return ret;
-}
-
 static void migration_bitmap_sync_range(ram_addr_t start, ram_addr_t length)
 {
-    ram_addr_t addr;
-    unsigned long page = BIT_WORD(start >> TARGET_PAGE_BITS);
-
-    /* start address is aligned at the start of a word? */
-    if (((page * BITS_PER_LONG) << TARGET_PAGE_BITS) == start) {
-        int k;
-        int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
-        unsigned long *src = ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION];
-
-        for (k = page; k < page + nr; k++) {
-            if (src[k]) {
-                unsigned long new_dirty;
-                new_dirty = ~migration_bitmap[k];
-                migration_bitmap[k] |= src[k];
-                new_dirty &= src[k];
-                migration_dirty_pages += ctpopl(new_dirty);
-                src[k] = 0;
-            }
-        }
-    } else {
-        for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
-            if (cpu_physical_memory_get_dirty(start + addr,
-                                              TARGET_PAGE_SIZE,
-                                              DIRTY_MEMORY_MIGRATION)) {
-                cpu_physical_memory_reset_dirty(start + addr,
-                                                TARGET_PAGE_SIZE,
-                                                DIRTY_MEMORY_MIGRATION);
-                migration_bitmap_set_dirty(start + addr);
-            }
-        }
-    }
+    migration_dirty_pages +=
+        cpu_physical_memory_sync_dirty_bitmap(migration_bitmap, start, length);
 }
 
 
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index ba90daa..87a8b28 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -190,5 +190,49 @@ static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t length,
                                      unsigned client);
 
+static inline
+uint64_t cpu_physical_memory_sync_dirty_bitmap(unsigned long *dest,
+                                               ram_addr_t start,
+                                               ram_addr_t length)
+{
+    ram_addr_t addr;
+    unsigned long page = BIT_WORD(start >> TARGET_PAGE_BITS);
+    uint64_t num_dirty = 0;
+
+    /* start address is aligned at the start of a word? */
+    if (((page * BITS_PER_LONG) << TARGET_PAGE_BITS) == start) {
+        int k;
+        int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
+        unsigned long *src = ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION];
+
+        for (k = page; k < page + nr; k++) {
+            if (src[k]) {
+                unsigned long new_dirty;
+                new_dirty = ~dest[k];
+                dest[k] |= src[k];
+                new_dirty &= src[k];
+                num_dirty += ctpopl(new_dirty);
+                src[k] = 0;
+            }
+        }
+    } else {
+        for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
+            if (cpu_physical_memory_get_dirty(start + addr,
+                                              TARGET_PAGE_SIZE,
+                                              DIRTY_MEMORY_MIGRATION)) {
+                cpu_physical_memory_reset_dirty(start + addr,
+                                                TARGET_PAGE_SIZE,
+                                                DIRTY_MEMORY_MIGRATION);
+                long k = (start + addr) >> TARGET_PAGE_BITS;
+                if (!test_and_set_bit(k, dest)) {
+                    num_dirty++;
+                }
+            }
+        }
+    }
+
+    return num_dirty;
+}
+
 #endif
 #endif
-- 
2.1.0

* [Qemu-devel] [RFC v2 5/6] memory: replace cpu_physical_memory_reset_dirty() with test-and-clear
  2014-12-02 11:23 [Qemu-devel] [RFC v2 0/6] memory: make dirty_memory[] accesses atomic Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 4/6] migration: move dirty bitmap sync to ram_addr.h Stefan Hajnoczi
@ 2014-12-02 11:23 ` Stefan Hajnoczi
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 6/6] memory: make cpu_physical_memory_sync_dirty_bitmap() fully atomic Stefan Hajnoczi
  5 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2014-12-02 11:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Dr. David Alan Gilbert, Stefan Hajnoczi,
	Peter Maydell

The cpu_physical_memory_reset_dirty() function is sometimes used
together with cpu_physical_memory_get_dirty().  This is not atomic since
two separate accesses to the dirty memory bitmap are made.

Turn cpu_physical_memory_reset_dirty() into the atomic
cpu_physical_memory_test_and_clear_dirty().

The cpu_physical_memory_clear_dirty_range() function is no longer used
and can be dropped.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 cputlb.c                |  4 ++--
 exec.c                  | 23 +++++++++++++++++------
 include/exec/ram_addr.h | 27 +++++++--------------------
 memory.c                | 11 ++++-------
 4 files changed, 30 insertions(+), 35 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index a55518a..3d5b9a7 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -125,8 +125,8 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
    can be detected */
 void tlb_protect_code(ram_addr_t ram_addr)
 {
-    cpu_physical_memory_reset_dirty(ram_addr, TARGET_PAGE_SIZE,
-                                    DIRTY_MEMORY_CODE);
+    cpu_physical_memory_test_and_clear_dirty(ram_addr, TARGET_PAGE_SIZE,
+                                             DIRTY_MEMORY_CODE);
 }
 
 /* update the TLB so that writes in physical page 'phys_addr' are no longer
diff --git a/exec.c b/exec.c
index 71ac104..72bdb2e 100644
--- a/exec.c
+++ b/exec.c
@@ -845,16 +845,27 @@ static void tlb_reset_dirty_range_all(ram_addr_t start, ram_addr_t length)
 }
 
 /* Note: start and end must be within the same ram block.  */
-void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t length,
-                                     unsigned client)
+bool cpu_physical_memory_test_and_clear_dirty(ram_addr_t start,
+                                              ram_addr_t length,
+                                              unsigned client)
 {
-    if (length == 0)
-        return;
-    cpu_physical_memory_clear_dirty_range(start, length, client);
+    unsigned long end, page;
+    bool dirty;
 
-    if (tcg_enabled()) {
+    if (length == 0) {
+        return false;
+    }
+
+    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+    page = start >> TARGET_PAGE_BITS;
+    dirty = bitmap_test_and_clear_atomic(ram_list.dirty_memory[client],
+                                         page, end - page);
+
+    if (dirty && tcg_enabled()) {
         tlb_reset_dirty_range_all(start, length);
     }
+
+    return dirty;
 }
 
 static void cpu_physical_memory_set_dirty_tracking(bool enable)
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 87a8b28..cdcbe9a 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -175,20 +175,9 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
 }
 #endif /* not _WIN32 */
 
-static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
-                                                         ram_addr_t length,
-                                                         unsigned client)
-{
-    unsigned long end, page;
-
-    assert(client < DIRTY_MEMORY_NUM);
-    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
-    page = start >> TARGET_PAGE_BITS;
-    bitmap_clear(ram_list.dirty_memory[client], page, end - page);
-}
-
-void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t length,
-                                     unsigned client);
+bool cpu_physical_memory_test_and_clear_dirty(ram_addr_t start,
+                                              ram_addr_t length,
+                                              unsigned client);
 
 static inline
 uint64_t cpu_physical_memory_sync_dirty_bitmap(unsigned long *dest,
@@ -217,12 +206,10 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(unsigned long *dest,
         }
     } else {
         for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
-            if (cpu_physical_memory_get_dirty(start + addr,
-                                              TARGET_PAGE_SIZE,
-                                              DIRTY_MEMORY_MIGRATION)) {
-                cpu_physical_memory_reset_dirty(start + addr,
-                                                TARGET_PAGE_SIZE,
-                                                DIRTY_MEMORY_MIGRATION);
+            if (cpu_physical_memory_test_and_clear_dirty(
+                        start + addr,
+                        TARGET_PAGE_SIZE,
+                        DIRTY_MEMORY_MIGRATION)) {
                 long k = (start + addr) >> TARGET_PAGE_BITS;
                 if (!test_and_set_bit(k, dest)) {
                     num_dirty++;
diff --git a/memory.c b/memory.c
index 15cf9eb..4d7761e 100644
--- a/memory.c
+++ b/memory.c
@@ -1375,13 +1375,9 @@ void memory_region_set_dirty(MemoryRegion *mr, hwaddr addr,
 bool memory_region_test_and_clear_dirty(MemoryRegion *mr, hwaddr addr,
                                         hwaddr size, unsigned client)
 {
-    bool ret;
     assert(mr->terminates);
-    ret = cpu_physical_memory_get_dirty(mr->ram_addr + addr, size, client);
-    if (ret) {
-        cpu_physical_memory_reset_dirty(mr->ram_addr + addr, size, client);
-    }
-    return ret;
+    return cpu_physical_memory_test_and_clear_dirty(mr->ram_addr + addr,
+                                                    size, client);
 }
 
 
@@ -1425,7 +1421,8 @@ void memory_region_reset_dirty(MemoryRegion *mr, hwaddr addr,
                                hwaddr size, unsigned client)
 {
     assert(mr->terminates);
-    cpu_physical_memory_reset_dirty(mr->ram_addr + addr, size, client);
+    cpu_physical_memory_test_and_clear_dirty(mr->ram_addr + addr, size,
+                                             client);
 }
 
 int memory_region_get_fd(MemoryRegion *mr)
-- 
2.1.0

* [Qemu-devel] [RFC v2 6/6] memory: make cpu_physical_memory_sync_dirty_bitmap() fully atomic
  2014-12-02 11:23 [Qemu-devel] [RFC v2 0/6] memory: make dirty_memory[] accesses atomic Stefan Hajnoczi
                   ` (4 preceding siblings ...)
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 5/6] memory: replace cpu_physical_memory_reset_dirty() with test-and-clear Stefan Hajnoczi
@ 2014-12-02 11:23 ` Stefan Hajnoczi
  5 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2014-12-02 11:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Dr. David Alan Gilbert, Stefan Hajnoczi,
	Peter Maydell

The fast path of cpu_physical_memory_sync_dirty_bitmap() directly
manipulates the dirty bitmap.  Use atomic_xchg() to make the
test-and-clear atomic.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/exec/ram_addr.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index cdcbe9a..3d2a4c1 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -195,13 +195,13 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(unsigned long *dest,
         unsigned long *src = ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION];
 
         for (k = page; k < page + nr; k++) {
-            if (src[k]) {
+            unsigned long bits = atomic_xchg(&src[k], 0);
+            if (bits) {
                 unsigned long new_dirty;
                 new_dirty = ~dest[k];
-                dest[k] |= src[k];
-                new_dirty &= src[k];
+                dest[k] |= bits;
+                new_dirty &= bits;
                 num_dirty += ctpopl(new_dirty);
-                src[k] = 0;
             }
         }
     } else {
-- 
2.1.0

* Re: [Qemu-devel] [RFC v2 2/6] bitmap: add atomic test and clear
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 2/6] bitmap: add atomic test and clear Stefan Hajnoczi
@ 2014-12-02 12:16   ` Paolo Bonzini
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2014-12-02 12:16 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel; +Cc: Peter Maydell, Dr. David Alan Gilbert



On 02/12/2014 12:23, Stefan Hajnoczi wrote:
> The new bitmap_test_and_clear_atomic() function clears a range and
> returns whether or not the bits were set.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  include/qemu/bitmap.h |  2 ++
>  util/bitmap.c         | 37 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 39 insertions(+)
> 
> diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
> index 3e0a4f3..86dd9cd 100644
> --- a/include/qemu/bitmap.h
> +++ b/include/qemu/bitmap.h
> @@ -41,6 +41,7 @@
>   * bitmap_set(dst, pos, nbits)			Set specified bit area
>   * bitmap_set_atomic(dst, pos, nbits)   Set specified bit area with atomic ops
>   * bitmap_clear(dst, pos, nbits)		Clear specified bit area
> + * bitmap_test_and_clear_atomic(dst, pos, nbits)    Test and clear area
>   * bitmap_find_next_zero_area(buf, len, pos, n, mask)	Find bit free area
>   */
>  
> @@ -229,6 +230,7 @@ static inline int bitmap_intersects(const unsigned long *src1,
>  void bitmap_set(unsigned long *map, long i, long len);
>  void bitmap_set_atomic(unsigned long *map, long i, long len);
>  void bitmap_clear(unsigned long *map, long start, long nr);
> +bool bitmap_test_and_clear_atomic(unsigned long *map, long start, long nr);
>  unsigned long bitmap_find_next_zero_area(unsigned long *map,
>                                           unsigned long size,
>                                           unsigned long start,
> diff --git a/util/bitmap.c b/util/bitmap.c
> index 40db0d9..d297066 100644
> --- a/util/bitmap.c
> +++ b/util/bitmap.c
> @@ -233,6 +233,43 @@ void bitmap_clear(unsigned long *map, long start, long nr)
>      }
>  }
>  
> +bool bitmap_test_and_clear_atomic(unsigned long *map, long start, long nr)
> +{
> +    unsigned long *p = map + BIT_WORD(start);
> +    const long size = start + nr;
> +    int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
> +    unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
> +    unsigned long dirty = 0;
> +    unsigned long old_bits;
> +
> +    /* First word */
> +    if (nr - bits_to_clear > 0) {
> +        old_bits = atomic_fetch_and(p, ~mask_to_clear);
> +        dirty |= old_bits & mask_to_clear;
> +        nr -= bits_to_clear;
> +        bits_to_clear = BITS_PER_LONG;
> +        mask_to_clear = ~0UL;
> +        p++;
> +    }
> +
> +    /* Full words */
> +    while (nr - bits_to_clear >= 0) {
> +        old_bits = atomic_xchg(p, 0);

Perhaps "if (!*p) continue;", and a smp_wmb or smp_mb at the end as in
patch 1?  That's at least what KVM does.

Paolo

> +        dirty |= old_bits;
> +        nr -= bits_to_clear;
> +        p++;
> +    }
> +
> +    /* Last word */
> +    if (nr) {
> +        mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
> +        old_bits = atomic_fetch_and(p, ~mask_to_clear);
> +        dirty |= old_bits & mask_to_clear;
> +    }
> +
> +    return dirty;
> +}
> +
>  #define ALIGN_MASK(x,mask)      (((x)+(mask))&~(mask))
>  
>  /**
> 
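
For concreteness, the full-words loop with the check Paolo suggests might
look roughly like the fragment below (untested sketch; note that a literal
"continue;" inside the existing while loop would skip the nr/p updates, so
the test is written as a guard instead):

    /* Full words: peek before the atomic op so that words which are
     * already clear do not pay for an atomic_xchg()
     */
    while (nr - bits_to_clear >= 0) {
        if (*p) {
            old_bits = atomic_xchg(p, 0);
            dirty |= old_bits;
        }
        nr -= bits_to_clear;
        p++;
    }

If every word turns out to be clear, no atomic operation is executed at
all, so the function would then also need a closing smp_mb()/smp_wmb() for
callers that rely on ordering, as suggested above.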

* Re: [Qemu-devel] [RFC v2 1/6] bitmap: add atomic set functions
  2014-12-02 11:23 ` [Qemu-devel] [RFC v2 1/6] bitmap: add atomic set functions Stefan Hajnoczi
@ 2014-12-02 12:28   ` Paolo Bonzini
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2014-12-02 12:28 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel; +Cc: Peter Maydell, Dr. David Alan Gilbert



On 02/12/2014 12:23, Stefan Hajnoczi wrote:
> Use atomic_or() for atomic bitmaps where several threads may set bits at
> the same time.  This avoids the race condition between threads loading
> an element, bitwise ORing, and then storing the element.
> 
> When setting all bits in a word we can avoid atomic ops and instead just
> use an smp_wmb() write barrier at the end.
> 
> Most bitmap users don't need atomicity so introduce new functions.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  include/qemu/bitmap.h |  2 ++
>  include/qemu/bitops.h | 14 ++++++++++++++
>  util/bitmap.c         | 36 ++++++++++++++++++++++++++++++++++++
>  3 files changed, 52 insertions(+)
> 
> diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
> index f0273c9..3e0a4f3 100644
> --- a/include/qemu/bitmap.h
> +++ b/include/qemu/bitmap.h
> @@ -39,6 +39,7 @@
>   * bitmap_empty(src, nbits)			Are all bits zero in *src?
>   * bitmap_full(src, nbits)			Are all bits set in *src?
>   * bitmap_set(dst, pos, nbits)			Set specified bit area
> + * bitmap_set_atomic(dst, pos, nbits)   Set specified bit area with atomic ops
>   * bitmap_clear(dst, pos, nbits)		Clear specified bit area
>   * bitmap_find_next_zero_area(buf, len, pos, n, mask)	Find bit free area
>   */
> @@ -226,6 +227,7 @@ static inline int bitmap_intersects(const unsigned long *src1,
>  }
>  
>  void bitmap_set(unsigned long *map, long i, long len);
> +void bitmap_set_atomic(unsigned long *map, long i, long len);
>  void bitmap_clear(unsigned long *map, long start, long nr);
>  unsigned long bitmap_find_next_zero_area(unsigned long *map,
>                                           unsigned long size,
> diff --git a/include/qemu/bitops.h b/include/qemu/bitops.h
> index 181bd46..eda4132 100644
> --- a/include/qemu/bitops.h
> +++ b/include/qemu/bitops.h
> @@ -16,6 +16,7 @@
>  #include <assert.h>
>  
>  #include "host-utils.h"
> +#include "atomic.h"
>  
>  #define BITS_PER_BYTE           CHAR_BIT
>  #define BITS_PER_LONG           (sizeof (unsigned long) * BITS_PER_BYTE)
> @@ -39,6 +40,19 @@ static inline void set_bit(long nr, unsigned long *addr)
>  }
>  
>  /**
> + * set_bit_atomic - Set a bit in memory atomically
> + * @nr: the bit to set
> + * @addr: the address to start counting from
> + */
> +static inline void set_bit_atomic(long nr, unsigned long *addr)
> +{
> +    unsigned long mask = BIT_MASK(nr);
> +    unsigned long *p = addr + BIT_WORD(nr);
> +
> +    atomic_or(p, mask);
> +}
> +
> +/**
>   * clear_bit - Clears a bit in memory
>   * @nr: Bit to clear
>   * @addr: Address to start counting from
> diff --git a/util/bitmap.c b/util/bitmap.c
> index 9c6bb52..40db0d9 100644
> --- a/util/bitmap.c
> +++ b/util/bitmap.c
> @@ -11,6 +11,7 @@
>  
>  #include "qemu/bitops.h"
>  #include "qemu/bitmap.h"
> +#include "qemu/atomic.h"
>  
>  /*
>   * bitmaps provide an array of bits, implemented using an an
> @@ -177,6 +178,41 @@ void bitmap_set(unsigned long *map, long start, long nr)
>      }
>  }
>  
> +void bitmap_set_atomic(unsigned long *map, long start, long nr)
> +{
> +    unsigned long *p = map + BIT_WORD(start);
> +    const long size = start + nr;
> +    int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
> +    unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
> +
> +    /* First word */
> +    if (nr - bits_to_set > 0) {
> +        atomic_or(p, mask_to_set);
> +        nr -= bits_to_set;
> +        bits_to_set = BITS_PER_LONG;
> +        p++;
> +    }
> +
> +    /* Full words */
> +    while (nr - bits_to_set >= 0) {
> +        *p = ~0UL;
> +        nr -= bits_to_set;
> +        p++;
> +    }
> +
> +    /* Last word */
> +    if (nr) {
> +        mask_to_set &= BITMAP_LAST_WORD_MASK(size);
> +        atomic_or(p, mask_to_set);
> +    } else {
> +        /* If we avoided the full barrier in atomic_or() we should at least
> +         * issue a write barrier so that other threads using barriers see
> +         * up-to-date bitmap contents.
> +         */
> +        smp_wmb();

Why not smp_mb() since that's what atomic_or does?

You can also ensure that you always wrap the loop with atomic_ors (or do
one if setting a single word).  I think it can be like this:

    if (!nr) {
        return;
    }

    /* Always do one atomic_or at the beginning, except if one word.  */
    if (nr >= bits_to_set) {
        ...
    }

    /* Full words, leave the last word aside */
    while (nr - bits_to_set > BITS_PER_LONG) {
    }

    /* Last word */
    assert(nr > 0);
    ...

but maybe I messed up and there's some off-by-one somewhere in this sketch.

Paolo

> +    }
> +}
> +
>  void bitmap_clear(unsigned long *map, long start, long nr)
>  {
>      unsigned long *p = map + BIT_WORD(start);
> 
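
Fleshed out, the restructuring Paolo outlines above might look roughly like
this (untested sketch; the boundary handling is a guess at what is meant,
not a tested replacement for the patch):

    void bitmap_set_atomic(unsigned long *map, long start, long nr)
    {
        unsigned long *p = map + BIT_WORD(start);
        const long size = start + nr;
        int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
        unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);

        if (!nr) {
            return;
        }

        /* First (possibly partial) word, unless the range fits entirely
         * in one word; always atomic, so the function is a full barrier
         * on entry.
         */
        if (nr > bits_to_set) {
            atomic_or(p, mask_to_set);
            nr -= bits_to_set;
            mask_to_set = ~0UL;
            p++;
        }

        /* Whole words, leaving the last word for the closing atomic_or() */
        while (nr > BITS_PER_LONG) {
            *p = ~0UL;
            nr -= BITS_PER_LONG;
            p++;
        }

        /* Last word: always atomic, so it doubles as the closing barrier */
        mask_to_set &= BITMAP_LAST_WORD_MASK(size);
        atomic_or(p, mask_to_set);
    }

If an atomic_or() is guaranteed at both ends, no code path is left that
needs the explicit smp_wmb()/smp_mb() at all.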
