* [Qemu-devel] [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker.
@ 2010-03-16 10:53 Yoshiaki Tamura
2010-03-16 10:53 ` [Qemu-devel] [PATCH 1/6] qemu-kvm: Introduce bit-based phys_ram_dirty for VGA, CODE and MIGRATION Yoshiaki Tamura
` (6 more replies)
0 siblings, 7 replies; 28+ messages in thread
From: Yoshiaki Tamura @ 2010-03-16 10:53 UTC (permalink / raw)
To: kvm, qemu-devel; +Cc: ohmura.kei, avi
The dirty and non-dirty pages are checked one by one in vl.c.
When the most of the memory is not dirty,
checking the dirty and non-dirty pages by multiple page size
should be much faster than checking them one by one.
We introduced bit-based phys_ram_dirty for VGA, CODE and MIGRATION, and
cpu_physical_memory_get_dirty_range() for this purpose.
This patch is based on the following discussion.
http://www.mail-archive.com/kvm@vger.kernel.org/msg28733.html
To prove our prospect, we have evaluated effect of this patch.
We compared runtime of ram_save_remaining with original
ram_save_remaining() and ram_save_remaining() using functions of this patch.
Test Environment:
CPU: 4x Intel Xeon Quad Core 2.66GHz
Mem size: 96GB
kvm version: 2.6.33
qemu-kvm version: commit 2b644fd0e737407133c88054ba498e772ce01f27
Host OS: CentOS (kernel 2.6.33)
Guest OS: Debian/GNU Linux lenny (kernel 2.6.26)
Guest Mem size: 512MB
Conditions of experiments are as follows:
Cond1: Guest OS periodically makes the 256MB continuous dirty pages.
Cond2: Guest OS periodically makes the 256MB dirty pages and non-dirty pages
in turn.
Cond3: Guest OS read 1GB file, which is bigger than memory.
Cond4: Guest OS write 1GB file, which is bigger than memory.
Experimental results:
Cond1: 1.9 ~ 61 times speed up
Cond2: 1.9 ~ 56 times speed up
Cond3: 1.9 ~ 59 times speed up
Cond4: 1.7 ~ 59 times speed up
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] [PATCH 1/6] qemu-kvm: Introduce bit-based phys_ram_dirty for VGA, CODE and MIGRATION.
2010-03-16 10:53 [Qemu-devel] [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker Yoshiaki Tamura
@ 2010-03-16 10:53 ` Yoshiaki Tamura
2010-03-16 12:26 ` [Qemu-devel] " Avi Kivity
2010-03-16 10:53 ` [Qemu-devel] [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty Yoshiaki Tamura
` (5 subsequent siblings)
6 siblings, 1 reply; 28+ messages in thread
From: Yoshiaki Tamura @ 2010-03-16 10:53 UTC (permalink / raw)
To: kvm, qemu-devel; +Cc: ohmura.kei, avi, Yoshiaki Tamura
Replaces byte-based phys_ram_dirty bitmap with
three bit-based phys_ram_dirty bitmap.
On allocation, it sets all bits in the bitmap.
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
exec.c | 22 +++++++++++++++++-----
1 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/exec.c b/exec.c
index 9bcb4de..ba334e7 100644
--- a/exec.c
+++ b/exec.c
@@ -119,7 +119,9 @@ uint8_t *code_gen_ptr;
#if !defined(CONFIG_USER_ONLY)
int phys_ram_fd;
-uint8_t *phys_ram_dirty;
+unsigned long *phys_ram_vga_dirty;
+unsigned long *phys_ram_code_dirty;
+unsigned long *phys_ram_migration_dirty;
uint8_t *bios_mem;
static int in_migration;
@@ -2659,10 +2661,20 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
new_block->next = ram_blocks;
ram_blocks = new_block;
- phys_ram_dirty = qemu_realloc(phys_ram_dirty,
- (last_ram_offset + size) >> TARGET_PAGE_BITS);
- memset(phys_ram_dirty + (last_ram_offset >> TARGET_PAGE_BITS),
- 0xff, size >> TARGET_PAGE_BITS);
+ if (BITMAP_SIZE(last_ram_offset + size) != BITMAP_SIZE(last_ram_offset)) {
+ phys_ram_vga_dirty = qemu_realloc(phys_ram_vga_dirty,
+ BITMAP_SIZE(last_ram_offset + size));
+ phys_ram_code_dirty = qemu_realloc(phys_ram_code_dirty,
+ BITMAP_SIZE(last_ram_offset + size));
+ phys_ram_migration_dirty = qemu_realloc(phys_ram_migration_dirty,
+ BITMAP_SIZE(last_ram_offset + size));
+ memset((uint8_t *)phys_ram_vga_dirty +
+ BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
+ memset((uint8_t *)phys_ram_code_dirty +
+ BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
+ memset((uint8_t *)phys_ram_migration_dirty +
+ BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
+ }
last_ram_offset += size;
--
1.7.0.31.g1df487
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [Qemu-devel] [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 10:53 [Qemu-devel] [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker Yoshiaki Tamura
2010-03-16 10:53 ` [Qemu-devel] [PATCH 1/6] qemu-kvm: Introduce bit-based phys_ram_dirty for VGA, CODE and MIGRATION Yoshiaki Tamura
@ 2010-03-16 10:53 ` Yoshiaki Tamura
2010-03-16 12:45 ` [Qemu-devel] " Avi Kivity
2010-03-16 10:53 ` [Qemu-devel] [PATCH 3/6] qemu-kvm: Replace direct phys_ram_dirty access with wrapper functions Yoshiaki Tamura
` (4 subsequent siblings)
6 siblings, 1 reply; 28+ messages in thread
From: Yoshiaki Tamura @ 2010-03-16 10:53 UTC (permalink / raw)
To: kvm, qemu-devel; +Cc: ohmura.kei, avi, Yoshiaki Tamura
Modifies wrapper functions for byte-based phys_ram_dirty bitmap to
bit-based phys_ram_dirty bitmap, and adds more wrapper functions to prevent
direct access to the phys_ram_dirty bitmap.
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
cpu-all.h | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 files changed, 90 insertions(+), 4 deletions(-)
diff --git a/cpu-all.h b/cpu-all.h
index 9bc01b9..91ec3e5 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -843,7 +843,9 @@ int cpu_str_to_log_mask(const char *str);
/* memory API */
extern int phys_ram_fd;
-extern uint8_t *phys_ram_dirty;
+extern unsigned long *phys_ram_vga_dirty;
+extern unsigned long *phys_ram_code_dirty;
+extern unsigned long *phys_ram_migration_dirty;
extern ram_addr_t ram_size;
extern ram_addr_t last_ram_offset;
extern uint8_t *bios_mem;
@@ -879,20 +881,104 @@ int cpu_memory_rw_debug(CPUState *env, target_ulong addr,
/* read dirty bit (return 0 or 1) */
static inline int cpu_physical_memory_is_dirty(ram_addr_t addr)
{
- return phys_ram_dirty[addr >> TARGET_PAGE_BITS] == 0xff;
+ unsigned long mask;
+ int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+ int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+
+ mask = 1UL << offset;
+ return (phys_ram_vga_dirty[index] &
+ phys_ram_code_dirty[index] &
+ phys_ram_migration_dirty[index] & mask) == mask;
+}
+
+static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
+{
+ unsigned long mask;
+ int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+ int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+ int ret = 0;
+
+ mask = 1UL << offset;
+ if (phys_ram_vga_dirty[index] & mask)
+ ret |= VGA_DIRTY_FLAG;
+ if (phys_ram_code_dirty[index] & mask)
+ ret |= CODE_DIRTY_FLAG;
+ if (phys_ram_migration_dirty[index] & mask)
+ ret |= MIGRATION_DIRTY_FLAG;
+
+ return ret;
}
static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
int dirty_flags)
{
- return phys_ram_dirty[addr >> TARGET_PAGE_BITS] & dirty_flags;
+ return cpu_physical_memory_get_dirty_flags(addr) & dirty_flags;
}
static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
{
- phys_ram_dirty[addr >> TARGET_PAGE_BITS] = 0xff;
+ unsigned long mask;
+ int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+ int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+
+ mask = 1UL << offset;
+ phys_ram_vga_dirty[index] |= mask;
+ phys_ram_code_dirty[index] |= mask;
+ phys_ram_migration_dirty[index] |= mask;
+}
+
+static inline void cpu_physical_memory_set_dirty_range(ram_addr_t addr,
+ unsigned long mask)
+{
+ int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+
+ phys_ram_vga_dirty[index] |= mask;
+ phys_ram_code_dirty[index] |= mask;
+ phys_ram_migration_dirty[index] |= mask;
}
+static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
+ int dirty_flags)
+{
+ unsigned long mask;
+ int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+ int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+
+ mask = 1UL << offset;
+ if (dirty_flags & VGA_DIRTY_FLAG)
+ phys_ram_vga_dirty[index] |= mask;
+ if (dirty_flags & CODE_DIRTY_FLAG)
+ phys_ram_code_dirty[index] |= mask;
+ if (dirty_flags & MIGRATION_DIRTY_FLAG)
+ phys_ram_migration_dirty[index] |= mask;
+}
+
+static inline void cpu_physical_memory_mask_dirty_range(ram_addr_t start,
+ int length,
+ int dirty_flags)
+{
+ ram_addr_t addr = start;
+ unsigned long mask;
+ int index, offset, i;
+
+ for (i = 0; i < length; i += TARGET_PAGE_SIZE) {
+ index = ((addr + i) >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+ offset = ((addr + i) >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+ mask = ~(1UL << offset);
+
+ if (dirty_flags & VGA_DIRTY_FLAG)
+ phys_ram_vga_dirty[index] &= mask;
+ if (dirty_flags & CODE_DIRTY_FLAG)
+ phys_ram_code_dirty[index] &= mask;
+ if (dirty_flags & MIGRATION_DIRTY_FLAG)
+ phys_ram_migration_dirty[index] &= mask;
+ }
+}
+
+int cpu_physical_memory_get_dirty_range(ram_addr_t start, ram_addr_t end,
+ ram_addr_t *dirty_rams, int length,
+ int dirty_flags);
+
void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
int dirty_flags);
void cpu_tlb_update_dirty(CPUState *env);
--
1.7.0.31.g1df487
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [Qemu-devel] [PATCH 3/6] qemu-kvm: Replace direct phys_ram_dirty access with wrapper functions.
2010-03-16 10:53 [Qemu-devel] [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker Yoshiaki Tamura
2010-03-16 10:53 ` [Qemu-devel] [PATCH 1/6] qemu-kvm: Introduce bit-based phys_ram_dirty for VGA, CODE and MIGRATION Yoshiaki Tamura
2010-03-16 10:53 ` [Qemu-devel] [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty Yoshiaki Tamura
@ 2010-03-16 10:53 ` Yoshiaki Tamura
2010-03-16 10:53 ` [Qemu-devel] [PATCH 4/6] qemu-kvm: Introduce cpu_physical_memory_get_dirty_range() Yoshiaki Tamura
` (3 subsequent siblings)
6 siblings, 0 replies; 28+ messages in thread
From: Yoshiaki Tamura @ 2010-03-16 10:53 UTC (permalink / raw)
To: kvm, qemu-devel; +Cc: ohmura.kei, avi, Yoshiaki Tamura
Replaces direct phys_ram_dirty access with wrapper functions to prevent
direct access to the phys_ram_dirty bitmap.
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
exec.c | 45 ++++++++++++++++++++-------------------------
1 files changed, 20 insertions(+), 25 deletions(-)
diff --git a/exec.c b/exec.c
index ba334e7..b31c349 100644
--- a/exec.c
+++ b/exec.c
@@ -1946,7 +1946,7 @@ static void tlb_protect_code(ram_addr_t ram_addr)
static void tlb_unprotect_code_phys(CPUState *env, ram_addr_t ram_addr,
target_ulong vaddr)
{
- phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] |= CODE_DIRTY_FLAG;
+ cpu_physical_memory_set_dirty_flags(ram_addr, CODE_DIRTY_FLAG);
}
static inline void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry,
@@ -1967,8 +1967,7 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
{
CPUState *env;
unsigned long length, start1;
- int i, mask, len;
- uint8_t *p;
+ int i;
start &= TARGET_PAGE_MASK;
end = TARGET_PAGE_ALIGN(end);
@@ -1976,11 +1975,7 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
length = end - start;
if (length == 0)
return;
- len = length >> TARGET_PAGE_BITS;
- mask = ~dirty_flags;
- p = phys_ram_dirty + (start >> TARGET_PAGE_BITS);
- for(i = 0; i < len; i++)
- p[i] &= mask;
+ cpu_physical_memory_mask_dirty_range(start, length, dirty_flags);
/* we modify the TLB cache so that the dirty bit will be set again
when accessing the range */
@@ -2837,16 +2832,16 @@ static void notdirty_mem_writeb(void *opaque, target_phys_addr_t ram_addr,
uint32_t val)
{
int dirty_flags;
- dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
+ dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
if (!(dirty_flags & CODE_DIRTY_FLAG)) {
#if !defined(CONFIG_USER_ONLY)
tb_invalidate_phys_page_fast(ram_addr, 1);
- dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
+ dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
#endif
}
stb_p(qemu_get_ram_ptr(ram_addr), val);
dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
- phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
+ cpu_physical_memory_set_dirty_flags(ram_addr, dirty_flags);
/* we remove the notdirty callback only if the code has been
flushed */
if (dirty_flags == 0xff)
@@ -2857,16 +2852,16 @@ static void notdirty_mem_writew(void *opaque, target_phys_addr_t ram_addr,
uint32_t val)
{
int dirty_flags;
- dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
+ dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
if (!(dirty_flags & CODE_DIRTY_FLAG)) {
#if !defined(CONFIG_USER_ONLY)
tb_invalidate_phys_page_fast(ram_addr, 2);
- dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
+ dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
#endif
}
stw_p(qemu_get_ram_ptr(ram_addr), val);
dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
- phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
+ cpu_physical_memory_set_dirty_flags(ram_addr, dirty_flags);
/* we remove the notdirty callback only if the code has been
flushed */
if (dirty_flags == 0xff)
@@ -2877,16 +2872,16 @@ static void notdirty_mem_writel(void *opaque, target_phys_addr_t ram_addr,
uint32_t val)
{
int dirty_flags;
- dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
+ dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
if (!(dirty_flags & CODE_DIRTY_FLAG)) {
#if !defined(CONFIG_USER_ONLY)
tb_invalidate_phys_page_fast(ram_addr, 4);
- dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
+ dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
#endif
}
stl_p(qemu_get_ram_ptr(ram_addr), val);
dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
- phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
+ cpu_physical_memory_set_dirty_flags(ram_addr, dirty_flags);
/* we remove the notdirty callback only if the code has been
flushed */
if (dirty_flags == 0xff)
@@ -3337,8 +3332,8 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
/* invalidate code */
tb_invalidate_phys_page_range(addr1, addr1 + l, 0);
/* set dirty bit */
- phys_ram_dirty[addr1 >> TARGET_PAGE_BITS] |=
- (0xff & ~CODE_DIRTY_FLAG);
+ cpu_physical_memory_set_dirty_flags(
+ addr1, (0xff & ~CODE_DIRTY_FLAG));
}
/* qemu doesn't execute guest code directly, but kvm does
therefore flush instruction caches */
@@ -3551,8 +3546,8 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
/* invalidate code */
tb_invalidate_phys_page_range(addr1, addr1 + l, 0);
/* set dirty bit */
- phys_ram_dirty[addr1 >> TARGET_PAGE_BITS] |=
- (0xff & ~CODE_DIRTY_FLAG);
+ cpu_physical_memory_set_dirty_flags(
+ addr1, (0xff & ~CODE_DIRTY_FLAG));
}
addr1 += l;
access_len -= l;
@@ -3688,8 +3683,8 @@ void stl_phys_notdirty(target_phys_addr_t addr, uint32_t val)
/* invalidate code */
tb_invalidate_phys_page_range(addr1, addr1 + 4, 0);
/* set dirty bit */
- phys_ram_dirty[addr1 >> TARGET_PAGE_BITS] |=
- (0xff & ~CODE_DIRTY_FLAG);
+ cpu_physical_memory_set_dirty_flags(
+ addr1, (0xff & ~CODE_DIRTY_FLAG));
}
}
}
@@ -3757,8 +3752,8 @@ void stl_phys(target_phys_addr_t addr, uint32_t val)
/* invalidate code */
tb_invalidate_phys_page_range(addr1, addr1 + 4, 0);
/* set dirty bit */
- phys_ram_dirty[addr1 >> TARGET_PAGE_BITS] |=
- (0xff & ~CODE_DIRTY_FLAG);
+ cpu_physical_memory_set_dirty_flags(addr1,
+ (0xff & ~CODE_DIRTY_FLAG));
}
}
}
--
1.7.0.31.g1df487
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [Qemu-devel] [PATCH 4/6] qemu-kvm: Introduce cpu_physical_memory_get_dirty_range().
2010-03-16 10:53 [Qemu-devel] [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker Yoshiaki Tamura
` (2 preceding siblings ...)
2010-03-16 10:53 ` [Qemu-devel] [PATCH 3/6] qemu-kvm: Replace direct phys_ram_dirty access with wrapper functions Yoshiaki Tamura
@ 2010-03-16 10:53 ` Yoshiaki Tamura
2010-03-16 12:47 ` [Qemu-devel] " Avi Kivity
2010-03-16 10:53 ` [Qemu-devel] [PATCH 5/6] qemu-kvm: Use cpu_physical_memory_set_dirty_range() to update phys_ram_dirty Yoshiaki Tamura
` (2 subsequent siblings)
6 siblings, 1 reply; 28+ messages in thread
From: Yoshiaki Tamura @ 2010-03-16 10:53 UTC (permalink / raw)
To: kvm, qemu-devel; +Cc: ohmura.kei, avi, Yoshiaki Tamura
Introduces cpu_physical_memory_get_dirty_range().
It checks the first row and puts dirty addr in the array.
If the first row is empty, it skips to the first non-dirty row
or the end addr, and put the length in the first entry of the array.
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
exec.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 73 insertions(+), 0 deletions(-)
diff --git a/exec.c b/exec.c
index b31c349..87056a6 100644
--- a/exec.c
+++ b/exec.c
@@ -1961,6 +1961,79 @@ static inline void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry,
}
}
+/* It checks the first row and puts dirty addrs in the array.
+ If the first row is empty, it skips to the first non-dirty row
+ or the end addr, and put the length in the first entry of the array. */
+int cpu_physical_memory_get_dirty_range(ram_addr_t start, ram_addr_t end,
+ ram_addr_t *dirty_rams, int length,
+ int dirty_flag)
+{
+ unsigned long phys_ram_dirty, page_number, *p;
+ ram_addr_t addr;
+ int s_idx = (start >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+ int e_idx = (end >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+ int i, j, offset;
+
+ switch (dirty_flag) {
+ case VGA_DIRTY_FLAG:
+ p = phys_ram_vga_dirty;
+ break;
+ case CODE_DIRTY_FLAG:
+ p = phys_ram_code_dirty;
+ break;
+ case MIGRATION_DIRTY_FLAG:
+ p = phys_ram_migration_dirty;
+ break;
+ default:
+ abort();
+ }
+
+ /* mask bits before the start addr */
+ offset = (start >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+ phys_ram_dirty = p[s_idx] & ~((1UL << offset) - 1);
+
+ if (s_idx == e_idx) {
+ /* mask bits after the end addr */
+ offset = (end >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+ phys_ram_dirty &= (1UL << offset) - 1;
+ }
+
+ if (phys_ram_dirty == 0) {
+ /* when the row is empty */
+ ram_addr_t skip;
+ if (s_idx == e_idx)
+ skip = end;
+ else {
+ /* skip empty rows */
+ while (s_idx < e_idx && p[++s_idx] == 0);
+ skip = (s_idx * HOST_LONG_BITS * TARGET_PAGE_SIZE);
+ }
+ dirty_rams[0] = skip - start;
+ i = 0;
+
+ } else if (phys_ram_dirty == ~0UL) {
+ /* when the row is fully dirtied */
+ addr = start;
+ for (i = 0; i < length; i++) {
+ dirty_rams[i] = addr;
+ addr += TARGET_PAGE_SIZE;
+ }
+ } else {
+ /* when the row is partially dirtied */
+ i = 0;
+ do {
+ j = ffsl(phys_ram_dirty) - 1;
+ phys_ram_dirty &= ~(1UL << j);
+ page_number = s_idx * HOST_LONG_BITS + j;
+ addr = page_number * TARGET_PAGE_SIZE;
+ dirty_rams[i] = addr;
+ i++;
+ } while (phys_ram_dirty != 0 && i < length);
+ }
+
+ return i;
+}
+
/* Note: start and end must be within the same ram block. */
void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
int dirty_flags)
--
1.7.0.31.g1df487
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [Qemu-devel] [PATCH 5/6] qemu-kvm: Use cpu_physical_memory_set_dirty_range() to update phys_ram_dirty.
2010-03-16 10:53 [Qemu-devel] [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker Yoshiaki Tamura
` (3 preceding siblings ...)
2010-03-16 10:53 ` [Qemu-devel] [PATCH 4/6] qemu-kvm: Introduce cpu_physical_memory_get_dirty_range() Yoshiaki Tamura
@ 2010-03-16 10:53 ` Yoshiaki Tamura
2010-03-16 10:53 ` [Qemu-devel] [PATCH 6/6] qemu-kvm: Use cpu_physical_memory_get_dirty_range() to check multiple dirty pages Yoshiaki Tamura
2010-03-16 13:11 ` [Qemu-devel] Re: [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker Avi Kivity
6 siblings, 0 replies; 28+ messages in thread
From: Yoshiaki Tamura @ 2010-03-16 10:53 UTC (permalink / raw)
To: kvm, qemu-devel; +Cc: ohmura.kei, avi, Yoshiaki Tamura
Modifies kvm_get_dirty_pages_log_range to use
cpu_physical_memory_set_dirty_range() to update the row of
the bit-based phys_ram_dirty bitmap at once.
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
qemu-kvm.c | 19 ++++++-------------
1 files changed, 6 insertions(+), 13 deletions(-)
diff --git a/qemu-kvm.c b/qemu-kvm.c
index e417f21..75fa9b0 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -2305,9 +2305,8 @@ static int kvm_get_dirty_pages_log_range(unsigned long start_addr,
unsigned long offset,
unsigned long mem_size)
{
- unsigned int i, j;
- unsigned long page_number, addr, addr1, c;
- ram_addr_t ram_addr;
+ unsigned int i;
+ unsigned long page_number, addr, addr1;
unsigned int len = ((mem_size / TARGET_PAGE_SIZE) + HOST_LONG_BITS - 1) /
HOST_LONG_BITS;
@@ -2317,16 +2316,10 @@ static int kvm_get_dirty_pages_log_range(unsigned long start_addr,
*/
for (i = 0; i < len; i++) {
if (bitmap[i] != 0) {
- c = leul_to_cpu(bitmap[i]);
- do {
- j = ffsl(c) - 1;
- c &= ~(1ul << j);
- page_number = i * HOST_LONG_BITS + j;
- addr1 = page_number * TARGET_PAGE_SIZE;
- addr = offset + addr1;
- ram_addr = cpu_get_physical_page_desc(addr);
- cpu_physical_memory_set_dirty(ram_addr);
- } while (c != 0);
+ page_number = i * HOST_LONG_BITS;
+ addr1 = page_number * TARGET_PAGE_SIZE;
+ addr = offset + addr1;
+ cpu_physical_memory_set_dirty_range(addr, leul_to_cpu(bitmap[i]));
}
}
return 0;
--
1.7.0.31.g1df487
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [Qemu-devel] [PATCH 6/6] qemu-kvm: Use cpu_physical_memory_get_dirty_range() to check multiple dirty pages.
2010-03-16 10:53 [Qemu-devel] [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker Yoshiaki Tamura
` (4 preceding siblings ...)
2010-03-16 10:53 ` [Qemu-devel] [PATCH 5/6] qemu-kvm: Use cpu_physical_memory_set_dirty_range() to update phys_ram_dirty Yoshiaki Tamura
@ 2010-03-16 10:53 ` Yoshiaki Tamura
2010-03-16 13:11 ` [Qemu-devel] Re: [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker Avi Kivity
6 siblings, 0 replies; 28+ messages in thread
From: Yoshiaki Tamura @ 2010-03-16 10:53 UTC (permalink / raw)
To: kvm, qemu-devel; +Cc: ohmura.kei, avi, Yoshiaki Tamura
Modifies ram_save_block() and ram_save_remaining() to use
cpu_physical_memory_get_dirty_range() to check multiple dirty and non-dirty
pages at once.
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
vl.c | 55 +++++++++++++++++++++++++++++++++++--------------------
1 files changed, 35 insertions(+), 20 deletions(-)
diff --git a/vl.c b/vl.c
index 6e35cc6..e9ad7c9 100644
--- a/vl.c
+++ b/vl.c
@@ -2779,7 +2779,8 @@ static int ram_save_block(QEMUFile *f)
static ram_addr_t current_addr = 0;
ram_addr_t saved_addr = current_addr;
ram_addr_t addr = 0;
- int found = 0;
+ ram_addr_t dirty_rams[HOST_LONG_BITS];
+ int i, found = 0;
while (addr < last_ram_offset) {
if (kvm_enabled() && current_addr == 0) {
@@ -2791,28 +2792,35 @@ static int ram_save_block(QEMUFile *f)
return 0;
}
}
- if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) {
+ if ((found = cpu_physical_memory_get_dirty_range(
+ current_addr, last_ram_offset, dirty_rams, HOST_LONG_BITS,
+ MIGRATION_DIRTY_FLAG))) {
uint8_t *p;
- cpu_physical_memory_reset_dirty(current_addr,
- current_addr + TARGET_PAGE_SIZE,
- MIGRATION_DIRTY_FLAG);
+ for (i = 0; i < found; i++) {
+ ram_addr_t page_addr = dirty_rams[i];
+ cpu_physical_memory_reset_dirty(page_addr,
+ page_addr + TARGET_PAGE_SIZE,
+ MIGRATION_DIRTY_FLAG);
- p = qemu_get_ram_ptr(current_addr);
+ p = qemu_get_ram_ptr(page_addr);
- if (is_dup_page(p, *p)) {
- qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_COMPRESS);
- qemu_put_byte(f, *p);
- } else {
- qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_PAGE);
- qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+ if (is_dup_page(p, *p)) {
+ qemu_put_be64(f, (page_addr) |
+ RAM_SAVE_FLAG_COMPRESS);
+ qemu_put_byte(f, *p);
+ } else {
+ qemu_put_be64(f, (page_addr) |
+ RAM_SAVE_FLAG_PAGE);
+ qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+ }
}
-
- found = 1;
+
break;
+ } else {
+ addr += dirty_rams[0];
+ current_addr = (saved_addr + addr) % last_ram_offset;
}
- addr += TARGET_PAGE_SIZE;
- current_addr = (saved_addr + addr) % last_ram_offset;
}
return found;
@@ -2822,12 +2830,19 @@ static uint64_t bytes_transferred;
static ram_addr_t ram_save_remaining(void)
{
- ram_addr_t addr;
+ ram_addr_t addr = 0;
ram_addr_t count = 0;
+ ram_addr_t dirty_rams[HOST_LONG_BITS];
+ int found = 0;
- for (addr = 0; addr < last_ram_offset; addr += TARGET_PAGE_SIZE) {
- if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG))
- count++;
+ while (addr < last_ram_offset) {
+ if ((found = cpu_physical_memory_get_dirty_range(addr, last_ram_offset,
+ dirty_rams, HOST_LONG_BITS, MIGRATION_DIRTY_FLAG))) {
+ count += found;
+ addr = dirty_rams[found - 1] + TARGET_PAGE_SIZE;
+ } else {
+ addr += dirty_rams[0];
+ }
}
return count;
--
1.7.0.31.g1df487
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 1/6] qemu-kvm: Introduce bit-based phys_ram_dirty for VGA, CODE and MIGRATION.
2010-03-16 10:53 ` [Qemu-devel] [PATCH 1/6] qemu-kvm: Introduce bit-based phys_ram_dirty for VGA, CODE and MIGRATION Yoshiaki Tamura
@ 2010-03-16 12:26 ` Avi Kivity
2010-03-16 13:01 ` Yoshiaki Tamura
0 siblings, 1 reply; 28+ messages in thread
From: Avi Kivity @ 2010-03-16 12:26 UTC (permalink / raw)
To: Yoshiaki Tamura; +Cc: ohmura.kei, qemu-devel, kvm
On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
> Replaces byte-based phys_ram_dirty bitmap with
> three bit-based phys_ram_dirty bitmap.
> On allocation, it sets all bits in the bitmap.
>
> Signed-off-by: Yoshiaki Tamura<tamura.yoshiaki@lab.ntt.co.jp>
> Signed-off-by: OHMURA Kei<ohmura.kei@lab.ntt.co.jp>
> ---
> exec.c | 22 +++++++++++++++++-----
> 1 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/exec.c b/exec.c
> index 9bcb4de..ba334e7 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -119,7 +119,9 @@ uint8_t *code_gen_ptr;
>
> #if !defined(CONFIG_USER_ONLY)
> int phys_ram_fd;
> -uint8_t *phys_ram_dirty;
> +unsigned long *phys_ram_vga_dirty;
> +unsigned long *phys_ram_code_dirty;
> +unsigned long *phys_ram_migration_dirty;
>
Would be nice to make this an array.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 10:53 ` [Qemu-devel] [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty Yoshiaki Tamura
@ 2010-03-16 12:45 ` Avi Kivity
2010-03-16 13:17 ` Yoshiaki Tamura
2010-03-16 13:35 ` Anthony Liguori
0 siblings, 2 replies; 28+ messages in thread
From: Avi Kivity @ 2010-03-16 12:45 UTC (permalink / raw)
To: Yoshiaki Tamura; +Cc: ohmura.kei, qemu-devel, kvm
On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
> Modifies wrapper functions for byte-based phys_ram_dirty bitmap to
> bit-based phys_ram_dirty bitmap, and adds more wrapper functions to prevent
> direct access to the phys_ram_dirty bitmap.
>
> +
> +static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
> +{
> + unsigned long mask;
> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
> + int ret = 0;
> +
> + mask = 1UL<< offset;
> + if (phys_ram_vga_dirty[index]& mask)
> + ret |= VGA_DIRTY_FLAG;
> + if (phys_ram_code_dirty[index]& mask)
> + ret |= CODE_DIRTY_FLAG;
> + if (phys_ram_migration_dirty[index]& mask)
> + ret |= MIGRATION_DIRTY_FLAG;
> +
> + return ret;
> }
>
> static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
> int dirty_flags)
> {
> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags;
> + return cpu_physical_memory_get_dirty_flags(addr)& dirty_flags;
> }
>
This turns one cacheline access into three. If the dirty bitmaps were
in an array, you could do
return dirty_bitmaps[dirty_index][addr >> (TARGET_PAGE_BITS +
BITS_IN_LONG)] & mask;
with one cacheline access.
>
> static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
> {
> - phys_ram_dirty[addr>> TARGET_PAGE_BITS] = 0xff;
> + unsigned long mask;
> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
> +
> + mask = 1UL<< offset;
> + phys_ram_vga_dirty[index] |= mask;
> + phys_ram_code_dirty[index] |= mask;
> + phys_ram_migration_dirty[index] |= mask;
> +}
>
This is also three cacheline accesses. I think we should have a master
bitmap which is updated by set_dirty(), and which is or'ed into the
other bitmaps when they are accessed. At least the vga and migration
bitmaps are only read periodically, not randomly, so this would be very
fast. In a way, this is similar to how the qemu bitmap is updated from
the kvm bitmap today.
I am not sure about the code bitmap though.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 4/6] qemu-kvm: Introduce cpu_physical_memory_get_dirty_range().
2010-03-16 10:53 ` [Qemu-devel] [PATCH 4/6] qemu-kvm: Introduce cpu_physical_memory_get_dirty_range() Yoshiaki Tamura
@ 2010-03-16 12:47 ` Avi Kivity
0 siblings, 0 replies; 28+ messages in thread
From: Avi Kivity @ 2010-03-16 12:47 UTC (permalink / raw)
To: Yoshiaki Tamura; +Cc: ohmura.kei, qemu-devel, kvm
On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
> Introduces cpu_physical_memory_get_dirty_range().
> It checks the first row and puts dirty addr in the array.
> If the first row is empty, it skips to the first non-dirty row
> or the end addr, and put the length in the first entry of the array.
>
>
>
> +/* It checks the first row and puts dirty addrs in the array.
> + If the first row is empty, it skips to the first non-dirty row
> + or the end addr, and put the length in the first entry of the array. */
> +int cpu_physical_memory_get_dirty_range(ram_addr_t start, ram_addr_t end,
> + ram_addr_t *dirty_rams, int length,
> + int dirty_flag)
> +{
> + unsigned long phys_ram_dirty, page_number, *p;
> + ram_addr_t addr;
> + int s_idx = (start>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
> + int e_idx = (end>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
> + int i, j, offset;
> +
> + switch (dirty_flag) {
> + case VGA_DIRTY_FLAG:
> + p = phys_ram_vga_dirty;
> + break;
> + case CODE_DIRTY_FLAG:
> + p = phys_ram_code_dirty;
> + break;
> + case MIGRATION_DIRTY_FLAG:
> + p = phys_ram_migration_dirty;
> + break;
> + default:
> + abort();
> + }
>
This bit would be improved by switching to an array of bitmaps.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 1/6] qemu-kvm: Introduce bit-based phys_ram_dirty for VGA, CODE and MIGRATION.
2010-03-16 12:26 ` [Qemu-devel] " Avi Kivity
@ 2010-03-16 13:01 ` Yoshiaki Tamura
2010-03-16 13:04 ` Avi Kivity
0 siblings, 1 reply; 28+ messages in thread
From: Yoshiaki Tamura @ 2010-03-16 13:01 UTC (permalink / raw)
To: Avi Kivity; +Cc: ohmura.kei, qemu-devel, kvm
Avi Kivity wrote:
> On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
>> Replaces byte-based phys_ram_dirty bitmap with
>> three bit-based phys_ram_dirty bitmap.
>> On allocation, it sets all bits in the bitmap.
>>
>> Signed-off-by: Yoshiaki Tamura<tamura.yoshiaki@lab.ntt.co.jp>
>> Signed-off-by: OHMURA Kei<ohmura.kei@lab.ntt.co.jp>
>> ---
>> exec.c | 22 +++++++++++++++++-----
>> 1 files changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/exec.c b/exec.c
>> index 9bcb4de..ba334e7 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -119,7 +119,9 @@ uint8_t *code_gen_ptr;
>>
>> #if !defined(CONFIG_USER_ONLY)
>> int phys_ram_fd;
>> -uint8_t *phys_ram_dirty;
>> +unsigned long *phys_ram_vga_dirty;
>> +unsigned long *phys_ram_code_dirty;
>> +unsigned long *phys_ram_migration_dirty;
>
> Would be nice to make this an array.
Thanks for pointing out.
I have a question regarding the index of the array.
From the compatibility perspective, I would prefer using the existing macros.
#define VGA_DIRTY_FLAG 0x01
#define CODE_DIRTY_FLAG 0x02
#define MIGRATION_DIRTY_FLAG 0x08
However, if I use them as is, I'll get a sparse array...
Is it acceptable to change these values like 0, 1, 2?
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 1/6] qemu-kvm: Introduce bit-based phys_ram_dirty for VGA, CODE and MIGRATION.
2010-03-16 13:01 ` Yoshiaki Tamura
@ 2010-03-16 13:04 ` Avi Kivity
0 siblings, 0 replies; 28+ messages in thread
From: Avi Kivity @ 2010-03-16 13:04 UTC (permalink / raw)
To: Yoshiaki Tamura; +Cc: ohmura.kei, qemu-devel, kvm
On 03/16/2010 03:01 PM, Yoshiaki Tamura wrote:
>>> -uint8_t *phys_ram_dirty;
>>> +unsigned long *phys_ram_vga_dirty;
>>> +unsigned long *phys_ram_code_dirty;
>>> +unsigned long *phys_ram_migration_dirty;
>>
>> Would be nice to make this an array.
>
>
> Thanks for pointing out.
> I have a question regarding the index of the array.
> From the compatibility perspective, I would prefer using the existing
> macros.
>
> #define VGA_DIRTY_FLAG 0x01
> #define CODE_DIRTY_FLAG 0x02
> #define MIGRATION_DIRTY_FLAG 0x08
>
> However, if I use them as is, I'll get a sparse array...
> Is it acceptable to change these values like 0, 1, 2?
Sure.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker.
2010-03-16 10:53 [Qemu-devel] [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker Yoshiaki Tamura
` (5 preceding siblings ...)
2010-03-16 10:53 ` [Qemu-devel] [PATCH 6/6] qemu-kvm: Use cpu_physical_memory_get_dirty_range() to check multiple dirty pages Yoshiaki Tamura
@ 2010-03-16 13:11 ` Avi Kivity
2010-03-16 13:41 ` Yoshiaki Tamura
6 siblings, 1 reply; 28+ messages in thread
From: Avi Kivity @ 2010-03-16 13:11 UTC (permalink / raw)
To: Yoshiaki Tamura; +Cc: ohmura.kei, qemu-devel, kvm
On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
> Experimental results:
> Cond1: 1.9 ~ 61 times speed up
> Cond2: 1.9 ~ 56 times speed up
> Cond3: 1.9 ~ 59 times speed up
> Cond4: 1.7 ~ 59 times speed up
>
Impressive results. What's the typical speedup? Closer to 1.9 or 61?
Note the issue with the cache accesses for set_dirty() is only
applicable to tcg, since kvm always updates the dirty bitmap in a batch
(well, I/O also updates the bitmap).
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 12:45 ` [Qemu-devel] " Avi Kivity
@ 2010-03-16 13:17 ` Yoshiaki Tamura
2010-03-16 13:29 ` Avi Kivity
2010-03-16 13:35 ` Anthony Liguori
1 sibling, 1 reply; 28+ messages in thread
From: Yoshiaki Tamura @ 2010-03-16 13:17 UTC (permalink / raw)
To: Avi Kivity; +Cc: ohmura.kei, qemu-devel, kvm
Avi Kivity wrote:
> On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
>> Modifies wrapper functions for byte-based phys_ram_dirty bitmap to
>> bit-based phys_ram_dirty bitmap, and adds more wrapper functions to
>> prevent
>> direct access to the phys_ram_dirty bitmap.
>
>> +
>> +static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
>> +{
>> + unsigned long mask;
>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>> + int ret = 0;
>> +
>> + mask = 1UL<< offset;
>> + if (phys_ram_vga_dirty[index]& mask)
>> + ret |= VGA_DIRTY_FLAG;
>> + if (phys_ram_code_dirty[index]& mask)
>> + ret |= CODE_DIRTY_FLAG;
>> + if (phys_ram_migration_dirty[index]& mask)
>> + ret |= MIGRATION_DIRTY_FLAG;
>> +
>> + return ret;
>> }
>>
>> static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
>> int dirty_flags)
>> {
>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags;
>> + return cpu_physical_memory_get_dirty_flags(addr)& dirty_flags;
>> }
>
> This turns one cacheline access into three. If the dirty bitmaps were in
> an array, you could do
>
> return dirty_bitmaps[dirty_index][addr >> (TARGET_PAGE_BITS +
> BITS_IN_LONG)] & mask;
>
> with one cacheline access.
If I'm understanding the existing code correctly,
int dirty_flags can be combined, like VGA + MIGRATION.
If we only have to worry about a single dirty flag, I agree with your idea.
On the other hand, qemu seems to require getting combined dirty flags.
If we introduce dirty bitmaps for each type, we need to access each bitmap to
get combined flags. I wasn't sure how to make this more efficient...
>> static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
>> {
>> - phys_ram_dirty[addr>> TARGET_PAGE_BITS] = 0xff;
>> + unsigned long mask;
>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>> +
>> + mask = 1UL<< offset;
>> + phys_ram_vga_dirty[index] |= mask;
>> + phys_ram_code_dirty[index] |= mask;
>> + phys_ram_migration_dirty[index] |= mask;
>> +}
>
> This is also three cacheline accesses. I think we should have a master
> bitmap which is updated by set_dirty(), and which is or'ed into the
> other bitmaps when they are accessed. At least the vga and migration
> bitmaps are only read periodically, not randomly, so this would be very
> fast. In a way, this is similar to how the qemu bitmap is updated from
> the kvm bitmap today.
Sounds good to me.
So we're going to introduce 4 (VGA, CODE, MIGRATION, master) bit-based bitmaps
in total.
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 13:17 ` Yoshiaki Tamura
@ 2010-03-16 13:29 ` Avi Kivity
2010-03-16 13:49 ` Yoshiaki Tamura
2010-03-16 13:51 ` Anthony Liguori
0 siblings, 2 replies; 28+ messages in thread
From: Avi Kivity @ 2010-03-16 13:29 UTC (permalink / raw)
To: Yoshiaki Tamura; +Cc: ohmura.kei, qemu-devel, kvm
On 03/16/2010 03:17 PM, Yoshiaki Tamura wrote:
> Avi Kivity wrote:
>> On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
>>> Modifies wrapper functions for byte-based phys_ram_dirty bitmap to
>>> bit-based phys_ram_dirty bitmap, and adds more wrapper functions to
>>> prevent
>>> direct access to the phys_ram_dirty bitmap.
>>
>>> +
>>> +static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
>>> +{
>>> + unsigned long mask;
>>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>>> + int ret = 0;
>>> +
>>> + mask = 1UL<< offset;
>>> + if (phys_ram_vga_dirty[index]& mask)
>>> + ret |= VGA_DIRTY_FLAG;
>>> + if (phys_ram_code_dirty[index]& mask)
>>> + ret |= CODE_DIRTY_FLAG;
>>> + if (phys_ram_migration_dirty[index]& mask)
>>> + ret |= MIGRATION_DIRTY_FLAG;
>>> +
>>> + return ret;
>>> }
>>>
>>> static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
>>> int dirty_flags)
>>> {
>>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags;
>>> + return cpu_physical_memory_get_dirty_flags(addr)& dirty_flags;
>>> }
>>
>> This turns one cacheline access into three. If the dirty bitmaps were in
>> an array, you could do
>>
>> return dirty_bitmaps[dirty_index][addr >> (TARGET_PAGE_BITS +
>> BITS_IN_LONG)] & mask;
>>
>> with one cacheline access.
>
> If I'm understanding the existing code correctly,
> int dirty_flags can be combined, like VGA + MIGRATION.
> If we only have to worry about a single dirty flag, I agree with your
> idea.
From a quick grep it seems flags are not combined, except for something
strange with CODE_DIRTY_FLAG:
> static void notdirty_mem_writel(void *opaque, target_phys_addr_t ram_addr,
> uint32_t val)
> {
> int dirty_flags;
> dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
> if (!(dirty_flags & CODE_DIRTY_FLAG)) {
> #if !defined(CONFIG_USER_ONLY)
> tb_invalidate_phys_page_fast(ram_addr, 4);
> dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
> #endif
> }
> stl_p(qemu_get_ram_ptr(ram_addr), val);
> dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
> phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
> /* we remove the notdirty callback only if the code has been
> flushed */
> if (dirty_flags == 0xff)
> tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr);
> }
I can't say I understand what it does.
>
> On the other hand, qemu seems to require getting combined dirty flags.
> If we introduce dirty bitmaps for each type, we need to access each
> bitmap to get combined flags. I wasn't sure how to make this more
> efficient...
>
>>> static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
>>> {
>>> - phys_ram_dirty[addr>> TARGET_PAGE_BITS] = 0xff;
>>> + unsigned long mask;
>>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>>> +
>>> + mask = 1UL<< offset;
>>> + phys_ram_vga_dirty[index] |= mask;
>>> + phys_ram_code_dirty[index] |= mask;
>>> + phys_ram_migration_dirty[index] |= mask;
>>> +}
>>
>> This is also three cacheline accesses. I think we should have a master
>> bitmap which is updated by set_dirty(), and which is or'ed into the
>> other bitmaps when they are accessed. At least the vga and migration
>> bitmaps are only read periodically, not randomly, so this would be very
>> fast. In a way, this is similar to how the qemu bitmap is updated from
>> the kvm bitmap today.
>
> Sounds good to me.
> So we're going to introduce 4 (VGA, CODE, MIGRATION, master) bit-based
> bitmaps in total.
>
Yeah, except CODE doesn't behave like the others. Would be best to
understand what it's requirements are before making the change. Maybe
CODE will need separate handling (so master will only feed VGA and
MIGRATION).
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 12:45 ` [Qemu-devel] " Avi Kivity
2010-03-16 13:17 ` Yoshiaki Tamura
@ 2010-03-16 13:35 ` Anthony Liguori
2010-03-16 22:50 ` Yoshiaki Tamura
1 sibling, 1 reply; 28+ messages in thread
From: Anthony Liguori @ 2010-03-16 13:35 UTC (permalink / raw)
To: Avi Kivity; +Cc: ohmura.kei, Yoshiaki Tamura, kvm, qemu-devel
On 03/16/2010 07:45 AM, Avi Kivity wrote:
> On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
>> Modifies wrapper functions for byte-based phys_ram_dirty bitmap to
>> bit-based phys_ram_dirty bitmap, and adds more wrapper functions to
>> prevent
>> direct access to the phys_ram_dirty bitmap.
>
>> +
>> +static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
>> +{
>> + unsigned long mask;
>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>> + int ret = 0;
>> +
>> + mask = 1UL<< offset;
>> + if (phys_ram_vga_dirty[index]& mask)
>> + ret |= VGA_DIRTY_FLAG;
>> + if (phys_ram_code_dirty[index]& mask)
>> + ret |= CODE_DIRTY_FLAG;
>> + if (phys_ram_migration_dirty[index]& mask)
>> + ret |= MIGRATION_DIRTY_FLAG;
>> +
>> + return ret;
>> }
>>
>> static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
>> int dirty_flags)
>> {
>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags;
>> + return cpu_physical_memory_get_dirty_flags(addr)& dirty_flags;
>> }
>
> This turns one cacheline access into three. If the dirty bitmaps were
> in an array, you could do
>
> return dirty_bitmaps[dirty_index][addr >> (TARGET_PAGE_BITS +
> BITS_IN_LONG)] & mask;
>
> with one cacheline access.
As far as I can tell, we only ever call with a single flag so your
suggestion makes sense.
I'd suggest introducing these functions before splitting the bitmap up.
It makes review a bit easier.
>>
>> static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
>> {
>> - phys_ram_dirty[addr>> TARGET_PAGE_BITS] = 0xff;
>> + unsigned long mask;
>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>> +
>> + mask = 1UL<< offset;
>> + phys_ram_vga_dirty[index] |= mask;
>> + phys_ram_code_dirty[index] |= mask;
>> + phys_ram_migration_dirty[index] |= mask;
>> +}
>
> This is also three cacheline accesses. I think we should have a
> master bitmap which is updated by set_dirty(), and which is or'ed into
> the other bitmaps when they are accessed. At least the vga and
> migration bitmaps are only read periodically, not randomly, so this
> would be very fast. In a way, this is similar to how the qemu bitmap
> is updated from the kvm bitmap today.
>
> I am not sure about the code bitmap though.
I think your suggestion makes sense and would also work for the code bitmap.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker.
2010-03-16 13:11 ` [Qemu-devel] Re: [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker Avi Kivity
@ 2010-03-16 13:41 ` Yoshiaki Tamura
0 siblings, 0 replies; 28+ messages in thread
From: Yoshiaki Tamura @ 2010-03-16 13:41 UTC (permalink / raw)
To: Avi Kivity; +Cc: ohmura.kei, qemu-devel, kvm
Avi Kivity wrote:
> On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
>> Experimental results:
>> Cond1: 1.9 ~ 61 times speed up
>> Cond2: 1.9 ~ 56 times speed up
>> Cond3: 1.9 ~ 59 times speed up
>> Cond4: 1.7 ~ 59 times speed up
>
> Impressive results. What's the typical speedup? Closer to 1.9 or 61?
To be honest, I thought the result above was too vague...
The speed up grows when the number of dirty pages decreases.
Let me paste the snipped actual data measured during live migration on Cond1.
This result is measured with cpu_get_real_ticks(), so the values should be in
raw ticks.
135200 dirty pages: orig.2488419, bitbased.1251171, ratio.1.99
...
98346 dirty pages: orig.3580533, bitbased.1386918, ratio.2.58
...
54865 dirty pages: orig.4220865, bitbased.984924, ratio.4.29
...
27883 dirty pages: orig.4088970, bitbased.514602, ratio.7.95
...
11541 dirty pages: orig.3854277, bitbased.220410, ratio.17.49
...
8117 dirty pages: orig.4041765, bitbased.175446, ratio.23.04
3231 dirty pages: orig.3337083, bitbased.105921, ratio.31.51
2401 dirty pages: orig.4103469, bitbased.89406, ratio.45.90
1595 dirty pages: orig.4028949, bitbased.78570, ratio.51.28
756 dirty pages: orig.4036707, bitbased.67662, ratio.59.66
0 dirty pages: orig.3938085, bitbased.23634, ratio.166.63
0 dirty pages: orig.3968163, bitbased.23526, ratio.168.67
We didn't show the data for checking completely empty bitmap because it was too
fast and didn't wan't to get wrong impression.
> Note the issue with the cache accesses for set_dirty() is only
> applicable to tcg, since kvm always updates the dirty bitmap in a batch
> (well, I/O also updates the bitmap).
I understand.
I'm still concerned regarding the way of reseting the dirty bitmap.
I was thinking to reset them in a batch, but it seems difficult because of the
consistency with the tlb.
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 13:29 ` Avi Kivity
@ 2010-03-16 13:49 ` Yoshiaki Tamura
2010-03-16 13:51 ` Anthony Liguori
1 sibling, 0 replies; 28+ messages in thread
From: Yoshiaki Tamura @ 2010-03-16 13:49 UTC (permalink / raw)
To: Avi Kivity; +Cc: ohmura.kei, qemu-devel, kvm
Avi Kivity wrote:
> On 03/16/2010 03:17 PM, Yoshiaki Tamura wrote:
>> Avi Kivity wrote:
>>> On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
>>>> Modifies wrapper functions for byte-based phys_ram_dirty bitmap to
>>>> bit-based phys_ram_dirty bitmap, and adds more wrapper functions to
>>>> prevent
>>>> direct access to the phys_ram_dirty bitmap.
>>>
>>>> +
>>>> +static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
>>>> +{
>>>> + unsigned long mask;
>>>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>>>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>>>> + int ret = 0;
>>>> +
>>>> + mask = 1UL<< offset;
>>>> + if (phys_ram_vga_dirty[index]& mask)
>>>> + ret |= VGA_DIRTY_FLAG;
>>>> + if (phys_ram_code_dirty[index]& mask)
>>>> + ret |= CODE_DIRTY_FLAG;
>>>> + if (phys_ram_migration_dirty[index]& mask)
>>>> + ret |= MIGRATION_DIRTY_FLAG;
>>>> +
>>>> + return ret;
>>>> }
>>>>
>>>> static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
>>>> int dirty_flags)
>>>> {
>>>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags;
>>>> + return cpu_physical_memory_get_dirty_flags(addr)& dirty_flags;
>>>> }
>>>
>>> This turns one cacheline access into three. If the dirty bitmaps were in
>>> an array, you could do
>>>
>>> return dirty_bitmaps[dirty_index][addr >> (TARGET_PAGE_BITS +
>>> BITS_IN_LONG)] & mask;
>>>
>>> with one cacheline access.
>>
>> If I'm understanding the existing code correctly,
>> int dirty_flags can be combined, like VGA + MIGRATION.
>> If we only have to worry about a single dirty flag, I agree with your
>> idea.
>
> From a quick grep it seems flags are not combined, except for something
> strange with CODE_DIRTY_FLAG:
Thanks for checking out.
But the CODE_DIRTY_FLAG makes me really nervous...
>> static void notdirty_mem_writel(void *opaque, target_phys_addr_t
>> ram_addr,
>> uint32_t val)
>> {
>> int dirty_flags;
>> dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
>> if (!(dirty_flags & CODE_DIRTY_FLAG)) {
>> #if !defined(CONFIG_USER_ONLY)
>> tb_invalidate_phys_page_fast(ram_addr, 4);
>> dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
>> #endif
>> }
>> stl_p(qemu_get_ram_ptr(ram_addr), val);
>> dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
>> phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
>> /* we remove the notdirty callback only if the code has been
>> flushed */
>> if (dirty_flags == 0xff)
>> tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr);
>> }
>
> I can't say I understand what it does.
Me neither.
This the reason I had to take naive approach...
>> On the other hand, qemu seems to require getting combined dirty flags.
>> If we introduce dirty bitmaps for each type, we need to access each
>> bitmap to get combined flags. I wasn't sure how to make this more
>> efficient...
>>
>>>> static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
>>>> {
>>>> - phys_ram_dirty[addr>> TARGET_PAGE_BITS] = 0xff;
>>>> + unsigned long mask;
>>>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>>>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>>>> +
>>>> + mask = 1UL<< offset;
>>>> + phys_ram_vga_dirty[index] |= mask;
>>>> + phys_ram_code_dirty[index] |= mask;
>>>> + phys_ram_migration_dirty[index] |= mask;
>>>> +}
>>>
>>> This is also three cacheline accesses. I think we should have a master
>>> bitmap which is updated by set_dirty(), and which is or'ed into the
>>> other bitmaps when they are accessed. At least the vga and migration
>>> bitmaps are only read periodically, not randomly, so this would be very
>>> fast. In a way, this is similar to how the qemu bitmap is updated from
>>> the kvm bitmap today.
>>
>> Sounds good to me.
>> So we're going to introduce 4 (VGA, CODE, MIGRATION, master) bit-based
>> bitmaps in total.
>>
>
> Yeah, except CODE doesn't behave like the others. Would be best to
> understand what it's requirements are before making the change. Maybe
> CODE will need separate handling (so master will only feed VGA and
> MIGRATION).
After implementing this patch set, I thought separating the wrapper functions
for each dirty flag type might be an option. Unifying everything makes
inefficient here. But anyway, do you know somebody who has a strong insight on
this CODE_DIRTY_FLAG?
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 13:29 ` Avi Kivity
2010-03-16 13:49 ` Yoshiaki Tamura
@ 2010-03-16 13:51 ` Anthony Liguori
2010-03-16 13:57 ` Avi Kivity
1 sibling, 1 reply; 28+ messages in thread
From: Anthony Liguori @ 2010-03-16 13:51 UTC (permalink / raw)
To: Avi Kivity; +Cc: ohmura.kei, Yoshiaki Tamura, kvm, qemu-devel
On 03/16/2010 08:29 AM, Avi Kivity wrote:
> On 03/16/2010 03:17 PM, Yoshiaki Tamura wrote:
>> Avi Kivity wrote:
>>> On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
>>>> Modifies wrapper functions for byte-based phys_ram_dirty bitmap to
>>>> bit-based phys_ram_dirty bitmap, and adds more wrapper functions to
>>>> prevent
>>>> direct access to the phys_ram_dirty bitmap.
>>>
>>>> +
>>>> +static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t
>>>> addr)
>>>> +{
>>>> + unsigned long mask;
>>>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>>>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>>>> + int ret = 0;
>>>> +
>>>> + mask = 1UL<< offset;
>>>> + if (phys_ram_vga_dirty[index]& mask)
>>>> + ret |= VGA_DIRTY_FLAG;
>>>> + if (phys_ram_code_dirty[index]& mask)
>>>> + ret |= CODE_DIRTY_FLAG;
>>>> + if (phys_ram_migration_dirty[index]& mask)
>>>> + ret |= MIGRATION_DIRTY_FLAG;
>>>> +
>>>> + return ret;
>>>> }
>>>>
>>>> static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
>>>> int dirty_flags)
>>>> {
>>>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags;
>>>> + return cpu_physical_memory_get_dirty_flags(addr)& dirty_flags;
>>>> }
>>>
>>> This turns one cacheline access into three. If the dirty bitmaps
>>> were in
>>> an array, you could do
>>>
>>> return dirty_bitmaps[dirty_index][addr >> (TARGET_PAGE_BITS +
>>> BITS_IN_LONG)] & mask;
>>>
>>> with one cacheline access.
>>
>> If I'm understanding the existing code correctly,
>> int dirty_flags can be combined, like VGA + MIGRATION.
>> If we only have to worry about a single dirty flag, I agree with your
>> idea.
>
> From a quick grep it seems flags are not combined, except for
> something strange with CODE_DIRTY_FLAG:
>
>> static void notdirty_mem_writel(void *opaque, target_phys_addr_t
>> ram_addr,
>> uint32_t val)
>> {
>> int dirty_flags;
>> dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
>> if (!(dirty_flags & CODE_DIRTY_FLAG)) {
>> #if !defined(CONFIG_USER_ONLY)
>> tb_invalidate_phys_page_fast(ram_addr, 4);
>> dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
>> #endif
>> }
>> stl_p(qemu_get_ram_ptr(ram_addr), val);
>> dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
>> phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
>> /* we remove the notdirty callback only if the code has been
>> flushed */
>> if (dirty_flags == 0xff)
>> tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr);
>> }
>
> I can't say I understand what it does.
The semantics of CODE_DIRTY_FLAG are a little counter intuitive.
CODE_DIRTY_FLAG means that we know that something isn't code so writes
do not need checking for self modifying code.
notdirty_mem_write() is called for any ram that is in the virtual TLB
that has not been updated yet and once a write has occurred, we can
switch to faster access functions (provided we've invalidated any
translation blocks).
That's why the check is if (!(dirty_flags & CODE_DIRTY_FLAG)), if it
hasn't been set yet, we have to assume that it could be a TB so we need
to invalidate it. tb_invalidate_phys_page_fast() will set the
CODE_DIRTY_FLAG if no code is present in that memory area which is why
we fetch dirty_flags again.
We do the store, and then set the dirty bits to mark that the page is
now dirty taking care to not change the CODE_DIRTY_FLAG bit.
At the very end, we check to see if CODE_DIRTY_FLAG which indicates that
we no longer need to trap writes. If so, we call tlb_set_dirty() which
will ultimately remove the notdirty callback in favor of a faster access
mechanism.
With respect patch series, there should be no problem having a separate
code bitmap that gets updated along with a main bitmap provided that the
semantics of CODE_DIRTY_FLAG are preserved.
>> Sounds good to me.
>> So we're going to introduce 4 (VGA, CODE, MIGRATION, master)
>> bit-based bitmaps in total.
>>
>
> Yeah, except CODE doesn't behave like the others. Would be best to
> understand what it's requirements are before making the change. Maybe
> CODE will need separate handling (so master will only feed VGA and
> MIGRATION).
Generally speaking, cpu_physical_memory_set_dirty() is called by the
device model. Any writes by the device model that results in
self-modifying code are not going to have predictable semantics which is
why it can set CODE_DIRTY_FLAG.
CODE_DIRTY_FLAG doesn't need to get updated from a master bitmap. It
should be treated as a separate bitmap that is strictly dealt with by
the virtual TLB.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 13:51 ` Anthony Liguori
@ 2010-03-16 13:57 ` Avi Kivity
2010-03-16 14:50 ` Anthony Liguori
0 siblings, 1 reply; 28+ messages in thread
From: Avi Kivity @ 2010-03-16 13:57 UTC (permalink / raw)
To: Anthony Liguori; +Cc: ohmura.kei, Yoshiaki Tamura, kvm, qemu-devel
On 03/16/2010 03:51 PM, Anthony Liguori wrote:
> On 03/16/2010 08:29 AM, Avi Kivity wrote:
>> On 03/16/2010 03:17 PM, Yoshiaki Tamura wrote:
>>> Avi Kivity wrote:
>>>> On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
>>>>> Modifies wrapper functions for byte-based phys_ram_dirty bitmap to
>>>>> bit-based phys_ram_dirty bitmap, and adds more wrapper functions to
>>>>> prevent
>>>>> direct access to the phys_ram_dirty bitmap.
>>>>
>>>>> +
>>>>> +static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t
>>>>> addr)
>>>>> +{
>>>>> + unsigned long mask;
>>>>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>>>>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>>>>> + int ret = 0;
>>>>> +
>>>>> + mask = 1UL<< offset;
>>>>> + if (phys_ram_vga_dirty[index]& mask)
>>>>> + ret |= VGA_DIRTY_FLAG;
>>>>> + if (phys_ram_code_dirty[index]& mask)
>>>>> + ret |= CODE_DIRTY_FLAG;
>>>>> + if (phys_ram_migration_dirty[index]& mask)
>>>>> + ret |= MIGRATION_DIRTY_FLAG;
>>>>> +
>>>>> + return ret;
>>>>> }
>>>>>
>>>>> static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
>>>>> int dirty_flags)
>>>>> {
>>>>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags;
>>>>> + return cpu_physical_memory_get_dirty_flags(addr)& dirty_flags;
>>>>> }
>>>>
>>>> This turns one cacheline access into three. If the dirty bitmaps
>>>> were in
>>>> an array, you could do
>>>>
>>>> return dirty_bitmaps[dirty_index][addr >> (TARGET_PAGE_BITS +
>>>> BITS_IN_LONG)] & mask;
>>>>
>>>> with one cacheline access.
>>>
>>> If I'm understanding the existing code correctly,
>>> int dirty_flags can be combined, like VGA + MIGRATION.
>>> If we only have to worry about a single dirty flag, I agree with
>>> your idea.
>>
>> From a quick grep it seems flags are not combined, except for
>> something strange with CODE_DIRTY_FLAG:
>>
>>> static void notdirty_mem_writel(void *opaque, target_phys_addr_t
>>> ram_addr,
>>> uint32_t val)
>>> {
>>> int dirty_flags;
>>> dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
>>> if (!(dirty_flags & CODE_DIRTY_FLAG)) {
>>> #if !defined(CONFIG_USER_ONLY)
>>> tb_invalidate_phys_page_fast(ram_addr, 4);
>>> dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
>>> #endif
>>> }
>>> stl_p(qemu_get_ram_ptr(ram_addr), val);
>>> dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
>>> phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
>>> /* we remove the notdirty callback only if the code has been
>>> flushed */
>>> if (dirty_flags == 0xff)
>>> tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr);
>>> }
>>
>> I can't say I understand what it does.
>
> The semantics of CODE_DIRTY_FLAG are a little counter intuitive.
> CODE_DIRTY_FLAG means that we know that something isn't code so writes
> do not need checking for self modifying code.
So the hardware equivalent is, when the Instruction TLB loads a page
address, clear CODE_DIRTY_FLAG?
>
> notdirty_mem_write() is called for any ram that is in the virtual TLB
> that has not been updated yet and once a write has occurred, we can
> switch to faster access functions (provided we've invalidated any
> translation blocks).
>
> That's why the check is if (!(dirty_flags & CODE_DIRTY_FLAG)), if it
> hasn't been set yet, we have to assume that it could be a TB so we
> need to invalidate it. tb_invalidate_phys_page_fast() will set the
> CODE_DIRTY_FLAG if no code is present in that memory area which is why
> we fetch dirty_flags again.
Ok.
>
> We do the store, and then set the dirty bits to mark that the page is
> now dirty taking care to not change the CODE_DIRTY_FLAG bit.
>
> At the very end, we check to see if CODE_DIRTY_FLAG which indicates
> that we no longer need to trap writes. If so, we call tlb_set_dirty()
> which will ultimately remove the notdirty callback in favor of a
> faster access mechanism.
>
> With respect patch series, there should be no problem having a
> separate code bitmap that gets updated along with a main bitmap
> provided that the semantics of CODE_DIRTY_FLAG are preserved.
>
>>> Sounds good to me.
>>> So we're going to introduce 4 (VGA, CODE, MIGRATION, master)
>>> bit-based bitmaps in total.
>>>
>>
>> Yeah, except CODE doesn't behave like the others. Would be best to
>> understand what it's requirements are before making the change.
>> Maybe CODE will need separate handling (so master will only feed VGA
>> and MIGRATION).
>
> Generally speaking, cpu_physical_memory_set_dirty() is called by the
> device model. Any writes by the device model that results in
> self-modifying code are not going to have predictable semantics which
> is why it can set CODE_DIRTY_FLAG.
>
> CODE_DIRTY_FLAG doesn't need to get updated from a master bitmap. It
> should be treated as a separate bitmap that is strictly dealt with by
> the virtual TLB.
Thanks.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 13:57 ` Avi Kivity
@ 2010-03-16 14:50 ` Anthony Liguori
2010-03-16 20:10 ` Blue Swirl
0 siblings, 1 reply; 28+ messages in thread
From: Anthony Liguori @ 2010-03-16 14:50 UTC (permalink / raw)
To: Avi Kivity; +Cc: ohmura.kei, Yoshiaki Tamura, kvm, qemu-devel
On 03/16/2010 08:57 AM, Avi Kivity wrote:
> On 03/16/2010 03:51 PM, Anthony Liguori wrote:
>> On 03/16/2010 08:29 AM, Avi Kivity wrote:
>>> On 03/16/2010 03:17 PM, Yoshiaki Tamura wrote:
>>>> Avi Kivity wrote:
>>>>> On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
>>>>>> Modifies wrapper functions for byte-based phys_ram_dirty bitmap to
>>>>>> bit-based phys_ram_dirty bitmap, and adds more wrapper functions to
>>>>>> prevent
>>>>>> direct access to the phys_ram_dirty bitmap.
>>>>>
>>>>>> +
>>>>>> +static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t
>>>>>> addr)
>>>>>> +{
>>>>>> + unsigned long mask;
>>>>>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>>>>>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>>>>>> + int ret = 0;
>>>>>> +
>>>>>> + mask = 1UL<< offset;
>>>>>> + if (phys_ram_vga_dirty[index]& mask)
>>>>>> + ret |= VGA_DIRTY_FLAG;
>>>>>> + if (phys_ram_code_dirty[index]& mask)
>>>>>> + ret |= CODE_DIRTY_FLAG;
>>>>>> + if (phys_ram_migration_dirty[index]& mask)
>>>>>> + ret |= MIGRATION_DIRTY_FLAG;
>>>>>> +
>>>>>> + return ret;
>>>>>> }
>>>>>>
>>>>>> static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
>>>>>> int dirty_flags)
>>>>>> {
>>>>>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags;
>>>>>> + return cpu_physical_memory_get_dirty_flags(addr)& dirty_flags;
>>>>>> }
>>>>>
>>>>> This turns one cacheline access into three. If the dirty bitmaps
>>>>> were in
>>>>> an array, you could do
>>>>>
>>>>> return dirty_bitmaps[dirty_index][addr >> (TARGET_PAGE_BITS +
>>>>> BITS_IN_LONG)] & mask;
>>>>>
>>>>> with one cacheline access.
>>>>
>>>> If I'm understanding the existing code correctly,
>>>> int dirty_flags can be combined, like VGA + MIGRATION.
>>>> If we only have to worry about a single dirty flag, I agree with
>>>> your idea.
>>>
>>> From a quick grep it seems flags are not combined, except for
>>> something strange with CODE_DIRTY_FLAG:
>>>
>>>> static void notdirty_mem_writel(void *opaque, target_phys_addr_t
>>>> ram_addr,
>>>> uint32_t val)
>>>> {
>>>> int dirty_flags;
>>>> dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
>>>> if (!(dirty_flags & CODE_DIRTY_FLAG)) {
>>>> #if !defined(CONFIG_USER_ONLY)
>>>> tb_invalidate_phys_page_fast(ram_addr, 4);
>>>> dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
>>>> #endif
>>>> }
>>>> stl_p(qemu_get_ram_ptr(ram_addr), val);
>>>> dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
>>>> phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
>>>> /* we remove the notdirty callback only if the code has been
>>>> flushed */
>>>> if (dirty_flags == 0xff)
>>>> tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr);
>>>> }
>>>
>>> I can't say I understand what it does.
>>
>> The semantics of CODE_DIRTY_FLAG are a little counter intuitive.
>> CODE_DIRTY_FLAG means that we know that something isn't code so
>> writes do not need checking for self modifying code.
>
> So the hardware equivalent is, when the Instruction TLB loads a page
> address, clear CODE_DIRTY_FLAG?
Yes, and is what tlb_protect_code() does and it's called from
tb_alloc_page() which is what's code when a TB is created.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 14:50 ` Anthony Liguori
@ 2010-03-16 20:10 ` Blue Swirl
2010-03-16 22:31 ` Richard Henderson
2010-03-17 4:07 ` Avi Kivity
0 siblings, 2 replies; 28+ messages in thread
From: Blue Swirl @ 2010-03-16 20:10 UTC (permalink / raw)
To: Anthony Liguori; +Cc: qemu-devel, ohmura.kei, Avi Kivity, kvm, Yoshiaki Tamura
On 3/16/10, Anthony Liguori <anthony@codemonkey.ws> wrote:
> On 03/16/2010 08:57 AM, Avi Kivity wrote:
>
> > On 03/16/2010 03:51 PM, Anthony Liguori wrote:
> >
> > > On 03/16/2010 08:29 AM, Avi Kivity wrote:
> > >
> > > > On 03/16/2010 03:17 PM, Yoshiaki Tamura wrote:
> > > >
> > > > > Avi Kivity wrote:
> > > > >
> > > > > > On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
> > > > > >
> > > > > > > Modifies wrapper functions for byte-based phys_ram_dirty bitmap
> to
> > > > > > > bit-based phys_ram_dirty bitmap, and adds more wrapper functions
> to
> > > > > > > prevent
> > > > > > > direct access to the phys_ram_dirty bitmap.
> > > > > > >
> > > > > >
> > > > > >
> > > > > > > +
> > > > > > > +static inline int
> cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
> > > > > > > +{
> > > > > > > + unsigned long mask;
> > > > > > > + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
> > > > > > > + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
> > > > > > > + int ret = 0;
> > > > > > > +
> > > > > > > + mask = 1UL<< offset;
> > > > > > > + if (phys_ram_vga_dirty[index]& mask)
> > > > > > > + ret |= VGA_DIRTY_FLAG;
> > > > > > > + if (phys_ram_code_dirty[index]& mask)
> > > > > > > + ret |= CODE_DIRTY_FLAG;
> > > > > > > + if (phys_ram_migration_dirty[index]& mask)
> > > > > > > + ret |= MIGRATION_DIRTY_FLAG;
> > > > > > > +
> > > > > > > + return ret;
> > > > > > > }
> > > > > > >
> > > > > > > static inline int
> cpu_physical_memory_get_dirty(ram_addr_t addr,
> > > > > > > int dirty_flags)
> > > > > > > {
> > > > > > > - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags;
> > > > > > > + return
> cpu_physical_memory_get_dirty_flags(addr)& dirty_flags;
> > > > > > > }
> > > > > > >
> > > > > >
> > > > > > This turns one cacheline access into three. If the dirty bitmaps
> were in
> > > > > > an array, you could do
> > > > > >
> > > > > > return dirty_bitmaps[dirty_index][addr >>
> (TARGET_PAGE_BITS +
> > > > > > BITS_IN_LONG)] & mask;
> > > > > >
> > > > > > with one cacheline access.
> > > > > >
> > > > >
> > > > > If I'm understanding the existing code correctly,
> > > > > int dirty_flags can be combined, like VGA + MIGRATION.
> > > > > If we only have to worry about a single dirty flag, I agree with
> your idea.
> > > > >
> > > >
> > > > From a quick grep it seems flags are not combined, except for
> something strange with CODE_DIRTY_FLAG:
> > > >
> > > >
> > > > > static void notdirty_mem_writel(void *opaque,
> target_phys_addr_t ram_addr,
> > > > > uint32_t val)
> > > > > {
> > > > > int dirty_flags;
> > > > > dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
> > > > > if (!(dirty_flags & CODE_DIRTY_FLAG)) {
> > > > > #if !defined(CONFIG_USER_ONLY)
> > > > > tb_invalidate_phys_page_fast(ram_addr, 4);
> > > > > dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
> > > > > #endif
> > > > > }
> > > > > stl_p(qemu_get_ram_ptr(ram_addr), val);
> > > > > dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
> > > > > phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
> > > > > /* we remove the notdirty callback only if the code has been
> > > > > flushed */
> > > > > if (dirty_flags == 0xff)
> > > > > tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr);
> > > > > }
> > > > >
> > > >
> > > > I can't say I understand what it does.
> > > >
> > >
> > > The semantics of CODE_DIRTY_FLAG are a little counter intuitive.
> CODE_DIRTY_FLAG means that we know that something isn't code so writes do
> not need checking for self modifying code.
> > >
> >
> > So the hardware equivalent is, when the Instruction TLB loads a page
> address, clear CODE_DIRTY_FLAG?
> >
>
> Yes, and is what tlb_protect_code() does and it's called from
> tb_alloc_page() which is what's code when a TB is created.
Just a tangential note: a long time ago, I tried to disable self
modifying code detection for Sparc. On most RISC architectures, SMC
needs explicit flushing so in theory we need not track code memory
writes. However, during exceptions the translator needs to access the
original unmodified code that was used to generate the TB. But maybe
there are other ways to avoid SMC tracking, on x86 it's still needed
but I suppose SMC is pretty rare.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 20:10 ` Blue Swirl
@ 2010-03-16 22:31 ` Richard Henderson
2010-03-17 0:05 ` Paul Brook
2010-03-17 4:07 ` Avi Kivity
1 sibling, 1 reply; 28+ messages in thread
From: Richard Henderson @ 2010-03-16 22:31 UTC (permalink / raw)
To: Blue Swirl; +Cc: ohmura.kei, kvm, Yoshiaki Tamura, qemu-devel, Avi Kivity
On 03/16/2010 01:10 PM, Blue Swirl wrote:
> Just a tangential note: a long time ago, I tried to disable self
> modifying code detection for Sparc. On most RISC architectures, SMC
> needs explicit flushing so in theory we need not track code memory
> writes. However, during exceptions the translator needs to access the
> original unmodified code that was used to generate the TB. But maybe
> there are other ways to avoid SMC tracking, on x86 it's still needed
> but I suppose SMC is pretty rare.
True SMC is fairly rare, but the SMC checker triggers fairly often
on the PLT update during dynamic linking. Nearly all cpus (x86 being
the only exception I recall) needed to re-design their PLT format to
avoid this code update in order to support SELinux.
Where does the translator need access to this original code? I was
just thinking about this problem today, wondering how much overhead
there is with this SMC page protection thing.
r~
^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 13:35 ` Anthony Liguori
@ 2010-03-16 22:50 ` Yoshiaki Tamura
0 siblings, 0 replies; 28+ messages in thread
From: Yoshiaki Tamura @ 2010-03-16 22:50 UTC (permalink / raw)
To: Anthony Liguori; +Cc: ohmura.kei, Avi Kivity, kvm, qemu-devel
Anthony Liguori wrote:
> On 03/16/2010 07:45 AM, Avi Kivity wrote:
>> On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
>>> Modifies wrapper functions for byte-based phys_ram_dirty bitmap to
>>> bit-based phys_ram_dirty bitmap, and adds more wrapper functions to
>>> prevent
>>> direct access to the phys_ram_dirty bitmap.
>>
>>> +
>>> +static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
>>> +{
>>> + unsigned long mask;
>>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>>> + int ret = 0;
>>> +
>>> + mask = 1UL<< offset;
>>> + if (phys_ram_vga_dirty[index]& mask)
>>> + ret |= VGA_DIRTY_FLAG;
>>> + if (phys_ram_code_dirty[index]& mask)
>>> + ret |= CODE_DIRTY_FLAG;
>>> + if (phys_ram_migration_dirty[index]& mask)
>>> + ret |= MIGRATION_DIRTY_FLAG;
>>> +
>>> + return ret;
>>> }
>>>
>>> static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
>>> int dirty_flags)
>>> {
>>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags;
>>> + return cpu_physical_memory_get_dirty_flags(addr)& dirty_flags;
>>> }
>>
>> This turns one cacheline access into three. If the dirty bitmaps were
>> in an array, you could do
>>
>> return dirty_bitmaps[dirty_index][addr >> (TARGET_PAGE_BITS +
>> BITS_IN_LONG)] & mask;
>>
>> with one cacheline access.
>
> As far as I can tell, we only ever call with a single flag so your
> suggestion makes sense.
>
> I'd suggest introducing these functions before splitting the bitmap up.
> It makes review a bit easier.
Thanks for your advise.
I'll post the wrapper functions for existing byte-based phys_ram_dirty first.
Yoshi
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 22:31 ` Richard Henderson
@ 2010-03-17 0:05 ` Paul Brook
0 siblings, 0 replies; 28+ messages in thread
From: Paul Brook @ 2010-03-17 0:05 UTC (permalink / raw)
To: qemu-devel
Cc: ohmura.kei, kvm, Yoshiaki Tamura, Blue Swirl, Avi Kivity,
Richard Henderson
> Where does the translator need access to this original code? I was
> just thinking about this problem today, wondering how much overhead
> there is with this SMC page protection thing.
When an MMU fault occurs qemu re-translates the TB with additional annotations
to determine which guest instruction caused the fault.
See translate-all.c:cpu_restore_state().
Paul
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-16 20:10 ` Blue Swirl
2010-03-16 22:31 ` Richard Henderson
@ 2010-03-17 4:07 ` Avi Kivity
2010-03-17 16:06 ` Paul Brook
1 sibling, 1 reply; 28+ messages in thread
From: Avi Kivity @ 2010-03-17 4:07 UTC (permalink / raw)
To: Blue Swirl; +Cc: qemu-devel, ohmura.kei, Yoshiaki Tamura, kvm
On 03/16/2010 10:10 PM, Blue Swirl wrote:
>
>> Yes, and is what tlb_protect_code() does and it's called from
>> tb_alloc_page() which is what's code when a TB is created.
>>
> Just a tangential note: a long time ago, I tried to disable self
> modifying code detection for Sparc. On most RISC architectures, SMC
> needs explicit flushing so in theory we need not track code memory
> writes. However, during exceptions the translator needs to access the
> original unmodified code that was used to generate the TB. But maybe
> there are other ways to avoid SMC tracking, on x86 it's still needed
>
On x86 you're supposed to execute a serializing instruction (one of
INVD, INVEPT, INVLPG, INVVPID, LGDT, LIDT, LLDT, LTR, MOV (to control
register, with the exception of MOV CR8), MOV (to debug register),
WBINVD, WRMSR, CPUID, IRET, and RSM) before running modified code.
> but I suppose SMC is pretty rare.
>
Every time you demand load a code page from disk, you're running self
modifying code (though it usually doesn't exist in the tlb, so there's
no previous version that can cause trouble).
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-17 4:07 ` Avi Kivity
@ 2010-03-17 16:06 ` Paul Brook
2010-03-17 16:28 ` Avi Kivity
0 siblings, 1 reply; 28+ messages in thread
From: Paul Brook @ 2010-03-17 16:06 UTC (permalink / raw)
To: qemu-devel; +Cc: Blue Swirl, ohmura.kei, Avi Kivity, kvm, Yoshiaki Tamura
> On 03/16/2010 10:10 PM, Blue Swirl wrote:
> >> Yes, and is what tlb_protect_code() does and it's called from
> >> tb_alloc_page() which is what's code when a TB is created.
> >
> > Just a tangential note: a long time ago, I tried to disable self
> > modifying code detection for Sparc. On most RISC architectures, SMC
> > needs explicit flushing so in theory we need not track code memory
> > writes. However, during exceptions the translator needs to access the
> > original unmodified code that was used to generate the TB. But maybe
> > there are other ways to avoid SMC tracking, on x86 it's still needed
>
> On x86 you're supposed to execute a serializing instruction (one of
> INVD, INVEPT, INVLPG, INVVPID, LGDT, LIDT, LLDT, LTR, MOV (to control
> register, with the exception of MOV CR8), MOV (to debug register),
> WBINVD, WRMSR, CPUID, IRET, and RSM) before running modified code.
Last time I checked, a jump instruction was sufficient to ensure coherency
withing a core. Serializing instructions are only required for coherency
between cores on SMP systems.
QEMU effectively has a very large physically tagged icache[1] with very
expensive cache loads. AFAIK The only practical way to maintain that cache on
x86 targets is to do write snooping via dirty bits. On targets that mandate
explicit icache invalidation we might be able to get away with this, however I
doubt it actually gains you anything - a correctly written guest is going to
invalidate at least as much as we get from dirty tracking, and we still need
to provide correct behaviour when executing with cache disabled.
> > but I suppose SMC is pretty rare.
>
> Every time you demand load a code page from disk, you're running self
> modifying code (though it usually doesn't exist in the tlb, so there's
> no previous version that can cause trouble).
I think you're confusing TLB flushes with TB flushes.
Paul
[1] Even modern x86 only have relatively small icache. The large L2/L3 caches
aren't relevant as they are unified I/D caches.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty.
2010-03-17 16:06 ` Paul Brook
@ 2010-03-17 16:28 ` Avi Kivity
0 siblings, 0 replies; 28+ messages in thread
From: Avi Kivity @ 2010-03-17 16:28 UTC (permalink / raw)
To: Paul Brook; +Cc: Blue Swirl, ohmura.kei, qemu-devel, kvm, Yoshiaki Tamura
On 03/17/2010 06:06 PM, Paul Brook wrote:
>> On 03/16/2010 10:10 PM, Blue Swirl wrote:
>>
>>>> Yes, and is what tlb_protect_code() does and it's called from
>>>> tb_alloc_page() which is what's code when a TB is created.
>>>>
>>> Just a tangential note: a long time ago, I tried to disable self
>>> modifying code detection for Sparc. On most RISC architectures, SMC
>>> needs explicit flushing so in theory we need not track code memory
>>> writes. However, during exceptions the translator needs to access the
>>> original unmodified code that was used to generate the TB. But maybe
>>> there are other ways to avoid SMC tracking, on x86 it's still needed
>>>
>> On x86 you're supposed to execute a serializing instruction (one of
>> INVD, INVEPT, INVLPG, INVVPID, LGDT, LIDT, LLDT, LTR, MOV (to control
>> register, with the exception of MOV CR8), MOV (to debug register),
>> WBINVD, WRMSR, CPUID, IRET, and RSM) before running modified code.
>>
> Last time I checked, a jump instruction was sufficient to ensure coherency
> withing a core. Serializing instructions are only required for coherency
> between cores on SMP systems.
>
Yeah, the docs say either a jump or a serializing instruction is needed.
> QEMU effectively has a very large physically tagged icache[1] with very
> expensive cache loads. AFAIK The only practical way to maintain that cache on
> x86 targets is to do write snooping via dirty bits. On targets that mandate
> explicit icache invalidation we might be able to get away with this, however I
> doubt it actually gains you anything - a correctly written guest is going to
> invalidate at least as much as we get from dirty tracking, and we still need
> to provide correct behaviour when executing with cache disabled.
>
Agreed.
>
>>> but I suppose SMC is pretty rare.
>>>
>> Every time you demand load a code page from disk, you're running self
>> modifying code (though it usually doesn't exist in the tlb, so there's
>> no previous version that can cause trouble).
>>
> I think you're confusing TLB flushes with TB flushes.
>
No - my thinking was page fault, load page, invlpg, continue. But the
invlpg is unneeded, and "continue" has to include a jump anyway.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2010-03-17 16:28 UTC | newest]
Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-16 10:53 [Qemu-devel] [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker Yoshiaki Tamura
2010-03-16 10:53 ` [Qemu-devel] [PATCH 1/6] qemu-kvm: Introduce bit-based phys_ram_dirty for VGA, CODE and MIGRATION Yoshiaki Tamura
2010-03-16 12:26 ` [Qemu-devel] " Avi Kivity
2010-03-16 13:01 ` Yoshiaki Tamura
2010-03-16 13:04 ` Avi Kivity
2010-03-16 10:53 ` [Qemu-devel] [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty Yoshiaki Tamura
2010-03-16 12:45 ` [Qemu-devel] " Avi Kivity
2010-03-16 13:17 ` Yoshiaki Tamura
2010-03-16 13:29 ` Avi Kivity
2010-03-16 13:49 ` Yoshiaki Tamura
2010-03-16 13:51 ` Anthony Liguori
2010-03-16 13:57 ` Avi Kivity
2010-03-16 14:50 ` Anthony Liguori
2010-03-16 20:10 ` Blue Swirl
2010-03-16 22:31 ` Richard Henderson
2010-03-17 0:05 ` Paul Brook
2010-03-17 4:07 ` Avi Kivity
2010-03-17 16:06 ` Paul Brook
2010-03-17 16:28 ` Avi Kivity
2010-03-16 13:35 ` Anthony Liguori
2010-03-16 22:50 ` Yoshiaki Tamura
2010-03-16 10:53 ` [Qemu-devel] [PATCH 3/6] qemu-kvm: Replace direct phys_ram_dirty access with wrapper functions Yoshiaki Tamura
2010-03-16 10:53 ` [Qemu-devel] [PATCH 4/6] qemu-kvm: Introduce cpu_physical_memory_get_dirty_range() Yoshiaki Tamura
2010-03-16 12:47 ` [Qemu-devel] " Avi Kivity
2010-03-16 10:53 ` [Qemu-devel] [PATCH 5/6] qemu-kvm: Use cpu_physical_memory_set_dirty_range() to update phys_ram_dirty Yoshiaki Tamura
2010-03-16 10:53 ` [Qemu-devel] [PATCH 6/6] qemu-kvm: Use cpu_physical_memory_get_dirty_range() to check multiple dirty pages Yoshiaki Tamura
2010-03-16 13:11 ` [Qemu-devel] Re: [PATCH 0/6] qemu-kvm: Introduce bit-based phys_ram_dirty, and bit-based dirty page checker Avi Kivity
2010-03-16 13:41 ` Yoshiaki Tamura
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).