qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH 0/3] memory: an optimization
@ 2016-02-20  2:35 Gonglei
  2016-02-20  2:35 ` [Qemu-devel] [PATCH 1/3] exec: store RAMBlock pointer into memory region Gonglei
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Gonglei @ 2016-02-20  2:35 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, Gonglei, peter.huangpeng

perf top shows that qemu_get_ram_ptr consumes too many CPU cycles:
> 22.56%  qemu-kvm                 [.] address_space_translate
>  13.29%  qemu-kvm                 [.] qemu_get_ram_ptr
>   4.71%  qemu-kvm                 [.] phys_page_find
>   4.43%  qemu-kvm                 [.] address_space_translate_internal
>   3.47%  libpthread-2.19.so       [.] __pthread_mutex_unlock_usercnt
>   3.08%  qemu-kvm                 [.] qemu_ram_addr_from_host
>   2.62%  qemu-kvm                 [.] address_space_map
>   2.61%  libc-2.19.so             [.] _int_malloc
>   2.58%  libc-2.19.so             [.] _int_free
>   2.38%  libc-2.19.so             [.] malloc
>   2.06%  libpthread-2.19.so       [.] pthread_mutex_lock
>   1.68%  libc-2.19.so             [.] malloc_consolidate
>   1.35%  libc-2.19.so             [.] __memcpy_sse2_unaligned
>   1.23%  qemu-kvm                 [.] lduw_le_phys
>   1.18%  qemu-kvm                 [.] find_next_zero_bit
>   1.02%  qemu-kvm                 [.] object_unref

Paolo suggested that we can get rid of qemu_get_ram_ptr
by storing the RAMBlock pointer in the memory region
instead of the ram_addr_t value. After applying this change,
I indeed got much better performance.

BTW, patch 3 is an incidental find.

Gonglei (3):
  exec: store RAMBlock pointer into memory region
  memory: optimize qemu_get_ram_ptr and qemu_ram_ptr_length
  memory: Remove the superfluous code

 exec.c                | 48 ++++++++++++++++++++++++++++++------------------
 include/exec/memory.h |  7 +++----
 memory.c              |  3 ++-
 3 files changed, 35 insertions(+), 23 deletions(-)

-- 
1.8.5.2


* [Qemu-devel] [PATCH 1/3] exec: store RAMBlock pointer into memory region
  2016-02-20  2:35 [Qemu-devel] [PATCH 0/3] memory: an optimization Gonglei
@ 2016-02-20  2:35 ` Gonglei
  2016-02-22  2:45   ` Fam Zheng
  2016-02-20  2:35 ` [Qemu-devel] [PATCH 2/3] memory: optimize qemu_get_ram_ptr and qemu_ram_ptr_length Gonglei
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Gonglei @ 2016-02-20  2:35 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, Gonglei, peter.huangpeng

Each RAM memory region has a unique corresponding RAMBlock.
In the current implementation, the memory region only stores
the ram_addr, i.e. the offset into the RAM address space.
To get the RAMBlock, we have to query the global ram_list by
ram_addr, which is very expensive.

Now we store the RAMBlock pointer in the memory region
structure, so the RAMBlock can be obtained directly from
the mr.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 exec.c                | 2 ++
 include/exec/memory.h | 1 +
 memory.c              | 1 +
 3 files changed, 4 insertions(+)

diff --git a/exec.c b/exec.c
index 1f24500..e29e369 100644
--- a/exec.c
+++ b/exec.c
@@ -1717,6 +1717,8 @@ ram_addr_t qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
         error_propagate(errp, local_err);
         return -1;
     }
+    /* store the ram block pointer into memory region */
+    mr->ram_block = new_block;
     return addr;
 }
 
diff --git a/include/exec/memory.h b/include/exec/memory.h
index c92734a..23e2e3e 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -172,6 +172,7 @@ struct MemoryRegion {
     bool global_locking;
     uint8_t dirty_log_mask;
     ram_addr_t ram_addr;
+    void *ram_block;   /* RAMBlock pointer */
     Object *owner;
     const MemoryRegionIOMMUOps *iommu_ops;
 
diff --git a/memory.c b/memory.c
index 09041ed..b4451dd 100644
--- a/memory.c
+++ b/memory.c
@@ -912,6 +912,7 @@ void memory_region_init(MemoryRegion *mr,
     }
     mr->name = g_strdup(name);
     mr->owner = owner;
+    mr->ram_block = NULL;
 
     if (name) {
         char *escaped_name = memory_region_escape_name(name);
-- 
1.8.5.2
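
The idea behind this patch can be illustrated outside QEMU. The following is only a minimal sketch: DemoBlock, DemoRegion, demo_find_block and demo_register are hypothetical stand-ins, not QEMU's real RAMBlock, MemoryRegion or allocation code. It contrasts the list walk needed to find a block by address with the back pointer recorded once at allocation time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a RAM block kept in a global list. */
typedef struct DemoBlock {
    uint64_t offset;        /* start of the block in the flat RAM space */
    uint64_t max_length;    /* size of the block in bytes */
    struct DemoBlock *next; /* next block in the list */
} DemoBlock;

/* Hypothetical stand-in for a memory region backed by one block. */
typedef struct DemoRegion {
    uint64_t ram_addr;      /* offset of the region's RAM */
    DemoBlock *ram_block;   /* back pointer, set once at allocation */
} DemoRegion;

/* Slow path: walk the list until a block containing addr is found. */
DemoBlock *demo_find_block(DemoBlock *head, uint64_t addr)
{
    for (DemoBlock *b = head; b != NULL; b = b->next) {
        /* Unsigned wrap-around makes addr < offset fail this test. */
        if (addr - b->offset < b->max_length) {
            return b;
        }
    }
    return NULL;
}

/* Allocation time: link the block and remember it in the region. */
void demo_register(DemoBlock **head, DemoBlock *blk, DemoRegion *mr)
{
    blk->next = *head;
    *head = blk;
    mr->ram_addr = blk->offset;
    mr->ram_block = blk;
}
```

After demo_register(), a caller that already holds the region can read mr->ram_block directly instead of calling demo_find_block() with mr->ram_addr.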


* [Qemu-devel] [PATCH 2/3] memory: optimize qemu_get_ram_ptr and qemu_ram_ptr_length
  2016-02-20  2:35 [Qemu-devel] [PATCH 0/3] memory: an optimization Gonglei
  2016-02-20  2:35 ` [Qemu-devel] [PATCH 1/3] exec: store RAMBlock pointer into memory region Gonglei
@ 2016-02-20  2:35 ` Gonglei
  2016-02-20  2:35 ` [Qemu-devel] [PATCH 3/3] memory: Remove the superfluous code Gonglei
  2016-02-20  9:47 ` [Qemu-devel] [PATCH 0/3] memory: an optimization Paolo Bonzini
  3 siblings, 0 replies; 8+ messages in thread
From: Gonglei @ 2016-02-20  2:35 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, Gonglei, peter.huangpeng

These two functions spend too many CPU cycles finding
the RAMBlock by RAM address.

After this patch, we can pass the RAMBlock pointer
to them so that, most of the time, they no longer need
to look up the RAMBlock. This improves the performance
of address translation.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 exec.c                | 46 ++++++++++++++++++++++++++++------------------
 include/exec/memory.h |  4 ++--
 memory.c              |  2 +-
 3 files changed, 31 insertions(+), 21 deletions(-)

diff --git a/exec.c b/exec.c
index e29e369..f714238 100644
--- a/exec.c
+++ b/exec.c
@@ -1868,9 +1868,13 @@ void *qemu_get_ram_block_host_ptr(ram_addr_t addr)
  *
  * Called within RCU critical section.
  */
-void *qemu_get_ram_ptr(ram_addr_t addr)
+void *qemu_get_ram_ptr(RAMBlock *ram_block, ram_addr_t addr)
 {
-    RAMBlock *block = qemu_get_ram_block(addr);
+    RAMBlock *block = ram_block;
+
+    if (block == NULL) {
+        block = qemu_get_ram_block(addr);
+    }
 
     if (xen_enabled() && block->host == NULL) {
         /* We need to check if the requested address is in the RAM
@@ -1891,15 +1895,18 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
  *
  * Called within RCU critical section.
  */
-static void *qemu_ram_ptr_length(ram_addr_t addr, hwaddr *size)
+static void *qemu_ram_ptr_length(RAMBlock *ram_block, ram_addr_t addr,
+                                 hwaddr *size)
 {
-    RAMBlock *block;
+    RAMBlock *block = ram_block;
     ram_addr_t offset_inside_block;
     if (*size == 0) {
         return NULL;
     }
 
-    block = qemu_get_ram_block(addr);
+    if (block == NULL) {
+        block = qemu_get_ram_block(addr);
+    }
     offset_inside_block = addr - block->offset;
     *size = MIN(*size, block->max_length - offset_inside_block);
 
@@ -2027,13 +2034,13 @@ static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
     }
     switch (size) {
     case 1:
-        stb_p(qemu_get_ram_ptr(ram_addr), val);
+        stb_p(qemu_get_ram_ptr(NULL, ram_addr), val);
         break;
     case 2:
-        stw_p(qemu_get_ram_ptr(ram_addr), val);
+        stw_p(qemu_get_ram_ptr(NULL, ram_addr), val);
         break;
     case 4:
-        stl_p(qemu_get_ram_ptr(ram_addr), val);
+        stl_p(qemu_get_ram_ptr(NULL, ram_addr), val);
         break;
     default:
         abort();
@@ -2609,7 +2616,7 @@ static MemTxResult address_space_write_continue(AddressSpace *as, hwaddr addr,
         } else {
             addr1 += memory_region_get_ram_addr(mr);
             /* RAM case */
-            ptr = qemu_get_ram_ptr(addr1);
+            ptr = qemu_get_ram_ptr(mr->ram_block, addr1);
             memcpy(ptr, buf, l);
             invalidate_and_set_dirty(mr, addr1, l);
         }
@@ -2700,7 +2707,7 @@ MemTxResult address_space_read_continue(AddressSpace *as, hwaddr addr,
             }
         } else {
             /* RAM case */
-            ptr = qemu_get_ram_ptr(mr->ram_addr + addr1);
+            ptr = qemu_get_ram_ptr(mr->ram_block, mr->ram_addr + addr1);
             memcpy(buf, ptr, l);
         }
 
@@ -2785,7 +2792,7 @@ static inline void cpu_physical_memory_write_rom_internal(AddressSpace *as,
         } else {
             addr1 += memory_region_get_ram_addr(mr);
             /* ROM/RAM case */
-            ptr = qemu_get_ram_ptr(addr1);
+            ptr = qemu_get_ram_ptr(mr->ram_block, addr1);
             switch (type) {
             case WRITE_DATA:
                 memcpy(ptr, buf, l);
@@ -2997,7 +3004,7 @@ void *address_space_map(AddressSpace *as,
 
     memory_region_ref(mr);
     *plen = done;
-    ptr = qemu_ram_ptr_length(raddr + base, plen);
+    ptr = qemu_ram_ptr_length(mr->ram_block, raddr + base, plen);
     rcu_read_unlock();
 
     return ptr;
@@ -3081,7 +3088,8 @@ static inline uint32_t address_space_ldl_internal(AddressSpace *as, hwaddr addr,
 #endif
     } else {
         /* RAM case */
-        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(mr)
+        ptr = qemu_get_ram_ptr(mr->ram_block,
+                               (memory_region_get_ram_addr(mr)
                                 & TARGET_PAGE_MASK)
                                + addr1);
         switch (endian) {
@@ -3176,7 +3184,8 @@ static inline uint64_t address_space_ldq_internal(AddressSpace *as, hwaddr addr,
 #endif
     } else {
         /* RAM case */
-        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(mr)
+        ptr = qemu_get_ram_ptr(mr->ram_block,
+                               (memory_region_get_ram_addr(mr)
                                 & TARGET_PAGE_MASK)
                                + addr1);
         switch (endian) {
@@ -3291,7 +3300,8 @@ static inline uint32_t address_space_lduw_internal(AddressSpace *as,
 #endif
     } else {
         /* RAM case */
-        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(mr)
+        ptr = qemu_get_ram_ptr(mr->ram_block,
+                               (memory_region_get_ram_addr(mr)
                                 & TARGET_PAGE_MASK)
                                + addr1);
         switch (endian) {
@@ -3376,7 +3386,7 @@ void address_space_stl_notdirty(AddressSpace *as, hwaddr addr, uint32_t val,
         r = memory_region_dispatch_write(mr, addr1, val, 4, attrs);
     } else {
         addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
-        ptr = qemu_get_ram_ptr(addr1);
+        ptr = qemu_get_ram_ptr(mr->ram_block, addr1);
         stl_p(ptr, val);
 
         dirty_log_mask = memory_region_get_dirty_log_mask(mr);
@@ -3431,7 +3441,7 @@ static inline void address_space_stl_internal(AddressSpace *as,
     } else {
         /* RAM case */
         addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
-        ptr = qemu_get_ram_ptr(addr1);
+        ptr = qemu_get_ram_ptr(mr->ram_block, addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
             stl_le_p(ptr, val);
@@ -3541,7 +3551,7 @@ static inline void address_space_stw_internal(AddressSpace *as,
     } else {
         /* RAM case */
         addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
-        ptr = qemu_get_ram_ptr(addr1);
+        ptr = qemu_get_ram_ptr(mr->ram_block, addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
             stw_le_p(ptr, val);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 23e2e3e..227fbf4 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1390,7 +1390,7 @@ MemTxResult address_space_read_continue(AddressSpace *as, hwaddr addr,
 					MemoryRegion *mr);
 MemTxResult address_space_read_full(AddressSpace *as, hwaddr addr,
                                     MemTxAttrs attrs, uint8_t *buf, int len);
-void *qemu_get_ram_ptr(ram_addr_t addr);
+void *qemu_get_ram_ptr(RAMBlock *ram_block, ram_addr_t addr);
 
 static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write)
 {
@@ -1431,7 +1431,7 @@ MemTxResult address_space_read(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
             mr = address_space_translate(as, addr, &addr1, &l, false);
             if (len == l && memory_access_is_direct(mr, false)) {
                 addr1 += memory_region_get_ram_addr(mr);
-                ptr = qemu_get_ram_ptr(addr1);
+                ptr = qemu_get_ram_ptr(mr->ram_block, addr1);
                 memcpy(buf, ptr, len);
             } else {
                 result = address_space_read_continue(as, addr, attrs, buf, len,
diff --git a/memory.c b/memory.c
index b4451dd..0dd9695 100644
--- a/memory.c
+++ b/memory.c
@@ -1570,7 +1570,7 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr)
         mr = mr->alias;
     }
     assert(mr->ram_addr != RAM_ADDR_INVALID);
-    ptr = qemu_get_ram_ptr(mr->ram_addr & TARGET_PAGE_MASK);
+    ptr = qemu_get_ram_ptr(mr->ram_block, mr->ram_addr & TARGET_PAGE_MASK);
     rcu_read_unlock();
 
     return ptr + offset;
-- 
1.8.5.2
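
The calling convention introduced here (an optional block pointer, with NULL meaning "look it up") can be sketched in isolation. All names below (DemoBlock, demo_ram_list, demo_get_ptr, demo_ptr_length) are hypothetical and only mirror the shape of the patched functions. Like the patch, the sketch trusts a non-NULL hint and does not check that the address lies inside it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a RAM block with host backing memory. */
typedef struct DemoBlock {
    uint64_t offset;        /* start of the block in the flat RAM space */
    uint64_t max_length;    /* size of the block in bytes */
    uint8_t *host;          /* host memory backing the block */
    struct DemoBlock *next; /* next block in the global list */
} DemoBlock;

/* Hypothetical global list head, playing the role of the RAM list. */
DemoBlock *demo_ram_list;

/* Slow path: linear search of the global list. */
static DemoBlock *demo_get_block(uint64_t addr)
{
    for (DemoBlock *b = demo_ram_list; b != NULL; b = b->next) {
        if (addr - b->offset < b->max_length) {
            return b;
        }
    }
    return NULL;
}

/* Use the caller's hint when given; fall back to the search on NULL. */
uint8_t *demo_get_ptr(DemoBlock *hint, uint64_t addr)
{
    DemoBlock *block = hint ? hint : demo_get_block(addr);
    return block ? block->host + (addr - block->offset) : NULL;
}

/* Same lookup, but also clamps *size to what is left in the block. */
uint8_t *demo_ptr_length(DemoBlock *hint, uint64_t addr, uint64_t *size)
{
    if (*size == 0) {
        return NULL;
    }
    DemoBlock *block = hint ? hint : demo_get_block(addr);
    if (block == NULL) {
        return NULL;
    }
    uint64_t offset_inside_block = addr - block->offset;
    uint64_t room = block->max_length - offset_inside_block;
    if (*size > room) {
        *size = room;
    }
    return block->host + offset_inside_block;
}
```

Callers that only have an address, like the notdirty_mem_write() hunk above, pass NULL and keep the old behaviour; callers that hold a region pass its cached block and skip the search.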


* [Qemu-devel] [PATCH 3/3] memory: Remove the superfluous code
  2016-02-20  2:35 [Qemu-devel] [PATCH 0/3] memory: an optimization Gonglei
  2016-02-20  2:35 ` [Qemu-devel] [PATCH 1/3] exec: store RAMBlock pointer into memory region Gonglei
  2016-02-20  2:35 ` [Qemu-devel] [PATCH 2/3] memory: optimize qemu_get_ram_ptr and qemu_ram_ptr_length Gonglei
@ 2016-02-20  2:35 ` Gonglei
  2016-02-20  9:47 ` [Qemu-devel] [PATCH 0/3] memory: an optimization Paolo Bonzini
  3 siblings, 0 replies; 8+ messages in thread
From: Gonglei @ 2016-02-20  2:35 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, Gonglei, peter.huangpeng

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 include/exec/memory.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 227fbf4..5f96e6b 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1399,8 +1399,6 @@ static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write)
     } else {
         return memory_region_is_ram(mr) || memory_region_is_romd(mr);
     }
-
-    return false;
 }
 
 /**
-- 
1.8.5.2
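
Why the removed line is superfluous can be seen from a cut-down predicate of the same shape. This is only a sketch: demo_access_is_direct and its boolean parameters are hypothetical, and the write branch is an assumption, since the hunk above shows only the else branch.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate with the same if/else shape as the patched
 * function. The write branch below is assumed for illustration.
 */
bool demo_access_is_direct(bool is_ram, bool is_romd, bool readonly,
                           bool is_write)
{
    if (is_write) {
        return is_ram && !readonly;
    } else {
        return is_ram || is_romd;
    }
    /* Both branches return, so a "return false;" here is unreachable. */
}
```

Since every path through the if/else already returns, dropping the trailing statement does not change behaviour.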


* Re: [Qemu-devel] [PATCH 0/3] memory: an optimization
  2016-02-20  2:35 [Qemu-devel] [PATCH 0/3] memory: an optimization Gonglei
                   ` (2 preceding siblings ...)
  2016-02-20  2:35 ` [Qemu-devel] [PATCH 3/3] memory: Remove the superfluous code Gonglei
@ 2016-02-20  9:47 ` Paolo Bonzini
  2016-02-20 10:34   ` Gonglei (Arei)
  3 siblings, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2016-02-20  9:47 UTC (permalink / raw)
  To: Gonglei, qemu-devel; +Cc: peter.huangpeng



On 20/02/2016 03:35, Gonglei wrote:
> perf top shows that qemu_get_ram_ptr consumes too many CPU cycles:
>> 22.56%  qemu-kvm                 [.] address_space_translate
>>  13.29%  qemu-kvm                 [.] qemu_get_ram_ptr
>>   4.71%  qemu-kvm                 [.] phys_page_find
>>   4.43%  qemu-kvm                 [.] address_space_translate_internal
>>   3.47%  libpthread-2.19.so       [.] __pthread_mutex_unlock_usercnt
>>   3.08%  qemu-kvm                 [.] qemu_ram_addr_from_host
>>   2.62%  qemu-kvm                 [.] address_space_map
>>   2.61%  libc-2.19.so             [.] _int_malloc
>>   2.58%  libc-2.19.so             [.] _int_free
>>   2.38%  libc-2.19.so             [.] malloc
>>   2.06%  libpthread-2.19.so       [.] pthread_mutex_lock
>>   1.68%  libc-2.19.so             [.] malloc_consolidate
>>   1.35%  libc-2.19.so             [.] __memcpy_sse2_unaligned
>>   1.23%  qemu-kvm                 [.] lduw_le_phys
>>   1.18%  qemu-kvm                 [.] find_next_zero_bit
>>   1.02%  qemu-kvm                 [.] object_unref
> 
> Paolo suggested that we can get rid of qemu_get_ram_ptr
> by storing the RAMBlock pointer in the memory region
> instead of the ram_addr_t value. After applying this change,
> I indeed got much better performance.

What's the gain like?

I've not reviewed the patch in depth, but what I can say is that I like
it a lot.  It only does the bare minimum needed to provide the
optimization, but this also makes it very simple to understand.  More
cleanups and further optimizations are possible (including removing
mr->ram_addr completely), but your patches really do one thing and
do it well.  Good job!

Paolo


* Re: [Qemu-devel] [PATCH 0/3] memory: an optimization
  2016-02-20  9:47 ` [Qemu-devel] [PATCH 0/3] memory: an optimization Paolo Bonzini
@ 2016-02-20 10:34   ` Gonglei (Arei)
  0 siblings, 0 replies; 8+ messages in thread
From: Gonglei (Arei) @ 2016-02-20 10:34 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel@nongnu.org; +Cc: Huangpeng (Peter)

Hi Paolo,


> -----Original Message-----
> From: Paolo Bonzini [mailto:paolo.bonzini@gmail.com] On Behalf Of Paolo
> Bonzini
> Sent: Saturday, February 20, 2016 5:48 PM
> To: Gonglei (Arei); qemu-devel@nongnu.org
> Cc: Huangpeng (Peter)
> Subject: Re: [PATCH 0/3] memory: an optimization
> 
> 
> 
> On 20/02/2016 03:35, Gonglei wrote:
> > perf top shows that qemu_get_ram_ptr consumes too many CPU cycles:
> >> 22.56%  qemu-kvm                 [.] address_space_translate
> >>  13.29%  qemu-kvm                 [.] qemu_get_ram_ptr
> >>   4.71%  qemu-kvm                 [.] phys_page_find
> >>   4.43%  qemu-kvm                 [.] address_space_translate_internal
> >>   3.47%  libpthread-2.19.so       [.] __pthread_mutex_unlock_usercnt
> >>   3.08%  qemu-kvm                 [.] qemu_ram_addr_from_host
> >>   2.62%  qemu-kvm                 [.] address_space_map
> >>   2.61%  libc-2.19.so             [.] _int_malloc
> >>   2.58%  libc-2.19.so             [.] _int_free
> >>   2.38%  libc-2.19.so             [.] malloc
> >>   2.06%  libpthread-2.19.so       [.] pthread_mutex_lock
> >>   1.68%  libc-2.19.so             [.] malloc_consolidate
> >>   1.35%  libc-2.19.so             [.] __memcpy_sse2_unaligned
> >>   1.23%  qemu-kvm                 [.] lduw_le_phys
> >>   1.18%  qemu-kvm                 [.] find_next_zero_bit
> >>   1.02%  qemu-kvm                 [.] object_unref
> >
> > Paolo suggested that we can get rid of qemu_get_ram_ptr
> > by storing the RAMBlock pointer in the memory region
> > instead of the ram_addr_t value. After applying this change,
> > I indeed got much better performance.
> 
> What's the gain like?
> 
After rebasing on the current master branch, I found that qemu_get_ram_ptr is
no longer one of the main consumers. But this patch set still gives some gain.

Before this optimization:
  1.26%  qemu-kvm                  [.] qemu_get_ram_ptr
  0.89%  qemu-kvm                  [.] qemu_get_ram_block

With the patch set applied:
 0.87%  qemu-kvm                 [.] qemu_get_ram_ptr

Now the main consumers are (very different from qemu-2.3):
 6.38%  libpthread-2.19.so       [.] __pthread_mutex_unlock_usercnt
  6.02%  qemu-kvm                 [.] vring_desc_read.isra.26
  5.27%  qemu-kvm                 [.] address_space_map
  4.45%  qemu-kvm                 [.] qemu_ram_block_from_host
  4.13%  libpthread-2.19.so       [.] pthread_mutex_lock
  3.95%  libc-2.19.so             [.] _int_free
  3.46%  qemu-kvm                 [.] address_space_translate_internal
  3.40%  qemu-kvm                 [.] address_space_translate
  3.39%  qemu-kvm                 [.] phys_page_find
  3.37%  libc-2.19.so             [.] _int_malloc
  3.21%  qemu-kvm                 [.] stw_le_phys
  2.70%  libc-2.19.so             [.] malloc
  2.18%  qemu-kvm                 [.] lduw_le_phys
  2.15%  libc-2.19.so             [.] __memcpy_sse2_unaligned
  1.58%  qemu-kvm                 [.] address_space_write
  1.48%  libc-2.19.so             [.] memset
  1.22%  qemu-kvm                 [.] virtqueue_map_desc
  1.22%  libc-2.19.so             [.] __libc_calloc
  1.21%  qemu-kvm                 [.] virtio_notify

And the speed with the master branch plus my patch series:
 Testing AES-128-CBC cipher: 
        Encrypting in chunks of 256 bytes: done. 506.27 MiB in 5.01 secs: 100.97 MiB/sec (2073684 packets)
        Encrypting in chunks of 256 bytes: done. 505.89 MiB in 5.02 secs: 100.85 MiB/sec (2072106 packets)
        Encrypting in chunks of 256 bytes: done. 505.94 MiB in 5.02 secs: 100.86 MiB/sec (2072343 packets)
        Encrypting in chunks of 256 bytes: done. 505.96 MiB in 5.02 secs: 100.87 MiB/sec (2072412 packets)
        Encrypting in chunks of 256 bytes: done. 505.92 MiB in 5.02 secs: 100.86 MiB/sec (2072241 packets)
        Encrypting in chunks of 256 bytes: done. 506.36 MiB in 5.02 secs: 100.95 MiB/sec (2074057 packets)
        Encrypting in chunks of 256 bytes: done. 506.35 MiB in 5.01 secs: 101.02 MiB/sec (2073998 packets)
        Encrypting in chunks of 256 bytes: done. 505.41 MiB in 5.01 secs: 100.92 MiB/sec (2070157 packets)

> I've not reviewed the patch in depth, but what I can say is that I like
> it a lot.  It only does the bare minimum needed to provide the
> optimization, but this also makes it very simple to understand.  More
> cleanups and further optimizations are possible (including removing
> mr->ram_addr completely), but your patches really do one thing and
> do it well.  Good job!
> 
Thanks!

Regards,
-Gonglei
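
The figures in this message can be summarised with a little arithmetic. The sketch below uses hypothetical helper names and assumes that the two "before" symbols (qemu_get_ram_ptr and qemu_get_ram_block) together represent the lookup cost. It averages the eight MiB/sec results and computes the drop in profile share. No baseline throughput is given in the message, so no speed-up ratio is computed.

```c
#include <stddef.h>

/* MiB/sec figures copied from the eight runs quoted above. */
static const double demo_runs_mib_per_sec[] = {
    100.97, 100.85, 100.86, 100.87, 100.86, 100.95, 101.02, 100.92
};

/* Arithmetic mean of n samples; returns 0.0 for an empty array. */
double demo_mean(const double *v, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += v[i];
    }
    return n ? sum / (double)n : 0.0;
}

/* Mean throughput of the quoted runs, in MiB/sec. */
double demo_mean_throughput(void)
{
    size_t n = sizeof(demo_runs_mib_per_sec) /
               sizeof(demo_runs_mib_per_sec[0]);
    return demo_mean(demo_runs_mib_per_sec, n);
}

/* Drop in profile share, in percentage points. */
double demo_profile_saving(void)
{
    const double before = 1.26 + 0.89; /* ram_ptr + ram_block entries */
    const double after = 0.87;         /* ram_ptr entry with patches */
    return before - after;
}
```

With these inputs the mean is about 100.91 MiB/sec and the profile share drops by about 1.28 percentage points (2.15% to 0.87%).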


* Re: [Qemu-devel] [PATCH 1/3] exec: store RAMBlock pointer into memory region
  2016-02-20  2:35 ` [Qemu-devel] [PATCH 1/3] exec: store RAMBlock pointer into memory region Gonglei
@ 2016-02-22  2:45   ` Fam Zheng
  2016-02-22  3:28     ` Gonglei (Arei)
  0 siblings, 1 reply; 8+ messages in thread
From: Fam Zheng @ 2016-02-22  2:45 UTC (permalink / raw)
  To: Gonglei; +Cc: pbonzini, qemu-devel, peter.huangpeng

On Sat, 02/20 10:35, Gonglei wrote:
> Each RAM memory region has a unique corresponding RAMBlock.
> In the current implementation, the memory region only stores
> the ram_addr, i.e. the offset into the RAM address space.
> To get the RAMBlock, we have to query the global ram_list by
> ram_addr, which is very expensive.
> 
> Now we store the RAMBlock pointer in the memory region
> structure, so the RAMBlock can be obtained directly from
> the mr.
> 
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>  exec.c                | 2 ++
>  include/exec/memory.h | 1 +
>  memory.c              | 1 +
>  3 files changed, 4 insertions(+)
> 
> diff --git a/exec.c b/exec.c
> index 1f24500..e29e369 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1717,6 +1717,8 @@ ram_addr_t qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
>          error_propagate(errp, local_err);
>          return -1;
>      }
> +    /* store the ram block pointer into memory region */

The comment is superfluous IMHO, the code is quite self-explanatory.

> +    mr->ram_block = new_block;
>      return addr;
>  }
>  
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index c92734a..23e2e3e 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -172,6 +172,7 @@ struct MemoryRegion {
>      bool global_locking;
>      uint8_t dirty_log_mask;
>      ram_addr_t ram_addr;
> +    void *ram_block;   /* RAMBlock pointer */

Why not add

    typedef struct RAMBlock RAMBlock;

then

    RAMBlock *ram_block;

?

>      Object *owner;
>      const MemoryRegionIOMMUOps *iommu_ops;
>  
> diff --git a/memory.c b/memory.c
> index 09041ed..b4451dd 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -912,6 +912,7 @@ void memory_region_init(MemoryRegion *mr,
>      }
>      mr->name = g_strdup(name);
>      mr->owner = owner;
> +    mr->ram_block = NULL;
>  
>      if (name) {
>          char *escaped_name = memory_region_escape_name(name);
> -- 
> 1.8.5.2
> 
> 
> 
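
The suggestion above relies on C allowing a pointer to an incomplete struct type. A minimal sketch, assuming a cut-down DemoRegion and a made-up used_length field (neither is QEMU's real layout), shows that a forward typedef is enough to give the member a real type instead of void *:

```c
#include <stddef.h>

/* Forward declaration: users of the header see only an opaque type. */
typedef struct RAMBlock RAMBlock;

/* Hypothetical, cut-down region holding a typed pointer. */
typedef struct DemoRegion {
    RAMBlock *ram_block;    /* no cast needed at the point of use */
} DemoRegion;

/*
 * Full definition, which would normally live in a single .c file.
 * The field is invented for this example.
 */
struct RAMBlock {
    size_t used_length;
};

/* The compiler checks the member access; void * would need a cast. */
size_t demo_region_length(const DemoRegion *r)
{
    return r->ram_block ? r->ram_block->used_length : 0;
}
```

With the typed member, passing the wrong kind of pointer is diagnosed at compile time, which a void * field would not catch.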


* Re: [Qemu-devel] [PATCH 1/3] exec: store RAMBlock pointer into memory region
  2016-02-22  2:45   ` Fam Zheng
@ 2016-02-22  3:28     ` Gonglei (Arei)
  0 siblings, 0 replies; 8+ messages in thread
From: Gonglei (Arei) @ 2016-02-22  3:28 UTC (permalink / raw)
  To: Fam Zheng; +Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, Huangpeng (Peter)

Hi Fam,

> From: Fam Zheng [mailto:famz@redhat.com]
> Sent: Monday, February 22, 2016 10:46 AM
> 
> On Sat, 02/20 10:35, Gonglei wrote:
> > Each RAM memory region has a unique corresponding RAMBlock.
> > In the current implementation, the memory region only stores
> > the ram_addr, i.e. the offset into the RAM address space.
> > To get the RAMBlock, we have to query the global ram_list by
> > ram_addr, which is very expensive.
> >
> > Now we store the RAMBlock pointer in the memory region
> > structure, so the RAMBlock can be obtained directly from
> > the mr.
> >
> > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > ---
> >  exec.c                | 2 ++
> >  include/exec/memory.h | 1 +
> >  memory.c              | 1 +
> >  3 files changed, 4 insertions(+)
> >
> > diff --git a/exec.c b/exec.c
> > index 1f24500..e29e369 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -1717,6 +1717,8 @@ ram_addr_t qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
> >          error_propagate(errp, local_err);
> >          return -1;
> >      }
> > +    /* store the ram block pointer into memory region */
> 
> The comment is superfluous IMHO, the code is quite self-explanatory.
> 
Yes, agree.

> > +    mr->ram_block = new_block;
> >      return addr;
> >  }
> >
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index c92734a..23e2e3e 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -172,6 +172,7 @@ struct MemoryRegion {
> >      bool global_locking;
> >      uint8_t dirty_log_mask;
> >      ram_addr_t ram_addr;
> > +    void *ram_block;   /* RAMBlock pointer */
> 
> Why not add
> 
>     typedef struct RAMBlock RAMBlock;
> 
> then
> 
>     RAMBlock *ram_block;
> 
> ?
> 
It's clearer. Will fix in v2, thanks :)

Regards,
-Gonglei

> >      Object *owner;
> >      const MemoryRegionIOMMUOps *iommu_ops;
> >
> > diff --git a/memory.c b/memory.c
> > index 09041ed..b4451dd 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -912,6 +912,7 @@ void memory_region_init(MemoryRegion *mr,
> >      }
> >      mr->name = g_strdup(name);
> >      mr->owner = owner;
> > +    mr->ram_block = NULL;
> >
> >      if (name) {
> >          char *escaped_name = memory_region_escape_name(name);
> > --
> > 1.8.5.2
> >
> >
> >

