qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [RFC] optimization for qcow2 cache get/put
@ 2015-01-26 13:20 Zhang Haoyu
  2015-01-26 14:11 ` Max Reitz
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Zhang Haoyu @ 2015-01-26 13:20 UTC
  To: qemu-devel; +Cc: Paolo Bonzini, Fam Zheng, Stefan Hajnoczi

Hi, all

With a very large qcow2 image, e.g., 2TB,
taking a snapshot causes a long disruption,
which is caused by cache updates and IO wait.
The perf top data is shown below:
   PerfTop:    2554 irqs/sec  kernel: 0.4%  exact:  0.0% [4000Hz cycles],  (target_pid: 34294)
------------------------------------------------------------------------------------------------------------------------

    33.80%  qemu-system-x86_64  [.] qcow2_cache_do_get            
    27.59%  qemu-system-x86_64  [.] qcow2_cache_put               
    15.19%  qemu-system-x86_64  [.] qcow2_cache_entry_mark_dirty  
     5.49%  qemu-system-x86_64  [.] update_refcount               
     3.02%  libpthread-2.13.so  [.] pthread_getspecific           
     2.26%  qemu-system-x86_64  [.] get_refcount                  
     1.95%  qemu-system-x86_64  [.] coroutine_get_thread_state    
     1.32%  qemu-system-x86_64  [.] qcow2_update_snapshot_refcount
     1.20%  qemu-system-x86_64  [.] qemu_coroutine_self           
     1.16%  libz.so.1.2.7       [.] 0x0000000000003018            
     0.95%  qemu-system-x86_64  [.] qcow2_update_cluster_refcount 
     0.91%  qemu-system-x86_64  [.] qcow2_cache_get               
     0.76%  libc-2.13.so        [.] 0x0000000000134e49            
     0.73%  qemu-system-x86_64  [.] bdrv_debug_event              
     0.16%  qemu-system-x86_64  [.] pthread_getspecific@plt       
     0.12%  [kernel]            [k] _raw_spin_unlock_irqrestore   
     0.10%  qemu-system-x86_64  [.] vga_draw_line24_32            
     0.09%  [vdso]              [.] 0x000000000000060c            
     0.09%  qemu-system-x86_64  [.] qcow2_check_metadata_overlap  
     0.08%  [kernel]            [k] do_blockdev_direct_IO  

If the cache table size is expanded, the IO decreases,
but the computation time grows,
so it is worthwhile to optimize the qcow2 cache get and put algorithms.

My proposal:
get:
use ((offset >> cluster_bits) % c->size) to locate the cache entry directly.
Rough implementation:
index = (offset >> cluster_bits) % c->size;
if (c->entries[index].offset == offset) {
    goto found;
}

replace:
c->entries[(offset >> cluster_bits) % c->size].offset = offset;
...
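
A minimal sketch of the direct-mapped get/replace idea above, to make the
trade-off concrete. The types and helper names (CacheEntry, Cache,
cache_index, cache_lookup, cache_replace) are simplified, hypothetical
stand-ins and not QEMU's actual definitions; treating offset 0 as "unused"
and refusing to replace a referenced slot are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a qcow2 cache entry. */
typedef struct {
    uint64_t offset;  /* image offset of the cached table; 0 = unused (assumption) */
    int ref;          /* number of outstanding users of this entry */
    int dirty;        /* nonzero if the table must be written back */
} CacheEntry;

/* Hypothetical, simplified stand-in for the cache itself. */
typedef struct {
    CacheEntry *entries;
    size_t size;            /* number of slots in entries[] */
    unsigned cluster_bits;  /* log2 of the cluster size */
} Cache;

/* Map a table offset to its single candidate slot. */
static size_t cache_index(const Cache *c, uint64_t offset)
{
    assert(c->size > 0);
    return (size_t)((offset >> c->cluster_bits) % c->size);
}

/* get: one comparison instead of a scan over all entries.
 * Returns the entry on a hit, NULL on a miss. */
static CacheEntry *cache_lookup(Cache *c, uint64_t offset)
{
    CacheEntry *e = &c->entries[cache_index(c, offset)];
    return (e->offset == offset) ? e : NULL;
}

/* replace: claim the slot for a new offset.
 * Returns NULL if the slot is still referenced and cannot be evicted.
 * Real code would flush a dirty victim and read the new table here. */
static CacheEntry *cache_replace(Cache *c, uint64_t offset)
{
    CacheEntry *e = &c->entries[cache_index(c, offset)];
    if (e->ref > 0) {
        return NULL;
    }
    e->offset = offset;
    e->dirty = 0;
    return e;
}
```

Note that two offsets mapping to the same slot evict each other, so the
lookup becomes O(1) at the cost of possible conflict misses.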

put:
use a 64-entry table to cache the most recently
fetched c->entries, i.e., a cache for the cache.
During put, search the 64-entry table first;
if the entry is not found there, fall back to searching c->entries.
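
A minimal sketch of the put side, assuming put has to find the entry that
owns a given table pointer. All names here (RECENT_SLOTS, CacheEntry, Cache,
cache_note_get, cache_find_for_put, cache_put) are hypothetical and the
structures are simplified; the ring-buffer replacement policy for the small
table is also an assumption, since the proposal does not specify one.

```c
#include <assert.h>
#include <stddef.h>

#define RECENT_SLOTS 64  /* size of the "cache for cache" */

/* Hypothetical, simplified stand-in for a qcow2 cache entry. */
typedef struct {
    void *table;  /* buffer handed out to callers by get */
    int ref;      /* number of outstanding users */
} CacheEntry;

typedef struct {
    CacheEntry *entries;
    size_t size;                  /* number of slots in entries[] */
    size_t recent[RECENT_SLOTS];  /* indices of recently fetched entries */
    size_t recent_count;          /* valid slots in recent[] */
    size_t recent_next;           /* next slot to overwrite (ring buffer) */
} Cache;

/* Called from get: remember which entry was just handed out. */
static void cache_note_get(Cache *c, size_t index)
{
    c->recent[c->recent_next] = index;
    c->recent_next = (c->recent_next + 1) % RECENT_SLOTS;
    if (c->recent_count < RECENT_SLOTS) {
        c->recent_count++;
    }
}

/* Find the entry owning table: recent table first, then the full scan.
 * Returns c->size if no entry owns the table. */
static size_t cache_find_for_put(const Cache *c, const void *table)
{
    size_t i;

    for (i = 0; i < c->recent_count; i++) {
        size_t idx = c->recent[i];
        if (c->entries[idx].table == table) {
            return idx;
        }
    }
    for (i = 0; i < c->size; i++) {
        if (c->entries[i].table == table) {
            return i;
        }
    }
    return c->size;
}

/* put: drop one reference. Returns 0 on success, -1 if table is unknown. */
static int cache_put(Cache *c, const void *table)
{
    size_t idx = cache_find_for_put(c, table);

    if (idx == c->size) {
        return -1;
    }
    assert(c->entries[idx].ref > 0);
    c->entries[idx].ref--;
    return 0;
}
```

Because the small table stores indices and the table pointer is always
re-checked against c->entries, a stale slot only costs one failed comparison.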

Any ideas?

Thanks,
Zhang Haoyu


end of thread, other threads:[~2015-03-26 14:34 UTC | newest]

Thread overview: 5+ messages
2015-01-26 13:20 [Qemu-devel] [RFC] optimization for qcow2 cache get/put Zhang Haoyu
2015-01-26 14:11 ` Max Reitz
2015-01-27  1:23 ` Zhang Haoyu
2015-01-27  3:53 ` Zhang Haoyu
2015-03-26 14:33 ` Stefan Hajnoczi
