qemu-devel.nongnu.org archive mirror
From: "Zhang Haoyu" <zhanghy@sangfor.com.cn>
To: Max Reitz <mreitz@redhat.com>, qemu-devel <qemu-devel@nongnu.org>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Fam Zheng <famz@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [RFC] optimization for qcow2 cache get/put
Date: Tue, 27 Jan 2015 09:23:08 +0800	[thread overview]
Message-ID: <201501270923069536238@sangfor.com.cn> (raw)
In-Reply-To: <201501262119592629551@sangfor.com.cn>


On 2015-01-26 22:11:59, Max Reitz wrote:
>On 2015-01-26 at 08:20, Zhang Haoyu wrote:
> > Hi, all
> >
> > Regarding very large qcow2 images, e.g. a 2 TB image,
> > a long disruption occurred when taking a snapshot,
> > which was caused by cache updates and I/O waits.
> > perf top data is shown below:
> >     PerfTop:    2554 irqs/sec  kernel: 0.4%  exact:  0.0% [4000Hz cycles],  (target_pid: 34294)
> > ------------------------------------------------------------------------------------------------------------------------
> >
> >      33.80%  qemu-system-x86_64  [.] qcow2_cache_do_get
> >      27.59%  qemu-system-x86_64  [.] qcow2_cache_put
> >      15.19%  qemu-system-x86_64  [.] qcow2_cache_entry_mark_dirty
> >       5.49%  qemu-system-x86_64  [.] update_refcount
> >       3.02%  libpthread-2.13.so  [.] pthread_getspecific
> >       2.26%  qemu-system-x86_64  [.] get_refcount
> >       1.95%  qemu-system-x86_64  [.] coroutine_get_thread_state
> >       1.32%  qemu-system-x86_64  [.] qcow2_update_snapshot_refcount
> >       1.20%  qemu-system-x86_64  [.] qemu_coroutine_self
> >       1.16%  libz.so.1.2.7       [.] 0x0000000000003018
> >       0.95%  qemu-system-x86_64  [.] qcow2_update_cluster_refcount
> >       0.91%  qemu-system-x86_64  [.] qcow2_cache_get
> >       0.76%  libc-2.13.so        [.] 0x0000000000134e49
> >       0.73%  qemu-system-x86_64  [.] bdrv_debug_event
> >       0.16%  qemu-system-x86_64  [.] pthread_getspecific@plt
> >       0.12%  [kernel]            [k] _raw_spin_unlock_irqrestore
> >       0.10%  qemu-system-x86_64  [.] vga_draw_line24_32
> >       0.09%  [vdso]              [.] 0x000000000000060c
> >       0.09%  qemu-system-x86_64  [.] qcow2_check_metadata_overlap
> >       0.08%  [kernel]            [k] do_blockdev_direct_IO
> >
> > If the cache table size is expanded, I/O will decrease,
> > but the lookup time will grow,
> > so it is worthwhile to optimize the qcow2 cache get/put algorithms.
> >
> > My proposal:
> > get:
> > use ((offset >> cluster_bits) % c->size) to locate the cache entry;
> > raw implementation:
> > index = (offset >> cluster_bits) % c->size;
> > if (c->entries[index].offset == offset) {
> >     goto found;
> > }
> >
> > replace:
> > c->entries[(offset >> cluster_bits) % c->size].offset = offset;
> 
> Well, direct-mapped caches do have their benefits, but remember that 
> they do have disadvantages, too. Regarding CPU caches, set associative 
> caches seem to be largely favored, so that may be a better idea.
>
Thanks, Max.
I think that with a direct-mapped cache we could expand the cache table
size to reduce I/O, and locating a cache entry would remain cheap even
when a CPU cache miss occurs.
Of course, set-associative caches are preferred for CPU caches, but I
suspect the current sequential-traversal algorithm only gives a higher
probability of association at first; after running for some time, that
probability may drop.
I will test the direct-mapped cache and post the results soon.
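For reference, the proposed direct-mapped get could be sketched roughly
as below. Names such as Cache, CacheEntry, cache_get() and cache_put()
are illustrative only, not QEMU's actual Qcow2Cache API, and eviction,
write-back of dirty entries and the actual cluster I/O are omitted:

```c
#include <stdint.h>

/* Hypothetical, simplified model of a qcow2 metadata cache. */
typedef struct CacheEntry {
    uint64_t offset;    /* host offset of the cached cluster, 0 = empty */
    int ref;            /* reference count held by current users */
} CacheEntry;

typedef struct Cache {
    CacheEntry *entries;
    int size;           /* number of table entries */
    int cluster_bits;   /* log2 of the cluster size */
} Cache;

/* Direct-mapped lookup: the cluster offset alone determines the slot,
 * so get and put are O(1) instead of a linear scan over all entries. */
static int cache_index(Cache *c, uint64_t offset)
{
    return (int)((offset >> c->cluster_bits) % c->size);
}

static CacheEntry *cache_get(Cache *c, uint64_t offset)
{
    CacheEntry *e = &c->entries[cache_index(c, offset)];

    if (e->offset != offset) {
        /* Miss: a real implementation would write back the old entry
         * if dirty, then read the new cluster in before reusing it. */
        e->offset = offset;
    }
    e->ref++;
    return e;
}

static void cache_put(Cache *c, CacheEntry *e)
{
    (void)c;
    e->ref--;   /* O(1): the caller already holds the entry pointer */
}
```

The trade-off Max points out applies directly here: two hot clusters
whose offsets map to the same slot will evict each other on every
access, which a set-associative variant would avoid.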

> CC'ing Kevin, because it's his code.
> 
> Max
>
> > ...
> >
> > put:
> > use a 64-entry table to cache the recently returned c->entries,
> > i.e., a cache for the cache; during put, first search the
> > 64-entry table, and fall back to scanning c->entries on a miss.
> >
> > Any idea?
> >
> > Thanks,
> > Zhang Haoyu
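The "cache for the cache" put path from the quoted proposal could be
sketched as follows. PutCache, put_cache_remember() and
put_cache_lookup() are hypothetical names invented for illustration;
they do not correspond to functions in QEMU's qcow2 code:

```c
/* Hypothetical lookaside table: remembers the main-table indices of
 * recently handed-out entries, so a put can usually recover the index
 * by scanning 64 slots instead of the whole (possibly large) table. */
#define RECENT_SLOTS 64

typedef struct PutCache {
    void *entry[RECENT_SLOTS];   /* recently returned entry pointers */
    int   index[RECENT_SLOTS];   /* their index in the main table */
    int   next;                  /* round-robin replacement cursor */
} PutCache;

/* Record an entry handed out by get, overwriting the oldest slot. */
static void put_cache_remember(PutCache *pc, void *entry, int index)
{
    pc->entry[pc->next] = entry;
    pc->index[pc->next] = index;
    pc->next = (pc->next + 1) % RECENT_SLOTS;
}

/* Return the main-table index, or -1 to fall back to a full scan. */
static int put_cache_lookup(PutCache *pc, void *entry)
{
    for (int i = 0; i < RECENT_SLOTS; i++) {
        if (pc->entry[i] == entry) {
            return pc->index[i];
        }
    }
    return -1;
}
```

With a direct-mapped table this helper becomes unnecessary, since the
index is computable from the offset alone; the lookaside only pays off
when the main table must otherwise be searched linearly.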


Thread overview: 5+ messages
2015-01-26 13:20 [Qemu-devel] [RFC] optimization for qcow2 cache get/put Zhang Haoyu
2015-01-26 14:11 ` Max Reitz
2015-01-27  1:23 ` Zhang Haoyu [this message]
2015-01-27  3:53 ` Zhang Haoyu
2015-03-26 14:33 ` Stefan Hajnoczi
