Re: [Qemu-devel] [PATCH v2] vmdk: Optimize cluster allocation

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Kevin Wolf <kwolf@redhat.com>
To: Fam Zheng <famz@redhat.com>
Cc: qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2] vmdk: Optimize cluster allocation
Date: Wed, 7 May 2014 11:06:34 +0200	[thread overview]
Message-ID: <20140507090634.GE4045@noname.str.redhat.com> (raw)
In-Reply-To: <20140507085755.GA10558@T430.nay.redhat.com>

Am 07.05.2014 um 10:57 hat Fam Zheng geschrieben:
> On Wed, 05/07 10:20, Kevin Wolf wrote:
> > Am 07.05.2014 um 03:45 hat Fam Zheng geschrieben:
> > > On Tue, 05/06 10:32, Fam Zheng wrote:
> > > > On mounted NFS filesystem, ftruncate is much much slower than doing a
> > > > zero write. Changing this significantly speeds up cluster allocation.
> > > > 
> > > > Comparing by converting a cirros image (296M) to VMDK on an NFS mount
> > > > point, over 1Gbe LAN:
> > > > 
> > > >     $ time qemu-img convert cirros-0.3.1.img /mnt/a.raw -O vmdk
> > > > 
> > > > Before:
> > > >     real    0m26.464s
> > > >     user    0m0.133s
> > > >     sys     0m0.527s
> > > > 
> > > > After:
> > > >     real    0m2.120s
> > > >     user    0m0.080s
> > > >     sys     0m0.197s
> > > > 
> > > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > > > 
> > > > ---
> > > > V2: Fix cluster_offset check. (Kevin)
> > > > 
> > > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > > > ---
> > > >  block/vmdk.c | 19 ++++++++++++++-----
> > > >  1 file changed, 14 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/block/vmdk.c b/block/vmdk.c
> > > > index 06a1f9f..98d2d56 100644
> > > > --- a/block/vmdk.c
> > > > +++ b/block/vmdk.c
> > > > @@ -1037,6 +1037,7 @@ static int get_cluster_offset(BlockDriverState *bs,
> > > >      int min_index, i, j;
> > > >      uint32_t min_count, *l2_table;
> > > >      bool zeroed = false;
> > > > +    int64_t ret;
> > > >  
> > > >      if (m_data) {
> > > >          m_data->valid = 0;
> > > > @@ -1110,12 +1111,20 @@ static int get_cluster_offset(BlockDriverState *bs,
> > > >          }
> > > >  
> > > >          /* Avoid the L2 tables update for the images that have snapshots. */
> > > > -        *cluster_offset = bdrv_getlength(extent->file);
> > > > +        ret = bdrv_getlength(extent->file);
> > > > +        if (ret < 0 ||
> > > > +            ret & ((extent->cluster_sectors << BDRV_SECTOR_BITS) - 1)) {
> > > > +            return VMDK_ERROR;
> > > > +        }
> > > > +        *cluster_offset = ret;
> > > >          if (!extent->compressed) {
> > > > -            bdrv_truncate(
> > > > -                extent->file,
> > > > -                *cluster_offset + (extent->cluster_sectors << 9)
> > > > -            );
> > > > +            ret = bdrv_write_zeroes(extent->file,
> > > > +                                    *cluster_offset >> BDRV_SECTOR_BITS,
> > > > +                                    extent->cluster_sectors,
> > > > +                                    0);
> > > 
> > > Hi Stefan,
> > > 
> > > By considering a bdrv_write_zeroes as a pre-write, it in general doubles the
> > > write for the whole image, so it's not a good solution.
> > > 
> > > A better way would be removing the bdrv_truncate and require the caller to do
> > > full cluster write (with a bounce buffer if necessary).
> > 
> > Doesn't get_whole_cluster() already ensure that you already write a full
> > cluster to the image file?
> 
> That one is actually called get_backing_cluster(), if you look at the code it
> has. :)

Right, it doesn't do anything without a backing file. This is different
from qcow2, whose mechanism I assumed without reading the code in
detail. :-)

I think it would make sense to rewrite get_whole_cluster() to write the
cluster for both image with a backing file and standalone images; just
that without a backing file it would use memset() to fill the buffer
instead of bdrv_read().

Not sure how easy it would be, but it might be an opportunity to also
change it to write only those parts of the cluster that aren't written
to anyway by the cluster.

Kevin

next prev parent reply	other threads:[~2014-05-07  9:06 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-06  2:32 [Qemu-devel] [PATCH v2] vmdk: Optimize cluster allocation Fam Zheng
2014-05-07  1:45 ` Fam Zheng
2014-05-07  8:04   ` Stefan Hajnoczi
2014-05-07  8:20   ` Kevin Wolf
2014-05-07  8:57     ` Fam Zheng
2014-05-07  9:06       ` Kevin Wolf [this message]
2014-05-07  9:21         ` Fam Zheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140507090634.GE4045@noname.str.redhat.com \
    --to=kwolf@redhat.com \
    --cc=famz@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).