All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Cody <jcody@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: kwolf@redhat.com, sw@weilnetz.de, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 7/7] block: add write support for VHDX images
Date: Thu, 7 Mar 2013 11:05:13 -0500	[thread overview]
Message-ID: <20130307160513.GG22782@localhost.localdomain> (raw)
In-Reply-To: <20130307155905.GE27175@stefanha-thinkpad.redhat.com>

On Thu, Mar 07, 2013 at 04:59:05PM +0100, Stefan Hajnoczi wrote:
> On Wed, Mar 06, 2013 at 09:48:43AM -0500, Jeff Cody wrote:
> > @@ -958,12 +960,150 @@ exit:
> >      return ret;
> >  }
> >  
> > +/*
> > + * Allocate a new payload block at the end of the file.
> > + *
> > + * Allocation will happen at 1MB alignment inside the file
> > + *
> > + * Returns the file offset start of the new payload block
> > + */
> > +static int vhdx_allocate_block(BlockDriverState *bs, BDRVVHDXState *s,
> > +                                    uint64_t *new_offset)
> > +{
> > +    *new_offset = bdrv_getlength(bs->file);
> > +
> > +    /* per the spec, the address for a block is in units of 1MB */
> > +    if (*new_offset % (1024*1024)) {
> > +        *new_offset = ((*new_offset >> 20) + 1) << 20;  /* round up to 1MB */
> > +    }
> > +    return bdrv_truncate(bs->file, *new_offset + s->block_size);
> > +}
> 
> Is it necessary to resize the file?  Why not just write at EOF to grow
> it?
> 

After the recent bdrv_truncate() discussion, I think that may be best.

> > +/*
> > + * Update the BAT tablet entry with the new file offset, and the new entry
> > + * state */
> > +static int vhdx_update_bat_table_entry(BlockDriverState *bs, BDRVVHDXState *s,
> > +                                       vhdx_sector_info *sinfo, int state)
> > +{
> > +    uint64_t bat_tmp;
> > +    uint64_t bat_entry_offset;
> > +
> > +    /* The BAT entry is a uint64, with 44 bits for the file offset in units of
> > +     * 1MB, and 3 bits for the block state. */
> > +    s->bat[sinfo->bat_idx]  = ((sinfo->file_offset>>20) <<
> > +                               VHDX_BAT_FILE_OFF_BITS);
> > +
> > +    s->bat[sinfo->bat_idx] |= state & VHDX_BAT_STATE_BIT_MASK;
> >  
> > +    bat_tmp = cpu_to_le64(s->bat[sinfo->bat_idx]);
> > +    bat_entry_offset = s->bat_offset + sinfo->bat_idx * sizeof(vhdx_bat_entry);
> > +
> > +    return bdrv_pwrite_sync(bs->file, bat_entry_offset, &bat_tmp,
> > +                            sizeof(vhdx_bat_entry));
> > +}
> >  
> >  static coroutine_fn int vhdx_co_writev(BlockDriverState *bs, int64_t sector_num,
> >                                        int nb_sectors, QEMUIOVector *qiov)
> >  {
> > -    return -ENOTSUP;
> > +    int ret = -ENOTSUP;
> > +    BDRVVHDXState *s = bs->opaque;
> > +    vhdx_sector_info sinfo;
> > +    uint64_t bytes_done = 0;
> > +    QEMUIOVector hd_qiov;
> > +
> > +    qemu_iovec_init(&hd_qiov, qiov->niov);
> > +
> > +    qemu_co_mutex_lock(&s->lock);
> > +
> > +    /* Per the spec, on the first write of guest-visible data to the file the
> > +     * data write guid must be updated in the header */
> > +    if (s->first_visible_write) {
> > +        s->first_visible_write = false;
> > +        vhdx_update_headers(bs, s, true);
> > +    }
> > +
> > +    while (nb_sectors > 0) {
> > +        if (s->params.data_bits & VHDX_PARAMS_HAS_PARENT) {
> > +            /* not supported yet */
> > +            ret = -ENOTSUP;
> > +            goto exit;
> > +        } else {
> > +            vhdx_block_translate(s, sector_num, nb_sectors, &sinfo);
> > +
> > +            qemu_iovec_reset(&hd_qiov);
> > +            qemu_iovec_concat(&hd_qiov, qiov,  bytes_done, sinfo.bytes_avail);
> > +            /* check the payload block state */
> > +            switch (s->bat[sinfo.bat_idx] & VHDX_BAT_STATE_BIT_MASK) {
> > +            case PAYLOAD_BLOCK_ZERO:
> > +                /* in this case, we need to preserve zero writes for
> > +                 * data that is not part of this write, so we must pad
> > +                 * the rest of the buffer to zeroes */
> > +
> > +                /* if we are on a posix system with ftruncate() that extends
> > +                 * a file, then it is zero-filled for us.  On Win32, the raw
> > +                 * layer uses SetFilePointer and SetFileEnd, which does not
> > +                 * zero fill AFAIK */
> > +
> > +                /* TODO: queue another write of zero buffers if the host OS does
> > +                 * not zero-fill on file extension */
> > +
> > +                /* fall through */
> > +            case PAYLOAD_BLOCK_NOT_PRESENT: /* fall through */
> > +            case PAYLOAD_BLOCK_UNMAPPED:    /* fall through */
> > +            case PAYLOAD_BLOCK_UNDEFINED:   /* fall through */
> > +                ret = vhdx_allocate_block(bs, s, &sinfo.file_offset);
> > +                if (ret < 0) {
> > +                    goto exit;
> > +                }
> > +                /* once we support differencing files, this may also be
> > +                 * partially present */
> > +                /* update block state to the newly specified state */
> > +                ret = vhdx_update_bat_table_entry(bs, s, &sinfo,
> > +                                                  PAYLOAD_BLOCK_FULL_PRESENT);
> > +                if (ret < 0) {
> > +                    goto exit;
> > +                }
> 
> The BAT table entry must be written after data is already on disk.
> Otherwise a crash partway through would leave the BAT table pointing to
> undefined data.
> 
> Do you need to use the log to ensure this?

Yes, indeed.  The log support is what I am working on now - all
metadata and BAT writes are supposed to go through the log, for this
reason.  The next version will have log support, and so I will use
that for the BAT writes (and any other metadata writes, except the
header).

      reply	other threads:[~2013-03-07 16:05 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-06 14:46 [Qemu-devel] [PATCH 0/7] Initial VHDX support (and a bug fix for QCOW) Jeff Cody
2013-03-06 14:47 ` [Qemu-devel] [PATCH 1/7] block: only force IO completion in .bdrv_truncate if we are shrinking Jeff Cody
2013-03-06 17:50   ` Peter Lieven
2013-03-06 18:06     ` Paolo Bonzini
2013-03-06 18:14       ` Jeff Cody
2013-03-06 18:31         ` Paolo Bonzini
2013-03-06 18:48           ` Jeff Cody
2013-03-06 19:03             ` Peter Lieven
2013-03-06 20:39               ` Paolo Bonzini
2013-03-07  8:50                 ` Kevin Wolf
2013-03-07  8:56                   ` Peter Lieven
2013-03-07  9:03                     ` Kevin Wolf
2013-03-07  9:16                       ` Peter Lieven
2013-03-07  9:22                         ` Kevin Wolf
2013-03-07  9:25                           ` Peter Lieven
2013-03-07 10:00                             ` Kevin Wolf
2013-03-07 10:22                               ` Peter Lieven
2013-03-07 16:45                   ` Paolo Bonzini
2013-03-07  8:53                 ` Peter Lieven
2013-03-07  8:59                   ` Kevin Wolf
2013-03-07 16:09                   ` Paolo Bonzini
2013-03-08  7:53                     ` Peter Lieven
2013-03-08  9:23                       ` Paolo Bonzini
2013-03-08  9:35                         ` Kevin Wolf
2013-03-08 11:46                           ` Peter Lieven
2013-03-08 11:56                             ` Kevin Wolf
2013-03-09  9:36                               ` Peter Lieven
     [not found]         ` <51378A23.5090301@dlhnet.de>
     [not found]           ` <20130306184217.GC3743@localhost.localdomain>
2013-03-06 18:46             ` Peter Lieven
2013-03-06 18:27       ` Peter Lieven
2013-03-07  8:57         ` Kevin Wolf
2013-03-06 18:32     ` Jeff Cody
2013-03-06 20:22       ` Paolo Bonzini
2013-03-06 14:47 ` [Qemu-devel] [PATCH 2/7] qemu: add castagnoli crc32c checksum algorithm Jeff Cody
2013-03-06 14:47 ` [Qemu-devel] [PATCH 3/7] block: vhdx header for the QEMU support of VHDX images Jeff Cody
2013-03-07 13:15   ` Stefan Hajnoczi
2013-03-07 13:43     ` Jeff Cody
2013-03-06 14:48 ` [Qemu-devel] [PATCH 4/7] block: initial VHDX driver support framework - supports open and probe Jeff Cody
2013-03-07 14:30   ` Stefan Hajnoczi
2013-03-07 15:23     ` Jeff Cody
2013-03-07 16:12       ` Stefan Hajnoczi
2013-03-08  8:35         ` Kevin Wolf
2013-03-06 14:48 ` [Qemu-devel] [PATCH 5/7] block: add read-only support to VHDX image format Jeff Cody
2013-03-06 14:48 ` [Qemu-devel] [PATCH 6/7] block: add header update capability for VHDX images Jeff Cody
2013-03-07 14:50   ` Stefan Hajnoczi
2013-03-07 15:33     ` Jeff Cody
2013-03-07 16:15       ` Stefan Hajnoczi
2013-03-06 14:48 ` [Qemu-devel] [PATCH 7/7] block: add write support " Jeff Cody
2013-03-07 15:59   ` Stefan Hajnoczi
2013-03-07 16:05     ` Jeff Cody [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130307160513.GG22782@localhost.localdomain \
    --to=jcody@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=sw@weilnetz.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.