qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Liu Yuan <namei.unix@gmail.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
	sheepdog@lists.wpkg.org, qemu-devel@nongnu.org,
	MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Subject: Re: [Qemu-devel] [PATCH 2/2] sheepdog: support 'qemu-img snapshot -a'
Date: Fri, 7 Jun 2013 18:48:26 +0200	[thread overview]
Message-ID: <20130607164826.GI3658@dhcp-200-207.str.redhat.com> (raw)
In-Reply-To: <51B206E2.6020007@gmail.com>

Am 07.06.2013 um 18:14 hat Liu Yuan geschrieben:
> On 06/07/2013 11:22 PM, Kevin Wolf wrote:
> > Am 07.06.2013 um 15:48 hat Liu Yuan geschrieben:
> >> On 06/07/2013 03:31 PM, Kevin Wolf wrote:
> >>> Am 06.06.2013 um 15:09 hat Liu Yuan geschrieben:
> >>>> On 06/06/2013 08:46 PM, Kevin Wolf wrote:
> >>>>> Am 06.06.2013 um 13:57 hat Liu Yuan geschrieben:
> >>>> Only when the write comes from VM, we do the following stuff
> >>>>  - delete active vdi A
> >>>>  - created a new inode based on the previously reloaded As1's inode
> >>>
> >>> Thanks, this is the part that I missed.
> >>>
> >>> I'm not sure however why the actual switch is delayed until the first
> >>> write. This seems inconsistent with qcow2 snapshots.
> >>>
> >>> Do you know if there is a reason why we can't always do this already
> >>> during bdrv_snapshot_goto()?
> >>>
> >>
> >> I think the reason is sd_load_vmstate() need to load vm state objects
> >> with the correct inode object.
> >>
> >> I tried to remove
> >>
> >>   if (!s->inode.vm_state_size)
> >>
> >> and make sd_create_branch unconditional. This means 'loadvm' command
> >> will try to call sd_create_branch() inside sd_snapshot_goto(). But
> >> failed with reading the wrong snapshot because the vdi's inode object is
> >> changed by sd_create_branch().
> > 
> > Ok, I think I start to understand how these things work. Basically,
> > qemu's savevm functionality is designed to work with qcow2 like this:
> > 
> > 1. Save the VM state to the active layer
> > 2. create_snapshot saves the active layer including VM state
> > 3. [ Forget to remove the VM state from the active layer :-) ]
> > 4. loadvm loads the snapshot again, including VM state
> > 5. VM state is restored from the active layer
> > 
> > So for Sheepdog, the problem is that step 2 doesn't include the VM state,
> > right? So our options are either to change Sheepdog so that step 2 does
> > involve moving the VM state (might end up rather ugly) or you need to
> > swap steps 1 and 2, so that you first switch to the new snapshot and
> > then write the VM state.
> > 
> 
> Sorry, I didn't fully understand the above example. If sheepdog takes
> snapshots, vmstate will go with the exact current active disk into the
> cluster storage, for e.g
> 
> 1 we take two snapshots on the disk 'test' with tag A and B respectively
> 2 disk(A) + vmstate(A) will be stored as indexed by A's vdi
>   disk(B) + vmstate(B) will be stored as indexed by B's vdi
> 
> The chain is A --> B --> C, C is the current active vdi. Note, both A, B
> and C has different vdi_id.
> 
> The problem show up when we do loadvm A, the process is:
> 
> 1 reload inode of A (in snapshot_goto)
> 2 read vmstate via A's vdi_id (loadvm_state)
> 3 delete C and create C1, reload inode of C1 (sd_create_branch on write)
> 
> So if at stage 1, we call sd_create_branch(), the inode will point to C1
> and stage 2 will fail to read vmstate because vdi_id is C1's.
> 
> This is why we rely on the write to call sd_create_branch(). I am not
> sure if we can fix sheepdog's loadvm without touching the core code of QEMU.

Hm, yes, I was confused. :-)

Anyway, the options stay the same: Either C1 must somehow inherit the VM
state from A on the Sheepdog level, or we must make sure to get this
order:

1. Switch to (read-only) snapshot A
2. Load the VM state
3. Create (writable) snapshot C1 and switch to it

This is a different order as required for qcow2 (where C1 would inherit
the VM state from A, so our current first 1, then 3, then 2 works), but
if we can't fix it on the Sheepdog side, we'll have to touch the core
code of qemu. I don't like much to touch the block.c code for it, but if
it's required to fix a bug, let's just do it.

> > Because I think what we're doing currently is not only hard to
> > understand, but actually wrong. Consider this:
> > 
> > 1. savevm with a Sheepdog image
> > 2. Exit qemu before any write request is sent
> > 3. Restart qemu with the Sheepdog image and notice how the wrong
> >    snapshot is active. Oops.
> > 
> 
> Sheepdog indeed has this problem.

Thanks for confirming.

Kevin

  reply	other threads:[~2013-06-07 16:48 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-06 11:57 [Qemu-devel] [PATCH 0/2] fix 'qemu-img snapshot -a' operation for sheepdog Liu Yuan
2013-06-06 11:57 ` [Qemu-devel] [PATCH 1/2] sheepdog: fix snapshot tag initialization Liu Yuan
2013-06-06 12:46   ` Kevin Wolf
2013-06-06 11:57 ` [Qemu-devel] [PATCH 2/2] sheepdog: support 'qemu-img snapshot -a' Liu Yuan
2013-06-06 12:46   ` Kevin Wolf
2013-06-06 13:09     ` Liu Yuan
2013-06-07  7:31       ` Kevin Wolf
2013-06-07 13:48         ` Liu Yuan
2013-06-07 15:22           ` Kevin Wolf
2013-06-07 16:14             ` Liu Yuan
2013-06-07 16:48               ` Kevin Wolf [this message]
2013-06-07 17:23                 ` Liu Yuan
2013-06-06 13:41     ` Liu Yuan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130607164826.GI3658@dhcp-200-207.str.redhat.com \
    --to=kwolf@redhat.com \
    --cc=morita.kazutaka@lab.ntt.co.jp \
    --cc=namei.unix@gmail.com \
    --cc=qemu-devel@nongnu.org \
    --cc=sheepdog@lists.wpkg.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).