All of lore.kernel.org
 help / color / mirror / Atom feed
From: Juan Quintela <quintela@redhat.com>
To: Luiz Capitulino <lcapitulino@redhat.com>
Cc: kwolf@redhat.com, qemu-devel@nongnu.org, armbru@redhat.com
Subject: [Qemu-devel] Re: [PATCH 04/22] savevm: do_loadvm(): Always resume the VM
Date: Wed, 21 Apr 2010 17:39:00 +0200	[thread overview]
Message-ID: <m3mxwwolwr.fsf@trasno.mitica> (raw)
In-Reply-To: <20100421115410.5226f1dc@redhat.com> (Luiz Capitulino's message of "Wed, 21 Apr 2010 11:54:10 -0300")

Luiz Capitulino <lcapitulino@redhat.com> wrote:
> On Wed, 21 Apr 2010 10:36:29 +0200
> Juan Quintela <quintela@redhat.com> wrote:
>
>>     QTAILQ_FOREACH(dinfo, &drives, next) {
>>         bs1 = dinfo->bdrv;
>>         if (bdrv_has_snapshot(bs1)) {
>> 
>> /// We found a device that has snapshots
>>             ret = bdrv_snapshot_goto(bs1, name);
>>             if (ret < 0) {
>> /// And don't have a snapshot with the name that we wanted
>>                 switch(ret) {
>>                 case -ENOTSUP:
>>                     error_report("%sSnapshots not supported on device '%s'",
>>                                  bs != bs1 ? "Warning: " : "",
>>                                  bdrv_get_device_name(bs1));
>>                     break;
>>                 case -ENOENT:
>>                     error_report("%sCould not find snapshot '%s' on device '%s'",
>>                                  bs != bs1 ? "Warning: " : "",
>>                                  name, bdrv_get_device_name(bs1));
>>                     break;
>>                 default:
>>                     error_report("%sError %d while activating snapshot on '%s'",
>>                                  bs != bs1 ? "Warning: " : "",
>>                                  ret, bdrv_get_device_name(bs1));
>>                     break;
>>                 }
>>                 /* fatal on snapshot block device */
>> // I think that one inconditional exit with predjuice could be in order here
>> 
>> // Notice that bdrv_snapshot_goto() modifies the disk, name is as bad as
>> // you can get.  It just open the disk, opens the snapshot, increases
>> // its counter of users, and makes it available for use after here
>> // (i.e. loading state, posibly conflicting with previous running
>> // VM a.k.a. disk corruption.
>> 
>>                 if (bs == bs1)
>>                     return 0;
>> 
>> // This error is as bad as it can gets :(  We have to load a vmstate,
>> // and the disk that should have the memory image don't have it.
>> // This is an error, I just put the wrong nunmber the previous time.
>> // Notice that this error should be very rare.
>
>  So, the current code is buggy and if you fix it (by returning -1)
> you'll get another bug: loadvm will stop the VM for trivial errors
> like a not found image.

It is not a trivial error!!!!  And worse, it is not recoverable :(

>  How do you plan to fix this?

Returning error and stoping machine.

>> As stated, I don't think that trying to run the machine at any point
>> would make any sense.  Only case where it is safe to run it is if the
>> failure is at get_bs_snapshots(), but at that point running the machine
>> means:
>
>  Actually, it must not pause the VM when recovery is (clearly) possible,
> otherwise it's a usability bug for the user Monitor and a possibly serious
> bug when you don't have human intervention (eg. QMP).

It is not posible, we have change the device status from what was
before.  bets are off.  we don't have a way to go back to the "safe state".

>> <something happens>
>> $ loadvm other_image
>>   Error "other_image" snapshot don't exist.
>> $
>> 
>> running the previous VM looks like something that should be done
>> explicitely.  If the error happened after that get_bs_snapshots(),
>> We would need a new flag to just refuse to continue.  Only valid
>> operations at that point are other loadvm operations, i.e. our state is
>> wrong one way or another.
>
>  It's not clear to me how this flag can help, but anyway, what we need
> here is:
>
> 1. Fail when failure is reported (vs. report a failure and return OK)

This is a bug, plain an simple.

> 2. Don't keep the VM paused when recovery is possible
>
>  If you can fix that, it's ok to me: I'll drop this and the next patch.
>
>  Otherwise I'll have to insist on the split.

Re-read my email.   At this point, nothing is fixable :(  After doing
the 1st:

>>             ret = bdrv_snapshot_goto(bs1, name);

and not returning an error -> state has changed, period.  You can't
restart the machine.

If you prefer, you can chang loadvm in a way that after a failure -> you
can't "cont" it until you get a "working" loadvm.

Later, Juan.

  reply	other threads:[~2010-04-21 15:39 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-20 21:09 [Qemu-devel] [RFC 00/22]: QMP: Convert savevm/loadvm/delvm Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 01/22] QMP: Introduce RESUME event Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 02/22] savevm: Don't check the return of qemu_fopen_bdrv() Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 03/22] savevm: Introduce delete_snapshot() and use it Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 04/22] savevm: do_loadvm(): Always resume the VM Luiz Capitulino
2010-04-20 21:28   ` [Qemu-devel] " Juan Quintela
2010-04-20 21:59     ` Luiz Capitulino
2010-04-21  8:36       ` Juan Quintela
2010-04-21 14:54         ` Luiz Capitulino
2010-04-21 15:39           ` Juan Quintela [this message]
2010-04-21 15:42             ` Kevin Wolf
2010-04-22 13:33               ` Luiz Capitulino
2010-04-21 13:28   ` Kevin Wolf
2010-04-21 15:08     ` Luiz Capitulino
2010-04-21 15:27       ` Kevin Wolf
2010-04-21 15:47         ` Juan Quintela
2010-04-21 15:45       ` Juan Quintela
2010-04-21 17:50         ` Jamie Lokier
2010-04-20 21:09 ` [Qemu-devel] [PATCH 05/22] savevm: load_vmstate(): Return 'ret' on error Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 06/22] savevm: load_vmstate(): Improve error check Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 07/22] qemu-error: Introduce get_errno_string() Luiz Capitulino
2010-04-21  8:28   ` Daniel P. Berrange
2010-04-21 13:38     ` Kevin Wolf
2010-04-21 14:42       ` malc
2010-04-21 15:12         ` Luiz Capitulino
2010-04-21 15:15           ` Daniel P. Berrange
2010-04-21 15:29             ` Luiz Capitulino
2010-04-21 17:13             ` Markus Armbruster
2010-04-22 13:44               ` Luiz Capitulino
2010-05-03 18:00     ` Anthony Liguori
2010-05-10 17:50       ` Markus Armbruster
2010-05-11 22:36       ` Jamie Lokier
2010-04-20 21:09 ` [Qemu-devel] [PATCH 08/22] QError: New QERR_SNAPSHOT_NO_DEVICE Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 09/22] QError: New QERR_SNAPSHOT_DELETE_FAILED Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 10/22] QError: New QERR_SNAPSHOT_CREATE_FAILED Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 11/22] QError: New QERR_SNAPSHOT_ACTIVATE_FAILED Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 12/22] QError: New QERR_STATEVM_SAVE_FAILED Luiz Capitulino
2010-04-20 21:31   ` [Qemu-devel] " Juan Quintela
2010-04-20 22:02     ` Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 13/22] QError: New QERR_STATEVM_LOAD_FAILED Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 14/22] QError: New QERR_DEVICE_NO_SNAPSHOT Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 15/22] QError: New QERR_SNAPSHOT_NOT_FOUND Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 16/22] savevm: Convert delete_snapshot() to QError Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 17/22] savevm: delete_snapshot(): Remove unused parameter Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 18/22] savevm: Convert do_delvm() to QObject, QError Luiz Capitulino
2010-04-21 14:18   ` [Qemu-devel] " Kevin Wolf
2010-04-22 13:48     ` Luiz Capitulino
2010-04-22 14:31       ` Kevin Wolf
2010-04-20 21:09 ` [Qemu-devel] [PATCH 19/22] savevm: Convert do_savevm() to QError Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 20/22] savevm: Convert do_savevm() to QObject Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 21/22] savevm: Convert do_loadvm() to QError Luiz Capitulino
2010-04-20 21:09 ` [Qemu-devel] [PATCH 22/22] savevm: Convert do_loadvm() to QObject Luiz Capitulino
2010-04-20 21:41 ` [Qemu-devel] Re: [RFC 00/22]: QMP: Convert savevm/loadvm/delvm Juan Quintela

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m3mxwwolwr.fsf@trasno.mitica \
    --to=quintela@redhat.com \
    --cc=armbru@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=lcapitulino@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.