All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Krempa <pkrempa@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>,
	Daniel Henrique Barboza <danielhb413@gmail.com>,
	kwolf@redhat.com, libvir-list@redhat.com, qemu-devel@nongnu.org,
	mreitz@redhat.com, muriloo@linux.ibm.com
Subject: Re: [Qemu-devel] [libvirt] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore
Date: Thu, 11 Oct 2018 13:06:14 +0100	[thread overview]
Message-ID: <20181011120613.GI2483@work-vm> (raw)
In-Reply-To: <20181010075612.GG2052@andariel.pipo.sk>

* Peter Krempa (pkrempa@redhat.com) wrote:
> On Tue, Oct 09, 2018 at 19:34:43 +0200, Markus Armbruster wrote:
> > Cc: libvir-list for review of the compatibility argument below.
> > 
> > Daniel Henrique Barboza <danielhb413@gmail.com> writes:
> > 
> > > Hey David,
> > >
> > > On 9/21/18 9:29 AM, Dr. David Alan Gilbert wrote:
> > >> * Daniel Henrique Barboza (danielhb413@gmail.com) wrote:
> > >>> changes in v2:
> > >>> - removed the "RFC" marker;
> > >>> - added a new patch (patch 2) that removes
> > >>> bdrv_snapshot_delete_by_id_or_name from the code;
> > >>> - made changes in patch 1 as suggested by Murilo;
> > >>> - previous patch set link:
> > >>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html
> > >>>
> > >>>
> > >>> It is not uncommon to see bugs being opened by testers that attempt to
> > >>> create VM snapshots using HMP. It turns out that "0" and "1" are quite
> > >>> common snapshot names and they trigger a lot of bugs. I gave an example
> > >>> in the commit message of patch 1, but to sum up here: QEMU treats the
> > >>> input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. It
> > >>> is documented as such, but this can lead to strange situations.
> > >>>
> > >>> Given that it is strange for an API to consider a parameter to be 2 fields
> > >>> at the same time, and inadvently treating them as one or the other, and
> > >>> that removing the ID field is too drastic, my idea here is to keep the
> > >>> ID field for internal control, but do not let the user set it.
> > >>>
> > >>> I guess there's room for discussion about considering this change an API
> > >>> change or not. It doesn't affect users of HMP and it doesn't affect Libvirt,
> > >>> but simplifying the meaning of the parameters of savevm/loadvm/delvm.
> > >> Can you clarify a couple of things:
> > >>    a) What is it about libvirt's use that means it's OK ? Does it never
> > >> use numeric tags?
> > >
> > > I am glad you asked because my understanding in how Libvirt was dealing
> > > with snapshots was wrong, and I just looked into it further to answer your
> > > question. Luckily, this series fixes the situation for Libvirt as well.
> > >
> > > I was misled by the fact that Libvirt does not show the same symptoms
> > > we see in
> > > QEMU of this problem, but the bug is there. Here's a quick test with
> > > Libvirt with
> > > "0" and "1" as snapshot names, considering a VM named with no snapshots,
> > > using QEMU 2.12 and Libvirt 4.0.0:
> > >
> > > - create the "0" snapshot:
> > >
> > > $ sudo virsh snapshot-create-as --name 0 dhb
> > > Domain snapshot 0 created
> > >
> > > $ sudo virsh snapshot-list dhb
> > > Name Creation Time State
> > > ------------------------------------------------------------
> > > 0 2018-09-24 15:47:56 -0400 running
> > >
> > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> > > List of snapshots present on all disks:
> > > ID TAG VM SIZE DATE VM CLOCK
> > > --    0  405M 2018-09-24 15:47:56 00:04:20.867
> > >
> > >
> > > - created the "1" snapshot. Here, Libvirt shows both snapshots with
> > > snapshot-list,
> > > but the snapshot was erased inside QEMU:
> > >
> > > $ sudo virsh snapshot-create-as --name 1 dhb
> > > Domain snapshot 1 created
> > > $
> > > $ sudo virsh snapshot-list dhb
> > > Name Creation Time State
> > > ------------------------------------------------------------
> > > 0 2018-09-24 15:47:56 -0400 running
> > > 1 2018-09-24 15:50:09 -0400 running
> > >
> > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> > > List of snapshots present on all disks:
> > > ID TAG VM SIZE DATE VM CLOCK
> > > --    1  404M 2018-09-24 15:50:10 00:05:36.226
> > >
> > >
> > > This is where I stopped checking out Libvirt at first, concluding
> > > wrongly that it
> > > was immune to the bug.
> > >
> > > Libvirt does not throw an error when trying to apply snapshot 0. In
> > > fact, it acts
> > > like everything went fine:
> > >
> > > $ sudo virsh snapshot-revert dhb 0
> > >
> > > $ echo $?
> > > 0
> > 
> > Is that a libvirt bug?
> 
> Probably yes. The error handling from HMP sucks and can't really be
> fixed in all cases. This is for the handler which calls "loadvm":
> 
>     if (strstr(reply, "No block device supports snapshots") != NULL) {
>         virReportError(VIR_ERR_OPERATION_INVALID, "%s",
>                        _("this domain does not have a device to load snapshots"));
>         goto cleanup;
>     } else if (strstr(reply, "Could not find snapshot") != NULL) {
>         virReportError(VIR_ERR_OPERATION_INVALID,
>                        _("the snapshot '%s' does not exist, and was not loaded"),
>                        name);
>         goto cleanup;
>     } else if (strstr(reply, "Snapshots not supported on device") != NULL) {
>         virReportError(VIR_ERR_OPERATION_INVALID, "%s", reply);
>         goto cleanup;
>     } else if (strstr(reply, "Could not open VM state file") != NULL) {
>         virReportError(VIR_ERR_OPERATION_FAILED, "%s", reply);
>         goto cleanup;
>     } else if (strstr(reply, "Error") != NULL
>              && strstr(reply, "while loading VM state") != NULL) {
>         virReportError(VIR_ERR_OPERATION_FAILED, "%s", reply);
>         goto cleanup;
>     } else if (strstr(reply, "Error") != NULL
>              && strstr(reply, "while activating snapshot on") != NULL) {
>         virReportError(VIR_ERR_OPERATION_FAILED, "%s", reply);
>         goto cleanup;
>     }
> 
> If the above does not match the reported error, we report success. The
> same problem can happen with any of the 5 [1] HMP command handlers we
> still have.
> 
> Note that a similar abomination is also for the code which calls
> "savevm".

Would the following small qemu change make life a little safer:

diff --git a/hmp.c b/hmp.c
index 576253a01f..0729a8c7ed 100644
--- a/hmp.c
+++ b/hmp.c
@@ -62,7 +62,7 @@ static void hmp_handle_error(Monitor *mon, Error **errp)
 {
     assert(errp);
     if (*errp) {
-        error_report_err(*errp);
+        error_reportf_err(*errp, "Error: ");
     }
 }

changing:

No block device supports snapshots


to:

Error: No block device supports snapshots

so you could check for Error: at the start and know that you've hit some
unknown error?

Dave


> [1] We technically have 6 HMP handlers, but "cpu_set" is not used if yoy
> have a QEMU younger than 3 years. Soon also "drive_add" and "drive_del"
> will be replaced, so we'll be stuck with the 3 internal snapshot ones.
> > 
> > > Reverting back to snapshot "1" works as intended, restoring the VM
> > > state when it
> > > was created.
> > >
> > >
> > > (perhaps this is something we should let Libvirt be aware of ...)
> 
> The error message from qemu which was wrongly ignored by qemu can be
> found in the libvirtd debug log. It would be helpful if you could
> provide it.
> 
> > >
> > >
> > >
> > > This series fixes this behavior because the snapshot 0 isn't erased
> > > from QEMU, allowing
> > > Libvirt to successfully restore it.
> > >
> > >
> > >>    b) After this series are you always guaranteed to be able to fix
> > >> any existing oddly named snapshots?
> > >
> > > The oddly named snapshots that already exists can be affected by the
> > > semantic
> > > change in loadvm and delvm, in a way that the user can't load/del
> > > using the snap
> > > ID, just the tag. Aside from that, I don't see any side effects with
> > > existing
> > > snapshots and this patch series.
> > 
> > Do all snapshots have a tag that is unique within their image?  Even
> > snapshots created by old versions of QEMU?
> 
> I remember there was a discussion which regarded problems with
> collisions between the name and the ID of the snapshot when dealing with
> loadvm/delvm commands but I can't seem to find it currently. Note that
> libvirt plainly issues loadvm/delvm $SNAPSHOTNAME. If the name is
> numeric it might clash. Similarly for inactive vms via qemu-img.


--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

  reply	other threads:[~2018-10-11 12:06 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-06 11:11 [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore Daniel Henrique Barboza
2018-09-06 11:11 ` [Qemu-devel] [PATCH v2 1/3] block/snapshot.c: eliminate use of ID input in snapshot operations Daniel Henrique Barboza
2018-09-06 11:11 ` [Qemu-devel] [PATCH v2 2/3] block/snapshot: remove bdrv_snapshot_delete_by_id_or_name Daniel Henrique Barboza
2018-09-06 11:11 ` [Qemu-devel] [PATCH v2 3/3] qcow2-snapshot: remove redundant find_snapshot_by_id_and_name call Daniel Henrique Barboza
2018-10-08 18:13 ` [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore Daniel Henrique Barboza
     [not found] ` <20180921122954.GD2842@work-vm>
     [not found]   ` <355a1147-c0d0-88fc-7b68-4391bab25c54@gmail.com>
2018-10-09 17:34     ` Markus Armbruster
2018-10-10  7:56       ` [Qemu-devel] [libvirt] " Peter Krempa
2018-10-11 12:06         ` Dr. David Alan Gilbert [this message]
2018-10-11 12:35           ` Peter Krempa
2019-01-09 14:10 ` [Qemu-devel] " Max Reitz
2019-01-09 14:21   ` Kevin Wolf
2019-01-09 14:27     ` Max Reitz
2019-01-09 14:48       ` Kevin Wolf
2019-01-09 14:54         ` Max Reitz
2019-01-09 15:13           ` Kevin Wolf
2019-01-09 15:25             ` Dr. David Alan Gilbert
2019-01-09 16:25             ` Markus Armbruster
2019-01-09 16:27             ` Max Reitz
2019-01-09 16:45               ` Kevin Wolf
2019-01-09 16:58                 ` Max Reitz
2019-01-09 18:19     ` Daniel Henrique Barboza
2019-01-09 16:57   ` Daniel Henrique Barboza
2019-01-09 17:05     ` Max Reitz
2019-01-09 17:20       ` Kevin Wolf
2019-01-09 17:38         ` Max Reitz
2019-01-09 17:52           ` Eric Blake
2019-01-09 19:00             ` Kevin Wolf
2019-01-11 12:35             ` Max Reitz
2019-01-09 17:55           ` Eric Blake
2019-01-09 18:51             ` Kevin Wolf
2019-01-09 19:02               ` Eric Blake
2019-01-10 11:25                 ` Kevin Wolf
2019-01-10 11:41                   ` Dr. David Alan Gilbert
2019-01-10 13:03                     ` Daniel Henrique Barboza
2019-01-10 15:11                     ` Kevin Wolf
2019-01-10 17:06                       ` Dr. David Alan Gilbert
2019-01-10 18:22                         ` Eric Blake
2019-01-11 12:14                           ` Kevin Wolf
2019-01-11 13:22                     ` Max Reitz
2019-01-11 14:33                       ` Kevin Wolf
2019-01-11 15:23                         ` Max Reitz
2019-01-09 17:32       ` Daniel Henrique Barboza
2019-01-09 17:07     ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181011120613.GI2483@work-vm \
    --to=dgilbert@redhat.com \
    --cc=armbru@redhat.com \
    --cc=danielhb413@gmail.com \
    --cc=kwolf@redhat.com \
    --cc=libvir-list@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=muriloo@linux.ibm.com \
    --cc=pkrempa@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.