From: Anthony Liguori <anthony@codemonkey.ws>
To: Juan Quintela <quintela@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>,
qemu-devel@nongnu.org,
Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
Luiz Capitulino <lcapitulino@redhat.com>
Subject: [Qemu-devel] Re: [CFR 6/10] cont command
Date: Wed, 16 Jun 2010 12:05:15 -0500
Message-ID: <4C19044B.6010602@codemonkey.ws>
In-Reply-To: <m3iq5jx884.fsf@trasno.mitica>

On 06/16/2010 11:17 AM, Juan Quintela wrote:
> Anthony Liguori <anthony@codemonkey.ws> wrote:
>
>> On 06/16/2010 08:11 AM, Juan Quintela wrote:
>>
>
>> It's only ensured if you've got the same disk image running on another
>> machine. Considering that we support migrating from a file and we
>> support migrating block devices, I don't think it's practical.
>>
>>
>>> - outgoing migration
>>>
>>> After a successful migration, we can issue the "cont" command on the
>>> source and have source and target running at the same time -> disk
>>> corruption again.
>>>
>>> My suggestion:
>>> - add a third state "incoming", and cont/stop don't work on that state
>>> - add a fourth state "migrated", and "cont" gives an explicit error, and you
>>> have to run "cont --force" or "cont" twice (whatever) to get it to continue.
>>>
>>>
>> Very few users are going to do manual migration like this and those
>> that do have no good reason to execute cont in either of these
>> scenarios.
>>
> As of today, libvirt uses it (guess who filed that bug with me).
>
libvirt is not a human so I fail to see how forcing it to use a --force
option would help them.

Either we didn't document migration well enough or their developers are
not careful enough. Considering our lack of documentation, I'm sure it
was the former.

>> A --force command like this is equivalent to popping up a
>> message box saying "are you sure you really want to do this" which
>> most users find to be extremely annoying.
>>
> I had to debug this one from testers/field. They were testing things
> and it was very "practical" to launch a guest on machine A, configure
> whatever they wanted, migrate to machine B, test whatever on machine B,
> go back to machine A, and continue.
>
Honestly, that's a terrible testing strategy. You cannot just execute
random commands and hope nothing bad happens.

> You can guess what happened. The problem here is that qemu is not
> giving the user even _minimal_ advice that something could go wrong. And
> it is not just that it could go wrong, it is going to cause disk
> corruption for sure :(
>
>
>> We should try to inform users when it's likely that they'll stumble
>> upon a dangerous action. cache=volatile is a good example of this
>> because a user could have used it pretty easily and it's a reasonable
>> expectation that we wouldn't expose a feature that could lead to
>> corruption in obscure cases.
>>
> This is not _so_ obscure if you run qemu by hand :(
> you have a nice "(qemu)" prompt, and if you issue "cont", bad things happen.
>
And if you issue system_reset, quit, commit, loadvm, pci_del, or any
number of other commands, bad things can happen, including some form of
data loss or corruption.

IMHO, there's a significant difference between twiddling something where
there is a reasonable expectation that the impact is only going to be
related to performance (like -smp X, -m X, or cache=X) and just trying
random things.

>> If a user executes cont in either of these scenarios and has two
>> copies of a virtual machine running accessing the same resources, then
>> they surely ought to expect bad behavior.
>>
> It is not _so_ easy O:-).
> Consider the example that I showed you:
>
> (host A)                     (host B)
> launch qemu                  launch qemu -incoming
> migrate host B
>                              .....
>                              do your things
>                              exit/poweroff/...
>
> At this point you have a qemu launched on machine A, with nothing on
> machine B. Running "cont" on machine A has disastrous consequences,
> and there is no way to prevent it :(
>
If there was a reasonable belief that it wouldn't result in disaster, I
would fully support you. However, I can't think of any rational reason
why someone would do this. I can't think of a better analogy to
shooting yourself in the foot.

> As I have received this bug from users a couple of times, I would like
> to be able to prevent this case.
>
I've never seen anyone run into this before. Can you show me a bug
report? I'd love to see how someone expected this to behave.

Regards,

Anthony Liguori

> Later, Juan.
>
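
For readers following the thread, the state proposal quoted above (a third "incoming" state where cont/stop are refused, and a fourth "migrated" state where cont fails unless explicitly forced) can be sketched as a small state machine. This is an illustration only, not QEMU code: the names RunState, ContError, cont(), stop() and the force parameter are all hypothetical and were chosen just to mirror the wording of the proposal.

```python
from enum import Enum, auto


class RunState(Enum):
    """Hypothetical VM run states; the last two are the ones proposed in the thread."""
    RUNNING = auto()
    STOPPED = auto()
    INCOMING = auto()   # started with -incoming, still waiting for migration data
    MIGRATED = auto()   # outgoing migration finished; the guest now lives elsewhere


class ContError(Exception):
    """Raised when a cont/stop request is refused in the current state."""


def cont(state: RunState, force: bool = False) -> RunState:
    """Return the state after a 'cont' request, or raise if it is refused."""
    if state is RunState.INCOMING:
        # Proposal: cont does not work at all while waiting for incoming migration.
        raise ContError("cont refused: waiting for incoming migration")
    if state is RunState.MIGRATED and not force:
        # Proposal: explicit error unless the user insists.
        raise ContError("cont refused: guest was migrated away; force to override")
    return RunState.RUNNING


def stop(state: RunState) -> RunState:
    """Return the state after a 'stop' request, or raise if it is refused."""
    if state is RunState.INCOMING:
        raise ContError("stop refused: waiting for incoming migration")
    return RunState.STOPPED
```

In the host A / host B scenario from the thread, host A would sit in the MIGRATED state after the migration, so a plain cont(RunState.MIGRATED) raises ContError while cont(RunState.MIGRATED, force=True) resumes the guest. Whether that extra step is a useful safeguard or just an annoyance is exactly the point under discussion.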