From: Anthony Liguori <anthony@codemonkey.ws>
To: Juan Quintela <quintela@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>,
qemu-devel@nongnu.org,
Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
Luiz Capitulino <lcapitulino@redhat.com>
Subject: [Qemu-devel] Re: [CFR 6/10] cont command
Date: Wed, 16 Jun 2010 12:05:15 -0500
Message-ID: <4C19044B.6010602@codemonkey.ws>
In-Reply-To: <m3iq5jx884.fsf@trasno.mitica>

On 06/16/2010 11:17 AM, Juan Quintela wrote:
> Anthony Liguori<anthony@codemonkey.ws> wrote:
>
>> On 06/16/2010 08:11 AM, Juan Quintela wrote:
>>
>
>> It's only ensured if you've got the same disk image running on another
>> machine. Considering that we support migrating from a file and we
>> support migrating block devices, I don't think it's practical.
>>
>>
>>> - outgoing migration
>>>
>>> After a successful migration, we can issue the "cont" command on the
>>> source and have source and target running at the same time -> disk
>>> corruption again.
>>>
>>> My suggestion:
>>> - add a third state "incoming", and cont/stop don't work on that state
>>> - add a fourth state "migrated", and "cont" gives an explicit error, and you
>>> have to run "cont --force" or "cont" twice (whatever) to get it to continue.
>>>
>>>
>> Very few users are going to do manual migration like this and those
>> that do have no good reason to execute cont in either of these
>> scenarios.
>>
> As of today, libvirt uses it (guess who filed that bug with me).
>
libvirt is not a human so I fail to see how forcing it to use a --force
option would help them.

Either we didn't document migration well enough or their developers are
not careful enough. Considering our lack of documentation, I'm sure it
was the former.
>> A --force command like this is equivalent to popping up a
>> message box saying "are you sure you really want to do this" which
>> most users find to be extremely annoying.
>>
> I had to debug this one from testers/field. They were testing things
> and it was very "practical" to launch a guest on machine A, configure
> whatever they wanted, migrate to machine B, test whatever on machine B,
> go back to machine A, and continue.
>
Honestly, that's a terrible testing strategy. You cannot just execute
random commands and hope nothing bad happens.
> You can guess what happened. The problem here is that qemu is not
> giving the user the _minimal_ advice that something could go wrong. And
> it is not just that it could go wrong: it is going to cause disk
> corruption for sure :(
>
>
>> We should try to inform users when it's likely that they'll stumble
>> upon a dangerous action. cache=volatile is a good example of this
>> because a user could have used it pretty easily and it's a reasonable
>> expectation that we wouldn't expose a feature that could lead to
>> corruption in obscure cases.
>>
> This is not _so_ obscure if you run qemu by hand :(
> you have a nice "(qemu)" prompt, and if you issue "cont", bad things happen.
>
And if you issue system_reset, quit, commit, loadvm, pci_del, or any
number of other commands, bad things can happen, including some form of
data loss or corruption.

IMHO, there's a significant difference between twiddling something where
there is a reasonable expectation that the impact is only going to be
related to performance (like -smp X, -m X, or cache=X) and just trying
random things.
>> If a user executes cont in either of these scenarios and has two
>> copies of a virtual machine running accessing the same resources, then
>> they surely ought to expect bad behavior.
>>
> It is not _so_ easy O:-).
> Consider the example that I showed you:
>
> (host A)                    (host B)
> launch qemu                 launch qemu -incoming
> migrate host B
>                             .....
>                             do your things
>                             exit/poweroff/...
>
> At this point you have a qemu launched on machine A, with nothing on
> machine B. Running "cont" on machine A has disastrous consequences,
> and there is no way to prevent it :(
>
If there was a reasonable belief that it wouldn't result in disaster, I
would fully support you. However, I can't think of any rational reason
why someone would do this. I can't think of a better analogy to
shooting yourself in the foot.
> As I have received this bug from users a couple of times, I would like
> to be able to prevent this case.
>
I've never seen anyone run into this before. Can you show me a bug
report? I'd love to see how someone expected this to behave.
Regards,
Anthony Liguori
> Later, Juan.
>
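
For reference, the guard Juan proposes earlier in this message (an
"incoming" state in which cont is refused, and a "migrated" state in
which cont fails unless explicitly forced) could be modelled roughly as
below. This is only a sketch in Python, not QEMU code: the names
VMState, ContError, and cont(), and the force parameter, are
hypothetical and chosen for illustration.

```python
from enum import Enum, auto


class VMState(Enum):
    """Hypothetical run states; INCOMING and MIGRATED are the proposed additions."""
    RUNNING = auto()
    STOPPED = auto()
    INCOMING = auto()   # started with -incoming, migration not finished yet
    MIGRATED = auto()   # outgoing migration completed; the target may own the disk


class ContError(Exception):
    """Raised when a 'cont' request is refused in the current state."""


def cont(state: VMState, force: bool = False) -> VMState:
    """Return the state after a 'cont' request, or raise ContError."""
    if state is VMState.INCOMING:
        # Proposal: cont/stop do not work at all in the incoming state.
        raise ContError("cont refused: incoming migration pending")
    if state is VMState.MIGRATED and not force:
        # Proposal: explicit error unless the caller overrides it.
        raise ContError("cont refused: guest was migrated away (force to override)")
    return VMState.RUNNING
```

In this model a plain cont from STOPPED behaves as before, and only the
two migration-related states change behaviour. Whether such a guard is
worth having is exactly what is disputed in this thread.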