[Qemu-devel] Re: [CFR 6/10] cont command

From: Anthony Liguori <anthony@codemonkey.ws>
To: Juan Quintela <quintela@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>,
	qemu-devel@nongnu.org,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
	Luiz Capitulino <lcapitulino@redhat.com>
Subject: [Qemu-devel] Re: [CFR 6/10] cont command
Date: Wed, 16 Jun 2010 12:05:15 -0500
Message-ID: <4C19044B.6010602@codemonkey.ws>
In-Reply-To: <m3iq5jx884.fsf@trasno.mitica>

On 06/16/2010 11:17 AM, Juan Quintela wrote:
> Anthony Liguori <anthony@codemonkey.ws> wrote:
>    
>> On 06/16/2010 08:11 AM, Juan Quintela wrote:
>>      
>    
>> It's only ensured if you've got the same disk image running on another
>> machine.  Considering that we support migrating from a file and we
>> support migrating block devices, I don't think it's practical.
>>
>>      
>>> - outgoing migration
>>>
>>> After a successful migration, we can issue the "cont" command on the source,
>>> leaving source and target running at the same time -> disk corruption
>>> again.
>>>
>>> My suggestion:
>>> - add a third state "incoming", and cont/stop don't work in that state
>>> - add a fourth state "migrated", where "cont" gives an explicit error, and you
>>>     have to run "cont --force" or "cont" twice (whatever) to get it to continue.
>>>
>>>        
>> Very few users are going to do manual migration like this and those
>> that do have no good reason to execute cont in either of these
>> scenarios.
>>      
> As of today, libvirt uses it (guess who filed that bug with me).
>    

libvirt is not a human so I fail to see how forcing it to use a --force 
option would help them.

Either we didn't document migration well enough or their developers are 
not careful enough.  Considering our lack of documentation, I'm sure it 
was the former.
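
For readers following the thread, the proposal quoted above (an "incoming"
state in which cont/stop are rejected, and a "migrated" state in which "cont"
needs an explicit override) can be modelled roughly as below. This is only a
sketch in Python, not QEMU code: the names RunState, Guest, CommandRefused and
migration_completed are made up for illustration, and the thread does not
settle whether the override would be "cont --force" or a repeated "cont", so a
force flag is assumed here.

```python
from enum import Enum, auto


class RunState(Enum):
    RUNNING = auto()
    STOPPED = auto()
    INCOMING = auto()   # hypothetical: waiting for an incoming migration
    MIGRATED = auto()   # hypothetical: outgoing migration has completed


class CommandRefused(Exception):
    """Raised when a monitor command is rejected in the current state."""


class Guest:
    def __init__(self, state=RunState.STOPPED):
        self.state = state

    def migration_completed(self):
        # After a successful outgoing migration the target owns the disk.
        self.state = RunState.MIGRATED

    def stop(self):
        if self.state is RunState.INCOMING:
            raise CommandRefused("stop is not allowed while incoming")
        if self.state is RunState.RUNNING:
            self.state = RunState.STOPPED

    def cont(self, force=False):
        if self.state is RunState.INCOMING:
            raise CommandRefused("cont is not allowed while incoming")
        if self.state is RunState.MIGRATED and not force:
            # Explicit error instead of silently resuming the source.
            raise CommandRefused("guest was migrated; use force to continue")
        self.state = RunState.RUNNING
```

With this model, a plain cont() on a migrated guest raises CommandRefused and
leaves the state unchanged, while cont(force=True) resumes it. Whether such a
check helps a management tool like libvirt, rather than a human at the
"(qemu)" prompt, is exactly what is disputed in this message.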

>>   A --force command like this is equivalent to popping up a
>> message box saying "are you sure you really want to do this" which
>> most users find to be extremely annoying.
>>      
> I had to debug this one from testers and the field.  They were testing things,
> and it was very "practical" to launch a guest on machine A, configure
> whatever they wanted, migrate to machine B, test whatever on machine B,
> then go back to machine A and continue.
>    

Honestly, that's a terrible testing strategy.  You cannot just execute 
random commands and hope nothing bad happens.

> You can guess what happened.  The problem here is that qemu is not
> giving the user even a _minimal_ warning that something could go wrong.  And
> it is not just that it could go wrong: it is going to cause disk corruption
> for sure :(
>
>    
>> We should try to inform users when it's likely that they'll stumble
>> upon a dangerous action.  cache=volatile is a good example of this
>> because a user could have used it pretty easily and it's a reasonable
>> expectation that we wouldn't expose a feature that could lead to
>> corruption in obscure cases.
>>      
> This is not _so_ obscure if you run qemu by hand :(
> You have a nice "(qemu)" prompt, and if you issue "cont", bad things happen.
>    

And if you issue system_reset, quit, commit, loadvm, pci_del, or any set 
of commands, bad things can happen, including some form of data loss or 
corruption.

IMHO, there's a significant difference between twiddling something where 
there is a reasonable expectation that the impact is only going to be 
related to performance (like -smp X, -m X, or cache=X) and just trying 
random things.

>> If a user executes cont in either of these scenarios and has two
>> copies of a virtual machine running accessing the same resources, then
>> they surely ought to expect bad behavior.
>>      
> It is not _so_ easy O:-).
> Consider the example that I showed you:
>
> (host A)                 (host B)
> launch qemu              launch qemu -incoming
> migrate to host B
>                          .....
>                          do your things
>                          exit/poweroff/...
>
> At this point you have a qemu launched on machine A, with nothing on
> machine B.  Running "cont" on machine A has disastrous consequences,
> and there is no way to prevent it :(
>    

If there was a reasonable belief that it wouldn't result in disaster, I 
would fully support you.  However, I can't think of any rational reason 
why someone would do this.  I can't think of a better analogy to 
shooting yourself in the foot.
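
To make the failure in the quoted host A / host B example concrete, here is a
toy model in Python of why resuming the source is destructive. It assumes a
guest that keeps a single piece of cached state in RAM, the number of the next
free disk block; a real guest caches far more (file system metadata, page
cache), so this understates the problem. Disk, GuestRAM, append and
replay_scenario are illustrative names, not anything from QEMU.

```python
class Disk:
    """Disk image shared by source and target."""

    def __init__(self):
        self.blocks = {}  # block number -> data


class GuestRAM:
    """Guest memory; survives "stop" and is copied by migration."""

    def __init__(self, next_free=0):
        self.next_free = next_free  # cached idea of the first free block


def append(disk, ram, data):
    # The guest trusts its cached pointer and never re-reads the disk.
    disk.blocks[ram.next_free] = data
    ram.next_free += 1


def replay_scenario():
    disk = Disk()

    # Host A: launch qemu, do some work.
    ram_a = GuestRAM()
    append(disk, ram_a, "written on A")

    # Migrate to host B: B receives a copy of the RAM, A keeps its old copy.
    ram_b = GuestRAM(ram_a.next_free)
    append(disk, ram_b, "written on B")
    # Host B exits; only the stopped qemu on host A is left.

    # "cont" on host A: its RAM still claims block 1 is free.
    append(disk, ram_a, "written on A after cont")
    return disk.blocks
```

Calling replay_scenario() returns a disk in which block 1 holds "written on A
after cont": the data written on host B has been silently overwritten, even
though the two guests never ran at the same time.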

> As I have received this bug from users a couple of times, I would like
> to be able to prevent this case.
>    

I've never seen anyone run into this before.  Can you show me a bug 
report?  I'd love to see how someone expected this to behave.

Regards,

Anthony Liguori

> Later, Juan.
>
