qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Error handling for KVM_GET_DIRTY_LOG
@ 2017-02-16 14:51 Janosch Frank
  2017-02-20  8:05 ` Christian Borntraeger
  2017-02-20 13:46 ` Paolo Bonzini
  0 siblings, 2 replies; 5+ messages in thread
From: Janosch Frank @ 2017-02-16 14:51 UTC
  To: qemu-devel; +Cc: Paolo Bonzini

While trying to fix a bug in the s390 migration code, I noticed that
QEMU ignores practically all errors returned from the KVM_GET_DIRTY_LOG
VM ioctl. QEMU behaves as specified in the KVM API documentation and
only treats -1 (-EPERM) as an error.

Unfortunately the documentation is wrong/old and KVM may return -EFAULT,
-EINVAL, -ENOTSUPP (BookE) and -ENOENT. This bugs me, as I found a case
where I want to return -EFAULT because of guest memory problems and QEMU
will still happily migrate the VM.
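
To illustrate the problem, here is a minimal sketch. It assumes the
ioctl wrapper returns 0 on success and a negated errno value on failure;
fake_vm_ioctl(), failed_narrow() and failed_any() are made-up names for
this mail, not QEMU functions, and nothing here talks to KVM. A check
against -1 only catches -EPERM, while every other error code slips
through:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a VM ioctl wrapper that returns 0 on
 * success and -errno on failure. It does not talk to KVM.
 */
static int fake_vm_ioctl(int simulated_errno)
{
    assert(simulated_errno >= 0);
    return -simulated_errno;
}

/* Narrow check: only -1, which is numerically -EPERM on Linux. */
static bool failed_narrow(int ret)
{
    return ret == -1;
}

/* Broad check: any negative return value counts as a failure. */
static bool failed_any(int ret)
{
    return ret < 0;
}
```

With the narrow check, a simulated -EFAULT or -EINVAL is treated like
success, which matches the behaviour described above.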

I currently don't see a reason why we continue to migrate on EFAULT and
EINVAL. But returning -error from kvm_physical_sync_dirty_bitmap might
also be a bit hard, as it kills QEMU.

Do we want to fix this, and if so, how do we want it done?
If not, we at least have a definitive mail to point to when the next one
comes around. I also have a KVM patch to update the API documentation if
wanted (maybe we should dust that off a bit anyhow).


This was first brought up in 2009 [1], and it was more or less fixed and
then reverted in 2014 [2].

The reason in [1] was that PPC hadn't settled yet on a valid return code.

In [2] it was too close to the v2.0 release to handle it properly.


[1] https://lists.nongnu.org/archive/html/qemu-devel/2009-07/msg01772.html

[2] https://lists.nongnu.org/archive/html/qemu-devel/2014-04/msg01993.html


Cheers,
Janosch


* Re: [Qemu-devel] Error handling for KVM_GET_DIRTY_LOG
  2017-02-16 14:51 [Qemu-devel] Error handling for KVM_GET_DIRTY_LOG Janosch Frank
@ 2017-02-20  8:05 ` Christian Borntraeger
  2017-02-20  9:39   ` Dr. David Alan Gilbert
  2017-02-20 13:46 ` Paolo Bonzini
  1 sibling, 1 reply; 5+ messages in thread
From: Christian Borntraeger @ 2017-02-20  8:05 UTC
  To: Janosch Frank, qemu-devel; +Cc: Paolo Bonzini

On 02/16/2017 03:51 PM, Janosch Frank wrote:
> While trying to fix a bug in the s390 migration code, I noticed that
> QEMU ignores practically all errors returned from that VM ioctl. QEMU
> behaves as specified in the KVM api and only processes -1 (-EPERM) as an
> error.
> 
> Unfortunately the documentation is wrong/old and KVM may return -EFAULT,
> -EINVAL, -ENOTSUPP (BookE) and -ENOENT. This bugs me, as I found a case
> where I want to return -EFAULT because of guest memory problems and QEMU
> will still happily migrate the VM.
> 
> I currently don't see a reason why we continue to migrate on EFAULT and
> EINVAL. But returning -error from kvm_physical_sync_dirty_bitmap might
> also a bit hard, as it kills QEMU.
> 
> Do we want to fix this and if, how do we want it done?
> If not we at least have a definitive mail to point to when the next one
> comes around. I also have a KVM patch to update the api documentation if
> wanted (maybe we should dust that off a bit anyhow).

I think we want to handle _ALL_ errors of that ioctl. Instead of aborting
QEMU, we might just want to abort the migration in that case?
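
A rough sketch of what "abort the migration, not QEMU" could look like.
All names here (struct mig, mig_handle_sync()) are invented for
illustration and are not the QEMU migration code; the sync result is
assumed to be 0 or a negated errno value:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum mig_status { MIG_ACTIVE, MIG_FAILED };

/* Hypothetical migration bookkeeping, not QEMU's MigrationState. */
struct mig {
    enum mig_status status;
    int error;              /* 0 or a negative errno value */
    bool guest_running;
};

/* sync_ret is the result of a dirty log sync: 0 or -errno. */
static void mig_handle_sync(struct mig *m, int sync_ret)
{
    assert(sync_ret <= 0);
    if (sync_ret < 0 && m->status == MIG_ACTIVE) {
        m->status = MIG_FAILED;   /* stop migrating */
        m->error = sync_ret;      /* remember the first error */
        /* guest_running is left untouched: no abort() here */
    }
}
```

The point of the sketch is only that the error is recorded on the
migration and the source guest is not touched.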


> 
> 
> This has been brought up in 2009 [1] the first time and was more or less
> fixed and then reverted in 2014 [2].
> 
> The reason in [1] was that PPC hadn't settled yet on a valid return code.
> 
> In [2] it was too close to the v2 to handle it properly.
> 
> 
> [1] https://lists.nongnu.org/archive/html/qemu-devel/2009-07/msg01772.html
> 
> [2] https://lists.nongnu.org/archive/html/qemu-devel/2014-04/msg01993.html

So back then it was just too close to 2.0 and should have been revisited for
2.1. Let's now fix it for 2.9?


* Re: [Qemu-devel] Error handling for KVM_GET_DIRTY_LOG
  2017-02-20  8:05 ` Christian Borntraeger
@ 2017-02-20  9:39   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 5+ messages in thread
From: Dr. David Alan Gilbert @ 2017-02-20  9:39 UTC
  To: Christian Borntraeger; +Cc: Janosch Frank, qemu-devel, Paolo Bonzini

* Christian Borntraeger (borntraeger@de.ibm.com) wrote:
> On 02/16/2017 03:51 PM, Janosch Frank wrote:
> > While trying to fix a bug in the s390 migration code, I noticed that
> > QEMU ignores practically all errors returned from that VM ioctl. QEMU
> > behaves as specified in the KVM api and only processes -1 (-EPERM) as an
> > error.
> > 
> > Unfortunately the documentation is wrong/old and KVM may return -EFAULT,
> > -EINVAL, -ENOTSUPP (BookE) and -ENOENT. This bugs me, as I found a case
> > where I want to return -EFAULT because of guest memory problems and QEMU
> > will still happily migrate the VM.
> > 
> > I currently don't see a reason why we continue to migrate on EFAULT and
> > EINVAL. But returning -error from kvm_physical_sync_dirty_bitmap might
> > also a bit hard, as it kills QEMU.
> > 
> > Do we want to fix this and if, how do we want it done?
> > If not we at least have a definitive mail to point to when the next one
> > comes around. I also have a KVM patch to update the api documentation if
> > wanted (maybe we should dust that off a bit anyhow).
> 
> I think we want to handle _ALL_ error of that ioctl. Instead of aborting
> QEMU we might just want to abort the migration  in that case?

Yes, I don't see any reason to kill the source guest.

> > This has been brought up in 2009 [1] the first time and was more or less
> > fixed and then reverted in 2014 [2].
> > 
> > The reason in [1] was that PPC hadn't settled yet on a valid return code.
> > 
> > In [2] it was too close to the v2 to handle it properly.
> > 
> > 
> > [1] https://lists.nongnu.org/archive/html/qemu-devel/2009-07/msg01772.html
> > 
> > [2] https://lists.nongnu.org/archive/html/qemu-devel/2014-04/msg01993.html
> 
> So back then it was just too close to 2.0 and should have been revisited for 
> 2.1. Lets now fix it for 2.9?

Yes

Dave

> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] Error handling for KVM_GET_DIRTY_LOG
  2017-02-16 14:51 [Qemu-devel] Error handling for KVM_GET_DIRTY_LOG Janosch Frank
  2017-02-20  8:05 ` Christian Borntraeger
@ 2017-02-20 13:46 ` Paolo Bonzini
  2017-02-20 14:33   ` Janosch Frank
  1 sibling, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2017-02-20 13:46 UTC
  To: Janosch Frank, qemu-devel



On 16/02/2017 15:51, Janosch Frank wrote:
> While trying to fix a bug in the s390 migration code, I noticed that
> QEMU ignores practically all errors returned from that VM ioctl. QEMU
> behaves as specified in the KVM api and only processes -1 (-EPERM) as an
> error.
> 
> Unfortunately the documentation is wrong/old and KVM may return -EFAULT,
> -EINVAL, -ENOTSUPP (BookE) and -ENOENT. This bugs me, as I found a case
> where I want to return -EFAULT because of guest memory problems and QEMU
> will still happily migrate the VM.

Guest memory problems should not return EFAULT, which corresponds to a
wrong address passed to KVM_GET_DIRTY_LOG.  In fact, EFAULT is probably
the only case where an assertion is warranted: if you passed a wrong
pointer to KVM_GET_DIRTY_LOG, who knows who else is going to get that
pointer.

ENOENT and EINVAL should not kill the source guest, though they should
terminate migration.  But then I would like to know more about this
case, because they should never happen unless KVMMemoryListener is buggy.
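
As a sketch of that split, assuming the ioctl result arrives as 0 or a
negated errno value. The enum and classify_dirty_log_ret() are
hypothetical names, not existing QEMU code, and the default branch for
error codes not discussed above is my assumption:

```c
#include <assert.h>
#include <errno.h>

enum dirty_log_action {
    DIRTY_LOG_OK,             /* nothing to do */
    DIRTY_LOG_FAIL_MIGRATION, /* stop migrating, keep the guest alive */
    DIRTY_LOG_ASSERT          /* caller bug, an assertion is warranted */
};

/* ret is 0 or a negative errno value from the dirty log ioctl. */
static enum dirty_log_action classify_dirty_log_ret(int ret)
{
    assert(ret <= 0);
    switch (-ret) {
    case 0:
        return DIRTY_LOG_OK;
    case EFAULT:
        /* Wrong address passed to the ioctl. */
        return DIRTY_LOG_ASSERT;
    case ENOENT:
    case EINVAL:
        return DIRTY_LOG_FAIL_MIGRATION;
    default:
        /* Assumption: treat unknown errors like ENOENT/EINVAL. */
        return DIRTY_LOG_FAIL_MIGRATION;
    }
}
```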

Paolo



* Re: [Qemu-devel] Error handling for KVM_GET_DIRTY_LOG
  2017-02-20 13:46 ` Paolo Bonzini
@ 2017-02-20 14:33   ` Janosch Frank
  0 siblings, 0 replies; 5+ messages in thread
From: Janosch Frank @ 2017-02-20 14:33 UTC
  To: Paolo Bonzini, qemu-devel; +Cc: Christian Borntraeger

On 20.02.2017 14:46, Paolo Bonzini wrote:
> 
> 
> On 16/02/2017 15:51, Janosch Frank wrote:
>> While trying to fix a bug in the s390 migration code, I noticed that
>> QEMU ignores practically all errors returned from that VM ioctl. QEMU
>> behaves as specified in the KVM api and only processes -1 (-EPERM) as an
>> error.
>>
>> Unfortunately the documentation is wrong/old and KVM may return -EFAULT,
>> -EINVAL, -ENOTSUPP (BookE) and -ENOENT. This bugs me, as I found a case
>> where I want to return -EFAULT because of guest memory problems and QEMU
>> will still happily migrate the VM.
> 
> Guest memory problems should not return EFAULT, which corresponds to a
> wrong address passed to KVM_GET_DIRTY_LOG.  In fact, EFAULT is probably
> the only case where an assertion is warranted---just like you passed a
> wrong pointer to KVM_GET_DIRTY_LOG, who knows who else is going to get
> that pointer.
> 
> ENOENT and EINVAL should not kill the source guest, though they should
> terminate migration.  But then I would like to know more about this
> case, because they should never happen unless KVMMemoryListener is buggy.

It is currently possible to start a hugetlbfs guest on s390 although we
don't have any huge page support. When QEMU starts the VM, it will get a
lot of errors back and pause the VM. When this VM is then migrated, the
host will do PTE dirty handling on huge pages in
kvm_s390_sync_dirty_log/test_and_clear_guest_dirty.

Running into such a huge page would be a guest memory error, so EINVAL
it is.

I'll post the patches in a bit to give a bit more context.


