qemu-devel.nongnu.org archive mirror
* [PATCH RFC] memory: pause all vCPUs for the duration of memory transactions
@ 2020-10-26  8:49 Vitaly Kuznetsov
  2020-10-26 10:43 ` David Hildenbrand
  2020-11-02 19:57 ` Peter Xu
  0 siblings, 2 replies; 19+ messages in thread
From: Vitaly Kuznetsov @ 2020-10-26  8:49 UTC
  To: qemu-devel
  Cc: Paolo Bonzini, Laszlo Ersek, Eduardo Habkost, Peter Xu,
	Dr. David Alan Gilbert

Currently, KVM doesn't provide an API to make atomic updates to the memory
map when the change touches more than one memory slot, e.g. when we'd like
to punch a hole in an existing slot.
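
To illustrate why a hole punch cannot be a single slot update, here is a
minimal sketch that splits one slot into the updates userspace would have to
issue one by one. The struct toy_memslot and the helper plan_hole_punch() are
hypothetical names made up for this illustration; they are not KVM or QEMU
APIs. A size of 0 stands for "delete the slot", which is the convention
KVM_SET_USER_MEMORY_REGION uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a KVM memory slot description. */
struct toy_memslot {
    uint32_t id;
    uint64_t gpa;   /* guest physical start address */
    uint64_t size;  /* bytes; 0 means "delete this slot" */
};

/*
 * Fill 'ops' with the slot updates needed to cut the range
 * [hole_gpa, hole_gpa + hole_size) out of 'old'. 'spare_id' is an unused
 * slot id for the part above the hole. Returns the number of updates, or 0
 * if the hole is empty or not fully inside the slot.
 */
size_t plan_hole_punch(struct toy_memslot old, uint64_t hole_gpa,
                       uint64_t hole_size, uint32_t spare_id,
                       struct toy_memslot ops[3])
{
    uint64_t old_end = old.gpa + old.size;
    uint64_t hole_end = hole_gpa + hole_size;
    size_t n = 0;

    assert(ops != NULL);
    if (hole_size == 0 || hole_end < hole_gpa ||
        hole_gpa < old.gpa || hole_end > old_end) {
        return 0;
    }

    /* Step 1: the existing slot has to be deleted first. */
    ops[n++] = (struct toy_memslot){ old.id, old.gpa, 0 };

    /* Step 2: re-create the part below the hole, if there is one. */
    if (hole_gpa > old.gpa) {
        ops[n++] = (struct toy_memslot){ old.id, old.gpa, hole_gpa - old.gpa };
    }

    /* Step 3: create the part above the hole, if there is one. */
    if (hole_end < old_end) {
        ops[n++] = (struct toy_memslot){ spare_id, hole_end, old_end - hole_end };
    }
    return n;
}
```

In this model a hole in the middle of a slot takes three separate updates, and
trimming the top of a slot still takes two. Other vCPUs can observe every
intermediate state.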

Reports are that multi-CPU Q35 VMs booted with OVMF sometimes print something
like

!!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000003 !!!!
ExceptionData - 0000000000000010  I:1 R:0 U:0 W:0 P:0 PK:0 SS:0 SGX:0
RIP  - 000000007E35FAB6, CS  - 0000000000000038, RFLAGS - 0000000000010006
RAX  - 0000000000000000, RCX - 000000007E3598F2, RDX - 00000000078BFBFF
...

The problem seems to be that TSEG manipulations on one vCPU are not atomic
from the other vCPUs' point of view. In particular, here's the strace:

Initial creation of the 'problematic' slot:

10085 ioctl(13, KVM_SET_USER_MEMORY_REGION, {slot=6, flags=0, guest_phys_addr=0x100000,
   memory_size=2146435072, userspace_addr=0x7fb89bf00000}) = 0

... and then the update (caused by e.g. mch_update_smram()) later:

10090 ioctl(13, KVM_SET_USER_MEMORY_REGION, {slot=6, flags=0, guest_phys_addr=0x100000,
   memory_size=0, userspace_addr=0x7fb89bf00000}) = 0
10090 ioctl(13, KVM_SET_USER_MEMORY_REGION, {slot=6, flags=0, guest_phys_addr=0x100000,
   memory_size=2129657856, userspace_addr=0x7fb89bf00000}) = 0

If KVM has to handle any event on a different vCPU in between these two
calls, i.e. while the slot is deleted, the #PF gets triggered.
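
The window can be replayed with the numbers from the strace. The sketch below
is a toy model, not KVM code: struct toy_slot and the toy_*() helpers are
hypothetical, and it assumes the RIP from the exception dump can be compared
directly against guest physical addresses (identity mapping), which the report
does not state. It shows that the update shrinks the slot by 16 MiB at the
top, and that the faulting RIP is backed by the slot both before and after the
update but not while the slot is deleted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the strace and the exception dump. */
#define SLOT_GPA      0x100000ULL
#define SLOT_SIZE_OLD 2146435072ULL  /* initial creation */
#define SLOT_SIZE_NEW 2129657856ULL  /* after the update */
#define FAULT_RIP     0x7E35FAB6ULL  /* assumed to equal a guest physical address */

/* Hypothetical model of a single memory slot; size 0 means "deleted". */
struct toy_slot {
    uint64_t gpa;
    uint64_t size;
};

/* True if guest physical address 'addr' is backed by the slot. */
bool toy_slot_covers(const struct toy_slot *s, uint64_t addr)
{
    assert(s != NULL);
    return s->size != 0 && addr >= s->gpa && addr - s->gpa < s->size;
}

/* Bytes removed from the top of the slot by the update. */
uint64_t toy_shrink_bytes(void)
{
    return SLOT_SIZE_OLD - SLOT_SIZE_NEW;
}

/*
 * Replay the three ioctls and record whether FAULT_RIP is backed after each:
 * covered[0] after creation, covered[1] after deletion,
 * covered[2] after re-creation with the smaller size.
 */
void toy_replay(bool covered[3])
{
    struct toy_slot slot = { SLOT_GPA, SLOT_SIZE_OLD };

    covered[0] = toy_slot_covers(&slot, FAULT_RIP);

    slot.size = 0;                 /* memory_size=0 deletes the slot */
    covered[1] = toy_slot_covers(&slot, FAULT_RIP);

    slot.size = SLOT_SIZE_NEW;     /* slot comes back, 16 MiB smaller */
    covered[2] = toy_slot_covers(&slot, FAULT_RIP);
}
```

Under these assumptions toy_replay() yields backed, not backed, backed, so only
the gap between the two update ioctls leaves the address without a slot.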

An ideal solution to the problem would probably require KVM to provide a
new API to do the whole transaction in one shot, but as a band-aid we can
just pause all vCPUs to make memory transactions atomic.
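
Since memory transactions can nest, the pause has to happen once on the
outermost begin and the resume once on the matching outermost commit. A
minimal sketch of that bookkeeping, with counters standing in for the real
pause and resume calls (all toy_* names are hypothetical and exist only for
this illustration):

```c
#include <assert.h>

/* Counters standing in for the real vCPU pause/resume calls (toy model). */
int toy_pause_calls;
int toy_resume_calls;
unsigned toy_depth;

void toy_transaction_begin(void)
{
    /* Only the outermost begin pauses the vCPUs. */
    if (++toy_depth == 1) {
        toy_pause_calls++;
    }
}

void toy_transaction_commit(void)
{
    /* A commit without a matching begin is a caller bug. */
    assert(toy_depth > 0);

    /* Only the commit closing the outermost transaction resumes them. */
    if (--toy_depth == 0) {
        /* ... the pending slot updates would be applied here ... */
        toy_resume_calls++;
    }
}
```

With two nested begin/commit pairs this pauses once and resumes once, so the
vCPUs stay stopped for the whole outer transaction.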

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
RFC: Generally, memmap updates happen only a few times during guest boot, but
I'm not sure there are no scenarios where pausing all vCPUs is undesirable
from a performance point of view. Also, I'm not sure whether the
kvm_enabled() check is needed.
---
 softmmu/memory.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index fa280a19f7f7..0bf6f3f6d5dc 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -28,6 +28,7 @@
 
 #include "exec/memory-internal.h"
 #include "exec/ram_addr.h"
+#include "sysemu/cpus.h"
 #include "sysemu/kvm.h"
 #include "sysemu/runstate.h"
 #include "sysemu/tcg.h"
@@ -1057,7 +1058,9 @@ static void address_space_update_topology(AddressSpace *as)
 void memory_region_transaction_begin(void)
 {
     qemu_flush_coalesced_mmio_buffer();
-    ++memory_region_transaction_depth;
+    if ((++memory_region_transaction_depth == 1) && kvm_enabled()) {
+        pause_all_vcpus();
+    }
 }
 
 void memory_region_transaction_commit(void)
@@ -1087,7 +1090,11 @@ void memory_region_transaction_commit(void)
             }
             ioeventfd_update_pending = false;
         }
-   }
+
+        if (kvm_enabled()) {
+            resume_all_vcpus();
+        }
+    }
 }
 
 static void memory_region_destructor_none(MemoryRegion *mr)
-- 
2.25.4




Thread overview: 19+ messages
2020-10-26  8:49 [PATCH RFC] memory: pause all vCPUs for the duration of memory transactions Vitaly Kuznetsov
2020-10-26 10:43 ` David Hildenbrand
2020-10-26 11:17   ` David Hildenbrand
2020-10-27 12:36     ` Vitaly Kuznetsov
2020-10-27 12:42       ` David Hildenbrand
2020-10-27 13:02         ` Vitaly Kuznetsov
2020-10-27 13:08           ` David Hildenbrand
2020-10-27 13:19             ` Vitaly Kuznetsov
2020-10-27 13:35               ` David Hildenbrand
2020-10-27 13:47                 ` Vitaly Kuznetsov
2020-10-27 14:20                   ` Igor Mammedov
2020-11-02 19:57 ` Peter Xu
2020-11-03 13:07   ` Vitaly Kuznetsov
2020-11-03 16:37     ` Peter Xu
2020-11-04 18:09       ` Laszlo Ersek
2020-11-04 19:23         ` Peter Xu
2020-11-05 15:36           ` Vitaly Kuznetsov
2020-11-05 16:35             ` Peter Xu
2020-11-04 17:58     ` Laszlo Ersek
