* XSA-60 - how to get back to a sane state
@ 2013-12-02 14:28 Jan Beulich
From: Jan Beulich @ 2013-12-02 14:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Jinsong Liu, Keir Fraser, George Dunlap, Andrew Cooper,
	Zhenzhong Duan, Donald D Dugger, Jun Nakajima

[-- Attachment #1: Type: text/plain, Size: 2192 bytes --]

All,

Jinsong's patches have been in for nearly a month now, but they are
not in a shape that would make releasing them in 4.4 or backporting
them to the older trees desirable, so we need to come to a conclusion
on which way to go. Currently it looks like we have three options, but
of course I'll be happy to see other (better!) ones proposed.

1) Stay with what we have.

2) Revert 86d60e85 ("VMX: flush cache when vmentry back to UC
guest") in its entirety plus, perhaps, the change 62652c00 ("VMX:
fix cr0.cd handling") did to vmx_ctxt_switch_to().

3) Apply the attached patch that Andrew and I have been putting
together, with the caveat that it's still incomplete (see below).

The latter two are based on the observation that the amount of
cache flushing we do with what is in the master tree right now is
more than what we did prior to that patch series but still
insufficient. Hence the revert would get us back to the earlier
state (and obviously eliminate the performance problems that
were observed when doing too eager flushing), whereas
applying the extra 5th patch would get us closer to a proper
solution.

The problem with that new patch is that it's - as said - still
incomplete, and setting ->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory
for all writes through map_domain_page_global() mappings would
get rather ugly (a sketch of what marking such a site might look like
follows this list):
- guest_walk_tables() (accumulating set_ad_bits() results)
- all writes through vcpu_info()
- event_fifo.c would need this scattered around
- hvm_load_segment_selector()'s setting of the accessed bit
- hvm_task_switch()'s handling of the busy bit
- all the nested HVM code's writes through ->nv_vvmcx
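
To give a feel for what completing it would involve, below is a
minimal, purely illustrative sketch of marking one such write site.
The helper name hvm_note_uc_access() is invented for this example;
only the flag and cache-mode fields are taken from the attached patch:

    /*
     * Hypothetical helper (not part of the patch): note that the
     * hypervisor wrote HVM guest memory while the vCPU has caching
     * disabled, so the caches get flushed before the next VM entry.
     */
    static inline void hvm_note_uc_access(struct vcpu *v)
    {
        if ( has_hvm_container_vcpu(v) &&
             unlikely(v->arch.hvm_vcpu.cache_mode == NO_FILL_CACHE_MODE) )
            v->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 1;
    }

    /* Example use after a write through vcpu_info() (illustrative only): */
    vcpu_info(v, evtchn_upcall_pending) = 1;
    hvm_note_uc_access(v);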

Plus the performance effects of it aren't really known yet (all we
do know is that the too eager "flush always prior to VM entry" is
causing unacceptable stalls).

On the other hand, we have had no reports that the lack of proper
cache flushing actually caused any problems, and hence reverting the
partial flushing that was introduced isn't likely to push new problems
onto people. Yet if such a problem were ever to surface, it can be
expected to be very hard to diagnose.

Please, everyone having an opinion here - voice it.

Thanks, Jan


[-- Attachment #2: xsa60-5 --]
[-- Type: application/octet-stream, Size: 3415 bytes --]

VMX: corrections to XSA-60 fix

c/s 86d60e855fe118df0dbdf67b67b1a0ec8fdb9f0d does not cause a wbinvd() in
certain cases, such as writing the hypercall page, which performs:

    hypercall_page = __map_domain_page(page);
    hypercall_page_initialise(d, hypercall_page);
    unmap_domain_page(hypercall_page);

and thus bypasses the hooks in __hvm_{copy,clear}() that look for
hypervisor modification of domain memory.

Move the hook into unmap_domain_page() so it catches more domain memory
modification points.

Furthermore, the currently unconditional wbinvd() in vmx_ctxt_switch_to()
should be deferred until vmentry.  This way, a ctx_switch_to() followed by an
iteration of a hypercall continuation won't hit Xen with two wbinvd()s.
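
For completeness, the vmentry-side consumer that this deferral relies
on would look roughly like the sketch below; placing it on the VM entry
path (e.g. in vmx_vmenter_helper()) is an assumption based on the
already committed 86d60e85, not something introduced by this patch:

    /* Immediately before resuming the guest: flush the caches if the
     * hypervisor touched UC guest memory since the last VM entry. */
    if ( unlikely(curr->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory) )
    {
        curr->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 0;
        wbinvd();
    }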

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Fix the changes to unmap_domain_page(), addressing a boot time
dereference of the not yet properly set "current".

Also set the flag upon completion of I/O requests handed to the device
model.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -166,19 +166,21 @@ void *map_domain_page(unsigned long mfn)
 void unmap_domain_page(const void *ptr)
 {
     unsigned int idx;
-    struct vcpu *v;
+    struct vcpu *v = mapcache_current_vcpu();
     struct mapcache_domain *dcache;
     unsigned long va = (unsigned long)ptr, mfn, flags;
     struct vcpu_maphash_entry *hashent;
 
+    if ( unlikely(v && has_hvm_container_vcpu(v) &&
+                  v->arch.hvm_vcpu.cache_mode == NO_FILL_CACHE_MODE) )
+        v->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 1;
+
     if ( va >= DIRECTMAP_VIRT_START )
         return;
 
     ASSERT(va >= MAPCACHE_VIRT_START && va < MAPCACHE_VIRT_END);
 
-    v = mapcache_current_vcpu();
     ASSERT(v && is_pv_vcpu(v));
-
     dcache = &v->domain->arch.pv_domain.mapcache;
     ASSERT(dcache->inuse);
 
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2566,9 +2566,6 @@ static enum hvm_copy_result __hvm_copy(
         return HVMCOPY_unhandleable;
 #endif
 
-    if ( unlikely(curr->arch.hvm_vcpu.cache_mode == NO_FILL_CACHE_MODE) )
-        curr->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 1;
-
     while ( todo > 0 )
     {
         count = min_t(int, PAGE_SIZE - (addr & ~PAGE_MASK), todo);
@@ -2680,9 +2677,6 @@ static enum hvm_copy_result __hvm_clear(
         return HVMCOPY_unhandleable;
 #endif
 
-    if ( unlikely(curr->arch.hvm_vcpu.cache_mode == NO_FILL_CACHE_MODE) )
-        curr->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 1;
-
     while ( todo > 0 )
     {
         count = min_t(int, PAGE_SIZE - (addr & ~PAGE_MASK), todo);
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -303,6 +303,9 @@ void hvm_io_assist(ioreq_t *p)
     {
         msix_write_completion(curr);
         vcpu_end_shutdown_deferral(curr);
+
+        if ( unlikely(curr->arch.hvm_vcpu.cache_mode == NO_FILL_CACHE_MODE) )
+            curr->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 1;
     }
 }
 
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -645,7 +645,7 @@ static void vmx_ctxt_switch_to(struct vc
 
     /* For guest cr0.cd setting, do not use potentially polluted cache */
     if ( unlikely(v->arch.hvm_vcpu.cache_mode == NO_FILL_CACHE_MODE) )
-        wbinvd();
+        v->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 1;
 
     vmx_restore_guest_msrs(v);
     vmx_restore_dr(v);

