On 8/6/2013 2:12 AM, Jan Beulich wrote:
>>>> On 06.08.13 at 04:27, Suravee Suthikulanit wrote:
>> Hi All,
>>
>> While I was testing nested VM on with latest Xen on AMD system, I am running
>> into issue where
>> the L2 guest (Linux) seems to stuck right after loading the kernel. When
>> using the "xl debug-keys d" to dump registers,
>> the L2 guest RIP always at the instruction which tries to write the CR0.CD
>> bit. Besides, once starting L2 guest and it
>> got stuck, L0 Dom0 becomes very slow until I kill the L2 guest.
>>
>> After looking into the hvm code for handling CR0 (i.e.
>> xen/arch/x86/hvm/hvm.c: hvm_set_cr0()),
>> I see that the code tries to issue local cache flush on all the cores when
>> the L2 guest is
>> setting the CR0.CD bit. (Please see the code snippet below.)
>>
>>      if ( (value & X86_CR0_CD) && !(value & X86_CR0_NW) )
>>      {
>>          /* Entering no fill cache mode. */
>>          spin_lock(&v->domain->arch.hvm_domain.uc_lock);
>>          v->arch.hvm_vcpu.cache_mode = NO_FILL_CACHE_MODE;
>>
>>          if ( !v->domain->arch.hvm_domain.is_in_uc_mode )
>>          {
>>              /* Flush physical caches. */
>> ---> HERE    on_each_cpu(local_flush_cache, NULL, 1);
>>              hvm_set_uc_mode(v, 1);
>>          }
>>          spin_unlock(&v->domain->arch.hvm_domain.uc_lock);
>>      }
>>
>> When I try to comment out the line, the issue goes away. Is this line
>> necessary?
>> Why do we need to flush all the cpu cores when the CR0.CD bit only applies
>> to a particular core?
> Doing the flush only on the local CPU would imply that once the
> affected vCPU migrates to another pCPU, flushing would _then_
> need to be done there too. Tracking this would clearly add
> complexity here.
>
> Furthermore, the "UC mode" is being entered on the domain as a
> whole, i.e. all the pCPU-s that the domain is actively running one
> would need immediate flushing, and all pCPU-s any of the vCPU-s
> would migrate to subsequently would need deferred
> flushing.
>
> That said, I still can't see how the flushing here would have this
> dramatic an effect: It's a one-time thing, when UC mode first gets
> entered by a domain. So unless CR0.CD gets flipped back and
> forth by a guest, there shouldn't be more than one flush (or there's
> a logic error somewhere else).
>
> Finally, the need for that code as a whole is under question in the
> context of XSA-60. I would certainly favor (at least on the SVM
> side) to handle CR0.CD per vCPU instead of per domain, as long
> as there are no requirements that CR0.CD be set consistently
> across multiple CPUs (e.g. within a package; on Intel CPUs I'm
> being told it's a hard requirement to be consistent at least
> between sibling hyperthreads, meaning that we can't rip out the
> current logic altogether in favor of a CR0.CD based solution).
>
> Jan
>
>

Somehow the problem went away when I updated the hypervisor in both
L0 and L1, and I can no longer reproduce the issue. At one point,
while I was trying to debug the issue using "hvm_debug", I saw
messages showing the CD bit being flipped back and forth:

(XEN) [HVM:1.3] Update CR0 value = 8005003b
(XEN) [HVM:1.3] Update CR0 value = c005003b
(XEN) [HVM:1.3] Update CR0 value = 8005003b
(XEN) [HVM:1.3] Update CR0 value = c005003b

Thanks for the details. I'll keep monitoring this in the future.

Suravee
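
For reference, the two CR0 values in the log above differ only in
bit 30, which is CR0.CD; bit 29 (CR0.NW) is clear in both, so c005003b
satisfies the no-fill condition tested in the quoted hvm_set_cr0()
snippet. A minimal sketch in C of that decoding follows. The macro and
function names (CR0_CD, CR0_NW, cr0_is_no_fill, cr0_count_cd_flips) are
illustrative assumptions, not Xen identifiers, and the sketch only
inspects logged values; it does not model the flush itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural CR0 bit positions; the macro names are illustrative. */
#define CR0_NW (UINT32_C(1) << 29)  /* Not Write-through */
#define CR0_CD (UINT32_C(1) << 30)  /* Cache Disable */

/*
 * Returns 1 if the value requests no-fill cache mode (CD=1, NW=0),
 * i.e. the same condition as in the quoted hvm_set_cr0() snippet.
 */
static int cr0_is_no_fill(uint32_t cr0)
{
    return (cr0 & CR0_CD) && !(cr0 & CR0_NW);
}

/*
 * Hypothetical helper: counts how often CR0.CD changes between
 * consecutive logged CR0 values.
 */
static size_t cr0_count_cd_flips(const uint32_t *vals, size_t n)
{
    size_t flips = 0;

    assert(vals != NULL || n == 0);
    for (size_t i = 1; i < n; i++)
        if ((vals[i] ^ vals[i - 1]) & CR0_CD)
            flips++;
    return flips;
}
```

Feeding the four logged values through cr0_count_cd_flips() reports a
CD change on every consecutive pair, which matches the "flipped back
and forth" pattern Jan mentioned as the case where more than one flush
could occur.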