Re: x86/AMD: Nested VM failed to boot L2 guest due to setting/clearing CR0.CD bit


From: Suravee Suthikulanit <suravee.suthikulpanit@amd.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	Christoph Egger <chegger@amazon.de>,
	Jun Nakajima <jun.nakajima@intel.com>
Subject: Re: x86/AMD: Nested VM failed to boot L2 guest due to setting/clearing CR0.CD bit
Date: Tue, 6 Aug 2013 12:55:55 -0500
Message-ID: <520138AB.8040600@amd.com>
In-Reply-To: <5200BE0702000078000E9862@nat28.tlf.novell.com>



On 8/6/2013 2:12 AM, Jan Beulich wrote:
>>>> On 06.08.13 at 04:27, Suravee Suthikulanit <suravee.suthikulpanit@amd.com>
> wrote:
>> Hi All,
>>
>> While testing nested virtualization with the latest Xen on an AMD system, I
>> ran into an issue where the L2 guest (Linux) appears to get stuck right after
>> loading the kernel. When I use "xl debug-keys d" to dump registers, the L2
>> guest RIP is always at the instruction that tries to write the CR0.CD bit.
>> In addition, once the L2 guest has started and gotten stuck, L0 Dom0 becomes
>> very slow until I kill the L2 guest.
>>
>> After looking into the HVM code that handles CR0 (i.e.
>> xen/arch/x86/hvm/hvm.c: hvm_set_cr0()),
>> I see that it issues a local cache flush on every core when the L2 guest
>> sets the CR0.CD bit. (Please see the code snippet below.)
>>
>>           if ( (value & X86_CR0_CD) && !(value & X86_CR0_NW) )
>>           {
>>               /* Entering no fill cache mode. */
>>               spin_lock(&v->domain->arch.hvm_domain.uc_lock);
>>               v->arch.hvm_vcpu.cache_mode = NO_FILL_CACHE_MODE;
>>
>>               if ( !v->domain->arch.hvm_domain.is_in_uc_mode )
>>               {
>>                   /* Flush physical caches. */
>> ---> HERE       on_each_cpu(local_flush_cache, NULL, 1);
>>                   hvm_set_uc_mode(v, 1);
>>               }
>>               spin_unlock(&v->domain->arch.hvm_domain.uc_lock);
>>           }
>>
>> When I comment out that line, the issue goes away.  Is this line
>> necessary?
>> Why do we need to flush all the CPU cores when the CR0.CD bit only applies
>> to a particular core?
> Doing the flush only on the local CPU would imply that once the
> affected vCPU migrates to another pCPU, flushing would _then_
> need to be done there too. Tracking this would clearly add
> complexity here.
>
> Furthermore, the "UC mode" is being entered on the domain as a
> whole, i.e. all the pCPU-s that the domain is actively running on
> would need immediate flushing, and all pCPU-s any of the vCPU-s
> would migrate to subsequently would need deferred
> flushing.
>
> That said, I still can't see how the flushing here would have this
> dramatic an effect: It's a one-time thing, when UC mode first gets
> entered by a domain. So unless CR0.CD gets flipped back and
> forth by a guest, there shouldn't be more than one flush (or there's
> a logic error somewhere else).
>
> Finally, the need for that code as a whole is under question in the
> context of XSA-60. I would certainly favor (at least on the SVM
> side) to handle CR0.CD per vCPU instead of per domain, as long
> as there are no requirements that CR0.CD be set consistently
> across multiple CPUs (e.g. within a package; on Intel CPUs I'm
> being told it's a hard requirement to be consistent at least
> between sibling hyperthreads, meaning that we can't rip out the
> current logic altogether in favor of a CR0.CD based solution).
>
> Jan
>
>
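To make the point about repeated flushes concrete, here is a minimal toy
model of the logic in the snippet quoted above. It is only a sketch, not the
Xen implementation: struct toy_domain, toy_set_cr0() and toy_replay() are
hypothetical names, the domain is assumed to have a single vCPU, and it is an
assumption of the model that clearing CR0.CD takes the domain back out of UC
mode. The counter stands in for the on_each_cpu(local_flush_cache, ...) call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural CR0 bit positions (x86). */
#define CR0_NW (UINT32_C(1) << 29)  /* Not Write-through */
#define CR0_CD (UINT32_C(1) << 30)  /* Cache Disable */

/* Toy stand-in for the per-domain state; not the real Xen structures. */
struct toy_domain {
    bool in_uc_mode;       /* models is_in_uc_mode */
    unsigned int flushes;  /* number of global cache flushes issued */
};

/* Models one CR0 write from the only vCPU of a toy domain. */
static void toy_set_cr0(struct toy_domain *d, uint32_t value)
{
    assert(d != NULL);

    if ((value & CR0_CD) && !(value & CR0_NW)) {
        /* Entering no-fill cache mode: flush only on the first entry. */
        if (!d->in_uc_mode) {
            d->flushes++;  /* stands in for on_each_cpu(local_flush_cache, ...) */
            d->in_uc_mode = true;
        }
    } else if (!(value & CR0_CD)) {
        /* ASSUMPTION: with a single vCPU, clearing CD leaves UC mode. */
        d->in_uc_mode = false;
    }
}

/* Replays a sequence of CR0 writes and returns the number of flushes. */
static unsigned int toy_replay(const uint32_t *vals, size_t n)
{
    struct toy_domain d = { .in_uc_mode = false, .flushes = 0 };

    for (size_t i = 0; i < n; i++)
        toy_set_cr0(&d, vals[i]);
    return d.flushes;
}
```

Under these assumptions, a guest that sets CD once and leaves it set costs a
single flush, while a guest that keeps toggling CD pays for a global flush on
every 0 -> 1 transition.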
Somehow the problem went away when I updated the hypervisor in both L0
and L1, and I can no longer reproduce the issue. At one point, while I was
debugging the issue using "hvm_debug", I saw messages showing the CD bit
being flipped back and forth:

(XEN) [HVM:1.3] <hvm_set_cr0> Update CR0 value = 8005003b
(XEN) [HVM:1.3] <hvm_set_cr0> Update CR0 value = c005003b
(XEN) [HVM:1.3] <hvm_set_cr0> Update CR0 value = 8005003b
(XEN) [HVM:1.3] <hvm_set_cr0> Update CR0 value = c005003b

Thanks for the details. I'll keep an eye on this going forward.

Suravee

