From: Sergey Dyasli <sergey.dyasli@citrix.com>
To: "JBeulich@suse.com" <JBeulich@suse.com>
Cc: Sergey Dyasli <sergey.dyasli@citrix.com>,
	Kevin Tian <kevin.tian@intel.com>,
	"Kevin.Mayer@gdata.de" <Kevin.Mayer@gdata.de>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	Anshul Makkar <anshul.makkar@citrix.com>,
	"jun.nakajima@intel.com" <jun.nakajima@intel.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH 1/2] VMX: fix VMCS race on context-switch paths
Date: Wed, 15 Feb 2017 11:48:53 +0000
Message-ID: <1487159333.3588.5.camel@citrix.com>
In-Reply-To: <58A44BE8020000780013A2E2@prv-mh.provo.novell.com>

On Wed, 2017-02-15 at 04:39 -0700, Jan Beulich wrote:
> > > > On 15.02.17 at 11:27, <sergey.dyasli@citrix.com> wrote:
> > 
> > This is what I'm getting during the original test case (32 VMs reboot):
> > 
> > (XEN) [ 1407.789329] Watchdog timer detects that CPU12 is stuck!
> > (XEN) [ 1407.795726] ----[ Xen-4.6.1-xs-local  x86_64  debug=n  Not tainted ]----
> > (XEN) [ 1407.803774] CPU:    12
> > (XEN) [ 1407.806975] RIP:    e008:[<ffff82d0801ea2a2>] vmx_vmcs_reload+0x32/0x50
> > (XEN) [ 1407.814926] RFLAGS: 0000000000000013   CONTEXT: hypervisor (d230v0)
> > (XEN) [ 1407.822486] rax: 0000000000000000   rbx: ffff830079ee7000   rcx: 0000000000000000
> > (XEN) [ 1407.831407] rdx: 0000006f8f72ce00   rsi: ffff8329b3efbfe8   rdi: ffff830079ee7000
> > (XEN) [ 1407.840326] rbp: ffff83007bab7000   rsp: ffff83400fab7dc8   r8: 000001468e9e3ccc
> > (XEN) [ 1407.849246] r9:  ffff83403ffe7000   r10: 00000146c91c1737   r11: ffff833a9558c310
> > (XEN) [ 1407.858166] r12: ffff833a9558c000   r13: 000000000000000c   r14: ffff83403ffe7000
> > (XEN) [ 1407.867085] r15: ffff82d080364640   cr0: 0000000080050033   cr4: 00000000003526e0
> > (XEN) [ 1407.876005] cr3: 000000294b074000   cr2: 000007fefd7ce150
> > (XEN) [ 1407.882599] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> > (XEN) [ 1407.890938] Xen code around <ffff82d0801ea2a2> (vmx_vmcs_reload+0x32/0x50):
> > (XEN) [ 1407.899277]  84 00 00 00 00 00 f3 90 <83> bf e8 05 00 00 ff 75 f5 e9 a0 fa ff ff f3 c3
> > (XEN) [ 1407.908679] Xen stack trace from rsp=ffff83400fab7dc8:
> > (XEN) [ 1407.914982]    ffff82d08016c58d 0000000000001000 0000000000000000 0000000000000000
> > (XEN) [ 1407.923998]    0000000000000206 0000000000000086 0000000000000286 000000000000000c
> > (XEN) [ 1407.933017]    ffff83007bab7058 ffff82d080364640 ffff83007bab7000 00000146a7f26495
> > (XEN) [ 1407.942032]    ffff830079ee7000 ffff833a9558cf84 ffff833a9558c000 ffff82d080364640
> > (XEN) [ 1407.951048]    ffff82d08012fb8e ffff83400fabda98 ffff83400faba148 ffff83403ffe7000
> > (XEN) [ 1407.960067]    ffff83400faba160 ffff83400fabda40 ffff82d080164305 000000000000000c
> > (XEN) [ 1407.969083]    ffff830079ee7000 0000000001c9c380 ffff82d080136400 000000440000011d
> > (XEN) [ 1407.978101]    00000000ffffffff ffffffffffffffff ffff83400fab0000 ffff82d080348d00
> > (XEN) [ 1407.987116]    ffff833a9558c000 ffff82d080364640 ffff82d08013311c ffff830079ee7000
> > (XEN) [ 1407.996134]    ffff83400fab0000 ffff830079ee7000 ffff83403ffe7000 00000000ffffffff
> > (XEN) [ 1408.005151]    ffff82d080167d35 ffff83007bab7000 0000000000000001 fffffa80077f9700
> > (XEN) [ 1408.014167]    fffffa80075bf900 fffffa80077f9820 0000000000000000 0000000000000000
> > (XEN) [ 1408.023184]    fffffa8008889c00 0000000002fa1e78 0000003b6ed18d78 0000000000000000
> > (XEN) [ 1408.032202]    00000000068e7780 fffffa80075ba790 fffffa80077f9848 fffff800027f9e80
> > (XEN) [ 1408.041220]    0000000000000001 000000fc00000000 fffff880042499c2 0000000000000000
> > (XEN) [ 1408.050235]    0000000000000246 fffff80000b9cb58 0000000000000000 80248e00e008e1f0
> > (XEN) [ 1408.059253]    00000000ffff82d0 80248e00e008e200 00000000ffff82d0 80248e000000000c
> > (XEN) [ 1408.068268]    ffff830079ee7000 0000006f8f72ce00 00000000ffff82d0
> > (XEN) [ 1408.075638] Xen call trace:
> > (XEN) [ 1408.079322]    [<ffff82d0801ea2a2>] vmx_vmcs_reload+0x32/0x50
> > (XEN) [ 1408.086303]    [<ffff82d08016c58d>] context_switch+0x85d/0xeb0
> > (XEN) [ 1408.093380]    [<ffff82d08012fb8e>] schedule.c#schedule+0x46e/0x7d0
> > (XEN) [ 1408.100942]    [<ffff82d080164305>] reprogram_timer+0x75/0xe0
> > (XEN) [ 1408.107925]    [<ffff82d080136400>] timer.c#timer_softirq_action+0x90/0x210
> > (XEN) [ 1408.116263]    [<ffff82d08013311c>] softirq.c#__do_softirq+0x5c/0x90
> > (XEN) [ 1408.123921]    [<ffff82d080167d35>] domain.c#idle_loop+0x25/0x60
> 
> Taking your later reply into account - were you able to figure out
> what other party held onto the VMCS being waited for here?

Unfortunately, no. The debug logs did not make it clear. But consider
the following vmx_do_resume() code:

    if ( v->arch.hvm_vmx.active_cpu == smp_processor_id() )
    {
        if ( v->arch.hvm_vmx.vmcs_pa != this_cpu(current_vmcs) )
            vmx_load_vmcs(v);
    }

If both of the above conditions are true, then vmx_vmcs_reload() will
probably hang.
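
To illustrate the suspected hang, here is a minimal toy model in plain C.
It is a sketch, not the Xen code. It assumes that vmx_vmcs_reload() waits
for active_cpu to become -1 before loading the VMCS, which is what the
spin at vmx_vmcs_reload+0x32 in the trace above suggests. The names
toy_vcpu, toy_vmcs_reload, TOY_NO_CPU and the max_spins bound are
hypothetical and exist only so the model terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NO_CPU (-1)

/* Hypothetical, cut-down stand-in for the per-vCPU VMX state. */
struct toy_vcpu {
    int      active_cpu; /* CPU the VMCS is active on, or TOY_NO_CPU */
    uint64_t vmcs_pa;    /* address of this vCPU's VMCS */
};

/*
 * Toy reload path. ASSUMPTION: the real function waits for active_cpu
 * to be cleared before loading the VMCS. The wait is bounded by
 * max_spins so the model terminates; it returns false where an
 * unbounded wait would spin forever.
 */
static bool toy_vmcs_reload(struct toy_vcpu *v, int this_cpu,
                            uint64_t *current_vmcs, unsigned int max_spins)
{
    assert(v != NULL && current_vmcs != NULL);

    if (v->vmcs_pa == *current_vmcs)
        return true;                 /* already loaded here, nothing to do */

    for (unsigned int spins = 0; v->active_cpu != TOY_NO_CPU; spins++)
        if (spins == max_spins)
            return false;            /* nobody cleared active_cpu: stuck */

    *current_vmcs = v->vmcs_pa;      /* models vmx_load_vmcs(v) */
    v->active_cpu = this_cpu;
    return true;
}
```

With active_cpu == 12 and a current_vmcs that differs from vmcs_pa (both
conditions above true on CPU12), the call returns false: in this model
nothing else ever changes active_cpu, so the wait cannot end. Once
active_cpu is TOY_NO_CPU the same call succeeds.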

-- 
Thanks,
Sergey
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

