[PATCH] VMX: sync CPU state upon vCPU destruction

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] VMX: sync CPU state upon vCPU destruction
@ 2017-11-09 14:49 Jan Beulich
  2017-11-09 15:02 ` Dario Faggioli
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Jan Beulich @ 2017-11-09 14:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Igor Druzhinin, Julien Grall, Dario Faggioli, Kevin Tian,
	Jun Nakajima

See the code comment being added for why we need this.

Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -479,7 +479,13 @@ static void vmx_vcpu_destroy(struct vcpu
      * we should disable PML manually here. Note that vmx_vcpu_destroy is called
      * prior to vmx_domain_destroy so we need to disable PML for each vcpu
      * separately here.
+     *
+     * Before doing that though, flush all state for the vCPU previously having
+     * run on the current CPU, so that this flushing of state won't happen from
+     * the TLB flush IPI handler behind the back of a vmx_vmcs_enter() /
+     * vmx_vmcs_exit() section.
      */
+    sync_local_execstate();
     vmx_vcpu_disable_pml(v);
     vmx_destroy_vmcs(v);
     passive_domain_destroy(v);




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-09 14:49 [PATCH] VMX: sync CPU state upon vCPU destruction Jan Beulich
@ 2017-11-09 15:02 ` Dario Faggioli
  2017-11-10  8:41 ` Sergey Dyasli
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Dario Faggioli @ 2017-11-09 15:02 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Igor Druzhinin, Julien Grall, Kevin Tian, Jun Nakajima


[-- Attachment #1.1: Type: text/plain, Size: 489 bytes --]

On Thu, 2017-11-09 at 07:49 -0700, Jan Beulich wrote:
> See the code comment being added for why we need this.
> 
> Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
Reviewed-by: Dario Faggioli <raistlin@linux.it>

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-09 14:49 [PATCH] VMX: sync CPU state upon vCPU destruction Jan Beulich
  2017-11-09 15:02 ` Dario Faggioli
@ 2017-11-10  8:41 ` Sergey Dyasli
  2017-11-10  9:50   ` Dario Faggioli
  2017-11-10 10:30   ` Jan Beulich
  2017-11-21 13:22 ` Ping: " Jan Beulich
  2017-11-21 16:26 ` Igor Druzhinin
  3 siblings, 2 replies; 17+ messages in thread
From: Sergey Dyasli @ 2017-11-10  8:41 UTC (permalink / raw)
  To: JBeulich@suse.com, xen-devel@lists.xenproject.org
  Cc: Igor Druzhinin, Kevin Tian, Sergey Dyasli, Andrew Cooper,
	julien.grall@arm.com, raistlin@linux.it, jun.nakajima@intel.com

On Thu, 2017-11-09 at 07:49 -0700, Jan Beulich wrote:
> See the code comment being added for why we need this.
> 
> Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -479,7 +479,13 @@ static void vmx_vcpu_destroy(struct vcpu
>       * we should disable PML manually here. Note that vmx_vcpu_destroy is called
>       * prior to vmx_domain_destroy so we need to disable PML for each vcpu
>       * separately here.
> +     *
> +     * Before doing that though, flush all state for the vCPU previously having
> +     * run on the current CPU, so that this flushing of state won't happen from
> +     * the TLB flush IPI handler behind the back of a vmx_vmcs_enter() /
> +     * vmx_vmcs_exit() section.
>       */
> +    sync_local_execstate();
>      vmx_vcpu_disable_pml(v);
>      vmx_destroy_vmcs(v);
>      passive_domain_destroy(v);

This patch fixes only one particular issue and not the general problem.
What if vmcs is cleared, possibly in some future code, at another place?

The original intent of vmx_vmcs_reload() is correct: it lazily loads
the vmcs when it's needed. It's just the logic which checks for
v->is_running inside vmx_ctxt_switch_from() is flawed: v might be
"running" on another pCPU.

IMHO there are 2 possible solutions:

    1. Add additional pCPU check into vmx_ctxt_switch_from()
    2. Drop v->is_running check inside vmx_ctxt_switch_from() making
       vmx_vmcs_reload() unconditional.

Thanks,
Sergey
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-10  8:41 ` Sergey Dyasli
@ 2017-11-10  9:50   ` Dario Faggioli
  2017-11-10 10:30   ` Jan Beulich
  1 sibling, 0 replies; 17+ messages in thread
From: Dario Faggioli @ 2017-11-10  9:50 UTC (permalink / raw)
  To: Sergey Dyasli, JBeulich@suse.com, xen-devel@lists.xenproject.org
  Cc: Igor Druzhinin, Kevin Tian, julien.grall@arm.com,
	jun.nakajima@intel.com, Andrew Cooper


[-- Attachment #1.1: Type: text/plain, Size: 1495 bytes --]

On Fri, 2017-11-10 at 08:41 +0000, Sergey Dyasli wrote:
> On Thu, 2017-11-09 at 07:49 -0700, Jan Beulich wrote:
> > 
> This patch fixes only one particular issue and not the general
> problem.
> What if vmcs is cleared, possibly in some future code, at another
> place?
> 
Yes, that's what we were saying yesterday. Asynchronous code fiddling
with context, will have to make sure it syncs things properly. And we
need to keep an eye out for that.

> The original intent of vmx_vmcs_reload() is correct: it lazily loads
> the vmcs when it's needed. It's just the logic which checks for
> v->is_running inside vmx_ctxt_switch_from() is flawed: v might be
> "running" on another pCPU.
> 
> IMHO there are 2 possible solutions:
> 
>     1. Add additional pCPU check into vmx_ctxt_switch_from()
>
How? Checking v->processor is not an option as, AFAICS, this runs
without owning the scheduling lock, and hence processor can change
under your feet.

>     2. Drop v->is_running check inside vmx_ctxt_switch_from() making
>        vmx_vmcs_reload() unconditional.
> 
Well, that looks plausible to me. Although, I guess it will have, at
least potentially, an impact on performance (as other solutions
envisioned in the thread). Any idea how big the hit could be?

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-10  8:41 ` Sergey Dyasli
  2017-11-10  9:50   ` Dario Faggioli
@ 2017-11-10 10:30   ` Jan Beulich
  2017-11-10 14:46     ` Igor Druzhinin
  1 sibling, 1 reply; 17+ messages in thread
From: Jan Beulich @ 2017-11-10 10:30 UTC (permalink / raw)
  To: SergeyDyasli
  Cc: Igor Druzhinin, Kevin Tian, Andrew Cooper, julien.grall@arm.com,
	raistlin@linux.it, jun.nakajima@intel.com,
	xen-devel@lists.xenproject.org

>>> On 10.11.17 at 09:41, <sergey.dyasli@citrix.com> wrote:
> On Thu, 2017-11-09 at 07:49 -0700, Jan Beulich wrote:
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -479,7 +479,13 @@ static void vmx_vcpu_destroy(struct vcpu
>>       * we should disable PML manually here. Note that vmx_vcpu_destroy is called
>>       * prior to vmx_domain_destroy so we need to disable PML for each vcpu
>>       * separately here.
>> +     *
>> +     * Before doing that though, flush all state for the vCPU previously having
>> +     * run on the current CPU, so that this flushing of state won't happen from
>> +     * the TLB flush IPI handler behind the back of a vmx_vmcs_enter() /
>> +     * vmx_vmcs_exit() section.
>>       */
>> +    sync_local_execstate();
>>      vmx_vcpu_disable_pml(v);
>>      vmx_destroy_vmcs(v);
>>      passive_domain_destroy(v);
> 
> This patch fixes only one particular issue and not the general problem.
> What if vmcs is cleared, possibly in some future code, at another place?

As indicated in the earlier discussion, if we go this route, other
future async accesses may need to do the same then.

> The original intent of vmx_vmcs_reload() is correct: it lazily loads
> the vmcs when it's needed. It's just the logic which checks for
> v->is_running inside vmx_ctxt_switch_from() is flawed: v might be
> "running" on another pCPU.
> 
> IMHO there are 2 possible solutions:
> 
>     1. Add additional pCPU check into vmx_ctxt_switch_from()

I agree with Dario in not seeing this as a possible solution.

>    2. Drop v->is_running check inside vmx_ctxt_switch_from() making
>        vmx_vmcs_reload() unconditional.

This is an option, indeed (and I don't think it would have a
meaningful performance impact, as vmx_vmcs_reload() does
nothing if the right VMCS is already in place). Iirc I had added the
conditional back then merely to introduce as little of a behavioral
change as was (appeared to be at that time) necessary. What I'm
not certain about, however, is the final state we'll end up in then.
Coming back to your flow scheme (altered to represent the
suggested new flow):

pCPU1                                   pCPU2
=====                                   =====
current == vCPU1
context_switch(next == idle)
!! __context_switch() is skipped
vcpu_migrate(vCPU1)
RCU callbacks
vmx_vcpu_destroy()
vmx_vcpu_disable_pml()
current_vmcs = 0

                                        schedule(next == vCPU1)
                                        vCPU1->is_running = 1;
                                        context_switch(next == vCPU1)
                                        flush_tlb_mask(&dirty_mask);

                                <--- IPI

__sync_local_execstate()
__context_switch(prev == vCPU1)
vmx_ctxt_switch_from(vCPU1)
vmx_vmcs_reload()
...

We'd now leave the being destroyed vCPU's VMCS active in pCPU1
(at least I can't see where it would be deactivated again).

Overall I think it is quite reasonable to terminate early a lazy
context switch of a vCPU under destruction. From that abstract
consideration, forcing this higher up the call stack of
vmx_vcpu_destroy() (as I had suggested as an alternative
previously, before actually moving it further down into VMX code,
perhaps even right in RCU handling) would continue to be an
option. In this context you may want to pay particular
attention to the description of 346da00456 ("Synchronise lazy
execstate before calling tasklet handlers").

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-10 10:30   ` Jan Beulich
@ 2017-11-10 14:46     ` Igor Druzhinin
  2017-11-13  9:51       ` Jan Beulich
  0 siblings, 1 reply; 17+ messages in thread
From: Igor Druzhinin @ 2017-11-10 14:46 UTC (permalink / raw)
  To: Jan Beulich, SergeyDyasli
  Cc: Kevin Tian, Andrew Cooper, julien.grall@arm.com,
	raistlin@linux.it, jun.nakajima@intel.com,
	xen-devel@lists.xenproject.org

On 10/11/17 10:30, Jan Beulich wrote:
>>>> On 10.11.17 at 09:41, <sergey.dyasli@citrix.com> wrote:
>> On Thu, 2017-11-09 at 07:49 -0700, Jan Beulich wrote:
>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>> @@ -479,7 +479,13 @@ static void vmx_vcpu_destroy(struct vcpu
>>>       * we should disable PML manually here. Note that vmx_vcpu_destroy is called
>>>       * prior to vmx_domain_destroy so we need to disable PML for each vcpu
>>>       * separately here.
>>> +     *
>>> +     * Before doing that though, flush all state for the vCPU previously having
>>> +     * run on the current CPU, so that this flushing of state won't happen from
>>> +     * the TLB flush IPI handler behind the back of a vmx_vmcs_enter() /
>>> +     * vmx_vmcs_exit() section.
>>>       */
>>> +    sync_local_execstate();
>>>      vmx_vcpu_disable_pml(v);
>>>      vmx_destroy_vmcs(v);
>>>      passive_domain_destroy(v);
>>
>> This patch fixes only one particular issue and not the general problem.
>> What if vmcs is cleared, possibly in some future code, at another place?
> 
> As indicated in the earlier discussion, if we go this route, other
> future async accesses may need to do the same then.
> 
>> The original intent of vmx_vmcs_reload() is correct: it lazily loads
>> the vmcs when it's needed. It's just the logic which checks for
>> v->is_running inside vmx_ctxt_switch_from() is flawed: v might be
>> "running" on another pCPU.
>>
>> IMHO there are 2 possible solutions:
>>
>>     1. Add additional pCPU check into vmx_ctxt_switch_from()
> 
> I agree with Dario in not seeing this as a possible solution.
> 
>>    2. Drop v->is_running check inside vmx_ctxt_switch_from() making
>>        vmx_vmcs_reload() unconditional.
> 
> This is an option, indeed (and I don't think it would have a
> meaningful performance impact, as vmx_vmcs_reload() does
> nothing if the right VMCS is already in place). Iirc I had added the
> conditional back then merely to introduce as little of a behavioral
> change as was (appeared to be at that time) necessary. What I'm
> not certain about, however, is the final state we'll end up in then.
> Coming back to your flow scheme (altered to represent the
> suggested new flow):
> 

I was thinking of this approach for a while and I couldn't find anything
dangerous that can be potentially done by vmcs_reload() since it looks
like that it already has all the necessary checks inside.

> pCPU1                                   pCPU2
> =====                                   =====
> current == vCPU1
> context_switch(next == idle)
> !! __context_switch() is skipped
> vcpu_migrate(vCPU1)
> RCU callbacks
> vmx_vcpu_destroy()
> vmx_vcpu_disable_pml()
> current_vmcs = 0
> 
>                                         schedule(next == vCPU1)
>                                         vCPU1->is_running = 1;
>                                         context_switch(next == vCPU1)
>                                         flush_tlb_mask(&dirty_mask);
>                                     
>                                 <--- IPI
> 
> __sync_local_execstate()
> __context_switch(prev == vCPU1)
> vmx_ctxt_switch_from(vCPU1)
> vmx_vmcs_reload()
> ...
> 
> We'd now leave the being destroyed vCPU's VMCS active in pCPU1
> (at least I can't see where it would be deactivated again).

This would be VMCS of the migrated vCPU - not the destroyed one.

Igor

> Overall I think it is quite reasonable to terminate early a lazy
> context switch of a vCPU under destruction. From that abstract
> consideration, forcing this higher up the call stack of
> vmx_vcpu_destroy() (as I had suggested as an alternative
> previously, before actually moving it further down into VMX code,
> perhaps even right in RCU handling) would continue to be an
> option. In this context you may want to pay particular
> attention to the description of 346da00456 ("Synchronise lazy
> execstate before calling tasklet handlers").
> 
> Jan
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-10 14:46     ` Igor Druzhinin
@ 2017-11-13  9:51       ` Jan Beulich
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2017-11-13  9:51 UTC (permalink / raw)
  To: Igor Druzhinin, SergeyDyasli
  Cc: Kevin Tian, Andrew Cooper, julien.grall@arm.com,
	raistlin@linux.it, jun.nakajima@intel.com,
	xen-devel@lists.xenproject.org

>>> On 10.11.17 at 15:46, <igor.druzhinin@citrix.com> wrote:
> On 10/11/17 10:30, Jan Beulich wrote:
>>>>> On 10.11.17 at 09:41, <sergey.dyasli@citrix.com> wrote:
>>>    2. Drop v->is_running check inside vmx_ctxt_switch_from() making
>>>        vmx_vmcs_reload() unconditional.
>> 
>> This is an option, indeed (and I don't think it would have a
>> meaningful performance impact, as vmx_vmcs_reload() does
>> nothing if the right VMCS is already in place). Iirc I had added the
>> conditional back then merely to introduce as little of a behavioral
>> change as was (appeared to be at that time) necessary. What I'm
>> not certain about, however, is the final state we'll end up in then.
>> Coming back to your flow scheme (altered to represent the
>> suggested new flow):
>> 
> 
> I was thinking of this approach for a while and I couldn't find anything
> dangerous that can be potentially done by vmcs_reload() since it looks
> like that it already has all the necessary checks inside.
> 
>> pCPU1                                   pCPU2
>> =====                                   =====
>> current == vCPU1
>> context_switch(next == idle)
>> !! __context_switch() is skipped
>> vcpu_migrate(vCPU1)
>> RCU callbacks
>> vmx_vcpu_destroy()
>> vmx_vcpu_disable_pml()
>> current_vmcs = 0
>> 
>>                                         schedule(next == vCPU1)
>>                                         vCPU1->is_running = 1;
>>                                         context_switch(next == vCPU1)
>>                                         flush_tlb_mask(&dirty_mask);
>>                                     
>>                                 <--- IPI
>> 
>> __sync_local_execstate()
>> __context_switch(prev == vCPU1)
>> vmx_ctxt_switch_from(vCPU1)
>> vmx_vmcs_reload()
>> ...
>> 
>> We'd now leave the being destroyed vCPU's VMCS active in pCPU1
>> (at least I can't see where it would be deactivated again).
> 
> This would be VMCS of the migrated vCPU - not the destroyed one.

Oh, right. Nevertheless I favor the other approach (or some of the
possible variants of it). In particular, I'm increasingly in favor of
moving the sync up the call stack, at least into
complete_domain_destroy(), or even rcu_do_batch(), as mentioned
before. The former would be more clearly of no concern performance
wise (to please Dario), while the latter would be the obvious and
clean equivalent of the old tasklet commit I've pointed out last week.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Ping: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-09 14:49 [PATCH] VMX: sync CPU state upon vCPU destruction Jan Beulich
  2017-11-09 15:02 ` Dario Faggioli
  2017-11-10  8:41 ` Sergey Dyasli
@ 2017-11-21 13:22 ` Jan Beulich
  2017-11-21 14:07   ` Igor Druzhinin
  2017-11-21 16:08   ` George Dunlap
  2017-11-21 16:26 ` Igor Druzhinin
  3 siblings, 2 replies; 17+ messages in thread
From: Jan Beulich @ 2017-11-21 13:22 UTC (permalink / raw)
  To: Igor Druzhinin, George Dunlap, Jun Nakajima, Kevin Tian
  Cc: xen-devel, Julien Grall, Dario Faggioli

>>> On 09.11.17 at 15:49, <JBeulich@suse.com> wrote:
> See the code comment being added for why we need this.
> 
> Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

I realize we aren't settled yet on where to put the sync call. The
discussion appears to have stalled, though. Just to recap,
alternatives to the placement below are
- at the top of complete_domain_destroy(), being the specific
  RCU callback exhibiting the problem (others are unlikely to
  touch guest state)
- in rcu_do_batch(), paralleling the similar call from
  do_tasklet_work()

Jan

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -479,7 +479,13 @@ static void vmx_vcpu_destroy(struct vcpu
>       * we should disable PML manually here. Note that vmx_vcpu_destroy is called
>       * prior to vmx_domain_destroy so we need to disable PML for each vcpu
>       * separately here.
> +     *
> +     * Before doing that though, flush all state for the vCPU previously having
> +     * run on the current CPU, so that this flushing of state won't happen from
> +     * the TLB flush IPI handler behind the back of a vmx_vmcs_enter() /
> +     * vmx_vmcs_exit() section.
>       */
> +    sync_local_execstate();
>      vmx_vcpu_disable_pml(v);
>      vmx_destroy_vmcs(v);
>      passive_domain_destroy(v);
> 
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org 
> https://lists.xen.org/xen-devel 




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ping: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-21 13:22 ` Ping: " Jan Beulich
@ 2017-11-21 14:07   ` Igor Druzhinin
  2017-11-21 15:29     ` Jan Beulich
  2017-11-21 16:08   ` George Dunlap
  1 sibling, 1 reply; 17+ messages in thread
From: Igor Druzhinin @ 2017-11-21 14:07 UTC (permalink / raw)
  To: Jan Beulich, George Dunlap, Jun Nakajima, Kevin Tian
  Cc: xen-devel, Julien Grall, Dario Faggioli

On 21/11/17 13:22, Jan Beulich wrote:
>>>> On 09.11.17 at 15:49, <JBeulich@suse.com> wrote:
>> See the code comment being added for why we need this.
>>
>> Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> I realize we aren't settled yet on where to put the sync call. The
> discussion appears to have stalled, though. Just to recap,
> alternatives to the placement below are
> - at the top of complete_domain_destroy(), being the specific
>   RCU callback exhibiting the problem (others are unlikely to
>   touch guest state)
> - in rcu_do_batch(), paralleling the similar call from
>   do_tasklet_work()

rcu_do_batch() sounds better to me. As I said before I think that the
problem is general for the hypervisor (not for VMX only) and might
appear in other places as well.

Those choices that you outlined appear to be different in terms whether
we solve the general problem and probably have some minor performance
impact or we solve the ad-hoc problem but make the system more
entangled. Here I'm more inclined to the first choice because this
particular scenario the performance impact should be negligible.

Igor


> 
> Jan
> 
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -479,7 +479,13 @@ static void vmx_vcpu_destroy(struct vcpu
>>       * we should disable PML manually here. Note that vmx_vcpu_destroy is called
>>       * prior to vmx_domain_destroy so we need to disable PML for each vcpu
>>       * separately here.
>> +     *
>> +     * Before doing that though, flush all state for the vCPU previously having
>> +     * run on the current CPU, so that this flushing of state won't happen from
>> +     * the TLB flush IPI handler behind the back of a vmx_vmcs_enter() /
>> +     * vmx_vmcs_exit() section.
>>       */
>> +    sync_local_execstate();
>>      vmx_vcpu_disable_pml(v);
>>      vmx_destroy_vmcs(v);
>>      passive_domain_destroy(v);
>>
>>
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org 
>> https://lists.xen.org/xen-devel 
> 
> 
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ping: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-21 14:07   ` Igor Druzhinin
@ 2017-11-21 15:29     ` Jan Beulich
  2017-11-21 16:00       ` Igor Druzhinin
                         ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Jan Beulich @ 2017-11-21 15:29 UTC (permalink / raw)
  To: Igor Druzhinin
  Cc: KevinTian, George Dunlap, Julien Grall, Dario Faggioli,
	Jun Nakajima, xen-devel

>>> On 21.11.17 at 15:07, <igor.druzhinin@citrix.com> wrote:
> On 21/11/17 13:22, Jan Beulich wrote:
>>>>> On 09.11.17 at 15:49, <JBeulich@suse.com> wrote:
>>> See the code comment being added for why we need this.
>>>
>>> Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> 
>> I realize we aren't settled yet on where to put the sync call. The
>> discussion appears to have stalled, though. Just to recap,
>> alternatives to the placement below are
>> - at the top of complete_domain_destroy(), being the specific
>>   RCU callback exhibiting the problem (others are unlikely to
>>   touch guest state)
>> - in rcu_do_batch(), paralleling the similar call from
>>   do_tasklet_work()
> 
> rcu_do_batch() sounds better to me. As I said before I think that the
> problem is general for the hypervisor (not for VMX only) and might
> appear in other places as well.

The question here is: In what other cases do we expect an RCU
callback to possibly touch guest state? I think the common use is
to merely free some memory in a delayed fashion.

> Those choices that you outlined appear to be different in terms whether
> we solve the general problem and probably have some minor performance
> impact or we solve the ad-hoc problem but make the system more
> entangled. Here I'm more inclined to the first choice because this
> particular scenario the performance impact should be negligible.

For the problem at hand there's no question about a
performance effect. The question is whether doing this for _other_
RCU callbacks would introduce a performance drop in certain cases.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ping: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-21 15:29     ` Jan Beulich
@ 2017-11-21 16:00       ` Igor Druzhinin
  2017-11-21 16:42       ` Dario Faggioli
  2017-11-21 17:00       ` Sergey Dyasli
  2 siblings, 0 replies; 17+ messages in thread
From: Igor Druzhinin @ 2017-11-21 16:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: KevinTian, George Dunlap, Julien Grall, Dario Faggioli,
	Jun Nakajima, xen-devel

On 21/11/17 15:29, Jan Beulich wrote:
>>>> On 21.11.17 at 15:07, <igor.druzhinin@citrix.com> wrote:
>> On 21/11/17 13:22, Jan Beulich wrote:
>>>>>> On 09.11.17 at 15:49, <JBeulich@suse.com> wrote:
>>>> See the code comment being added for why we need this.
>>>>
>>>> Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>
>>> I realize we aren't settled yet on where to put the sync call. The
>>> discussion appears to have stalled, though. Just to recap,
>>> alternatives to the placement below are
>>> - at the top of complete_domain_destroy(), being the specific
>>>   RCU callback exhibiting the problem (others are unlikely to
>>>   touch guest state)
>>> - in rcu_do_batch(), paralleling the similar call from
>>>   do_tasklet_work()
>>
>> rcu_do_batch() sounds better to me. As I said before I think that the
>> problem is general for the hypervisor (not for VMX only) and might
>> appear in other places as well.
> 
> The question here is: In what other cases do we expect an RCU
> callback to possibly touch guest state? I think the common use is
> to merely free some memory in a delayed fashion.
> 

I don't know for sure what the common scenario is for Xen but drawing
parallels between Linux - you're probably right.

>> Those choices that you outlined appear to be different in terms whether
>> we solve the general problem and probably have some minor performance
>> impact or we solve the ad-hoc problem but make the system more
>> entangled. Here I'm more inclined to the first choice because this
>> particular scenario the performance impact should be negligible.
> 
> For the problem at hand there's no question about a
> performance effect. The question is whether doing this for _other_
> RCU callbacks would introduce a performance drop in certain cases.
> 

Yes, right. In that case this placement would mean we are going to lose
the partial context each time we take RCU in idle, is this correct? If
so that sounds like a common scenario to me and means there will be some
performance degradation, although I don't know how common it really is.

Anyway, if you're in favor of the previous approach I have no objections
as my understanding of Xen codebase is still partial.

Igor


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ping: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-21 15:29     ` Jan Beulich
  2017-11-21 16:00       ` Igor Druzhinin
@ 2017-11-21 16:42       ` Dario Faggioli
  2017-11-21 16:58         ` George Dunlap
  2017-11-21 17:00       ` Sergey Dyasli
  2 siblings, 1 reply; 17+ messages in thread
From: Dario Faggioli @ 2017-11-21 16:42 UTC (permalink / raw)
  To: Jan Beulich, Igor Druzhinin
  Cc: George Dunlap, xen-devel, Julien Grall, KevinTian, Jun Nakajima

[-- Attachment #1.1: Type: text/plain, Size: 2394 bytes --]

On Tue, 2017-11-21 at 08:29 -0700, Jan Beulich wrote:
> > > > On 21.11.17 at 15:07, <igor.druzhinin@citrix.com> wrote:
> > 
> The question here is: In what other cases do we expect an RCU
> callback to possibly touch guest state? I think the common use is
> to merely free some memory in a delayed fashion.
> 
> > Those choices that you outlined appear to be different in terms
> > whether
> > we solve the general problem and probably have some minor
> > performance
> > impact or we solve the ad-hoc problem but make the system more
> > entangled. Here I'm more inclined to the first choice because this
> > particular scenario the performance impact should be negligible.
> 
> For the problem at hand there's no question about a
> performance effect. The question is whether doing this for _other_
> RCU callbacks would introduce a performance drop in certain cases.
> 
Well, I personally favour the approach of making the piece of code that
plays with the context responsible of not messing up when doing so.

And (replying to Igor comment above), I don't think that syncing
context before RCU handlers solves the general problem --as you're
calling it-- of "VMX code asynchronously messing up with the context". 
In fzct, it solves the specific problem of "VMX code called via RCU,
asynchronously messing up with the context".
There may be other places where (VMX?) code messes with context, *not*
from within an RCU handler, and that would still be an issue.

All that being said, given the nature of RCUs themselves, and given the
"precedent" we have for tasklets, I don't think it's a problem to sync
the state in rcu_do_batch().

Looking at users of call_rcu() (and trying to follow the call chains),
I think the only occasion where there may be an impact on perf, would
be when it's used in del_msixtbl_entry() (e.g., when that is called by
msixtbl_pt_unregister())... but I'm not familiar with that area of
code, so I may very well be wrong.

So, to summarize, if it were me doing this, I'd sync either in
vmx_vcpu_destroy() or in complete_domain_destroy(). But (for what it's
worth) I'm fine with it happening in rcu_do_batch().

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ping: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-21 16:42       ` Dario Faggioli
@ 2017-11-21 16:58         ` George Dunlap
  0 siblings, 0 replies; 17+ messages in thread
From: George Dunlap @ 2017-11-21 16:58 UTC (permalink / raw)
  To: Dario Faggioli, Jan Beulich, Igor Druzhinin
  Cc: George Dunlap, xen-devel, Julien Grall, KevinTian, Jun Nakajima

On 11/21/2017 04:42 PM, Dario Faggioli wrote:
> On Tue, 2017-11-21 at 08:29 -0700, Jan Beulich wrote:
>>>>> On 21.11.17 at 15:07, <igor.druzhinin@citrix.com> wrote:
>>>
>> The question here is: In what other cases do we expect an RCU
>> callback to possibly touch guest state? I think the common use is
>> to merely free some memory in a delayed fashion.
>>
>>> Those choices that you outlined appear to be different in terms
>>> whether
>>> we solve the general problem and probably have some minor
>>> performance
>>> impact or we solve the ad-hoc problem but make the system more
>>> entangled. Here I'm more inclined to the first choice because this
>>> particular scenario the performance impact should be negligible.
>>
>> For the problem at hand there's no question about a
>> performance effect. The question is whether doing this for _other_
>> RCU callbacks would introduce a performance drop in certain cases.
>>
> Well, I personally favour the approach of making the piece of code that
> plays with the context responsible of not messing up when doing so.
> 
> And (replying to Igor comment above), I don't think that syncing
> context before RCU handlers solves the general problem --as you're
> calling it-- of "VMX code asynchronously messing up with the context". 
> In fzct, it solves the specific problem of "VMX code called via RCU,
> asynchronously messing up with the context".
> There may be other places where (VMX?) code messes with context, *not*
> from within an RCU handler, and that would still be an issue.

Yes, to expand on what I said earlier: Given that we cannot (at least
between now and the release) make it so that developers *never* have to
think about syncing state, it seems like the best thing to do is to make
coders *always* think about syncing state.  Syncing always in the RCU
handler means coders can get away sometimes without syncing; which makes
it more likely we'll forget in some other circumstance where it matters.

But that's my take on general principles; like Dario I wouldn't argue
too strongly if someone felt differently.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ping: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-21 15:29     ` Jan Beulich
  2017-11-21 16:00       ` Igor Druzhinin
  2017-11-21 16:42       ` Dario Faggioli
@ 2017-11-21 17:00       ` Sergey Dyasli
  2017-11-21 17:26         ` Jan Beulich
  2 siblings, 1 reply; 17+ messages in thread
From: Sergey Dyasli @ 2017-11-21 17:00 UTC (permalink / raw)
  To: JBeulich@suse.com, Igor Druzhinin
  Cc: Sergey Dyasli, Kevin Tian, Andrew Cooper, George Dunlap,
	julien.grall@arm.com, raistlin@linux.it, jun.nakajima@intel.com,
	xen-devel@lists.xenproject.org

On Tue, 2017-11-21 at 08:29 -0700, Jan Beulich wrote:
> > > > On 21.11.17 at 15:07, <igor.druzhinin@citrix.com> wrote:
> > 
> > On 21/11/17 13:22, Jan Beulich wrote:
> > > > > > On 09.11.17 at 15:49, <JBeulich@suse.com> wrote:
> > > > 
> > > > See the code comment being added for why we need this.
> > > > 
> > > > Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> > > > Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > > 
> > > I realize we aren't settled yet on where to put the sync call. The
> > > discussion appears to have stalled, though. Just to recap,
> > > alternatives to the placement below are
> > > - at the top of complete_domain_destroy(), being the specific
> > >   RCU callback exhibiting the problem (others are unlikely to
> > >   touch guest state)
> > > - in rcu_do_batch(), paralleling the similar call from
> > >   do_tasklet_work()
> > 
> > rcu_do_batch() sounds better to me. As I said before I think that the
> > problem is general for the hypervisor (not for VMX only) and might
> > appear in other places as well.
> 
> The question here is: In what other cases do we expect an RCU
> callback to possibly touch guest state? I think the common use is
> to merely free some memory in a delayed fashion.
> 
> > Those choices that you outlined appear to be different in terms whether
> > we solve the general problem and probably have some minor performance
> > impact or we solve the ad-hoc problem but make the system more
> > entangled. Here I'm more inclined to the first choice because this
> > particular scenario the performance impact should be negligible.
> 
> For the problem at hand there's no question about a
> performance effect. The question is whether doing this for _other_
> RCU callbacks would introduce a performance drop in certain cases.

So what are performance implications of my original suggestion of
removing !v->is_running check from vmx_ctxt_switch_from() ?
From what I can see:

1. Another field in struct vcpu will be checked instead (vmcs_pa)
2. Additionally this_cpu(current_vmcs) will be loaded, which shouldn't
   be terrible, given how heavy a context switch already is.

-- 
Thanks,
Sergey
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ping: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-21 17:00       ` Sergey Dyasli
@ 2017-11-21 17:26         ` Jan Beulich
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2017-11-21 17:26 UTC (permalink / raw)
  To: Sergey Dyasli
  Cc: Igor Druzhinin, Kevin Tian, Andrew Cooper, George Dunlap,
	julien.grall@arm.com, raistlin@linux.it, jun.nakajima@intel.com,
	xen-devel@lists.xenproject.org

>>> On 21.11.17 at 18:00, <sergey.dyasli@citrix.com> wrote:
> On Tue, 2017-11-21 at 08:29 -0700, Jan Beulich wrote:
>> > > > On 21.11.17 at 15:07, <igor.druzhinin@citrix.com> wrote:
>> > 
>> > On 21/11/17 13:22, Jan Beulich wrote:
>> > > > > > On 09.11.17 at 15:49, <JBeulich@suse.com> wrote:
>> > > > 
>> > > > See the code comment being added for why we need this.
>> > > > 
>> > > > Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>> > > > Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> > > 
>> > > I realize we aren't settled yet on where to put the sync call. The
>> > > discussion appears to have stalled, though. Just to recap,
>> > > alternatives to the placement below are
>> > > - at the top of complete_domain_destroy(), being the specific
>> > >   RCU callback exhibiting the problem (others are unlikely to
>> > >   touch guest state)
>> > > - in rcu_do_batch(), paralleling the similar call from
>> > >   do_tasklet_work()
>> > 
>> > rcu_do_batch() sounds better to me. As I said before I think that the
>> > problem is general for the hypervisor (not for VMX only) and might
>> > appear in other places as well.
>> 
>> The question here is: In what other cases do we expect an RCU
>> callback to possibly touch guest state? I think the common use is
>> to merely free some memory in a delayed fashion.
>> 
>> > Those choices that you outlined appear to be different in terms whether
>> > we solve the general problem and probably have some minor performance
>> > impact or we solve the ad-hoc problem but make the system more
>> > entangled. Here I'm more inclined to the first choice because this
>> > particular scenario the performance impact should be negligible.
>> 
>> For the problem at hand there's no question about a
>> performance effect. The question is whether doing this for _other_
>> RCU callbacks would introduce a performance drop in certain cases.
> 
> So what are performance implications of my original suggestion of
> removing !v->is_running check from vmx_ctxt_switch_from() ?
> From what I can see:
> 
> 1. Another field in struct vcpu will be checked instead (vmcs_pa)
> 2. Additionally this_cpu(current_vmcs) will be loaded, which shouldn't
>    be terrible, given how heavy a context switch already is.

There are no performance implications afaict; I'm simply of the
opinion this is not the way the issue should be addressed. The
sync approach seems much more natural to me.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ping: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-21 13:22 ` Ping: " Jan Beulich
  2017-11-21 14:07   ` Igor Druzhinin
@ 2017-11-21 16:08   ` George Dunlap
  1 sibling, 0 replies; 17+ messages in thread
From: George Dunlap @ 2017-11-21 16:08 UTC (permalink / raw)
  To: Jan Beulich, Igor Druzhinin, George Dunlap, Jun Nakajima,
	Kevin Tian
  Cc: xen-devel, Julien Grall, Dario Faggioli

On 11/21/2017 01:22 PM, Jan Beulich wrote:
>>>> On 09.11.17 at 15:49, <JBeulich@suse.com> wrote:
>> See the code comment being added for why we need this.
>>
>> Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> I realize we aren't settled yet on where to put the sync call. The
> discussion appears to have stalled, though. Just to recap,
> alternatives to the placement below are
> - at the top of complete_domain_destroy(), being the specific
>   RCU callback exhibiting the problem (others are unlikely to
>   touch guest state)
> - in rcu_do_batch(), paralleling the similar call from
>   do_tasklet_work()

I read through the discussion yesterday without digging into the code.
At the moment, I'd say that specific code needing to touch potentially
non-sync'd state should be marked to sync it, rather than syncing it all
the time.  But I don't have a strong opinion (particularly as I haven't
dug into the code).

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] VMX: sync CPU state upon vCPU destruction
  2017-11-09 14:49 [PATCH] VMX: sync CPU state upon vCPU destruction Jan Beulich
                   ` (2 preceding siblings ...)
  2017-11-21 13:22 ` Ping: " Jan Beulich
@ 2017-11-21 16:26 ` Igor Druzhinin
  3 siblings, 0 replies; 17+ messages in thread
From: Igor Druzhinin @ 2017-11-21 16:26 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Julien Grall, Dario Faggioli, Kevin Tian, Jun Nakajima

On 09/11/17 14:49, Jan Beulich wrote:
> See the code comment being added for why we need this.
> 
> Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -479,7 +479,13 @@ static void vmx_vcpu_destroy(struct vcpu
>       * we should disable PML manually here. Note that vmx_vcpu_destroy is called
>       * prior to vmx_domain_destroy so we need to disable PML for each vcpu
>       * separately here.
> +     *
> +     * Before doing that though, flush all state for the vCPU previously having
> +     * run on the current CPU, so that this flushing of state won't happen from
> +     * the TLB flush IPI handler behind the back of a vmx_vmcs_enter() /
> +     * vmx_vmcs_exit() section.
>       */
> +    sync_local_execstate();
>      vmx_vcpu_disable_pml(v);
>      vmx_destroy_vmcs(v);
>      passive_domain_destroy(v);
> 

Reviewed-by: Igor Druzhinin <igor.druzhinin@citrix.com>

Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-11-21 17:27 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-11-09 14:49 [PATCH] VMX: sync CPU state upon vCPU destruction Jan Beulich
2017-11-09 15:02 ` Dario Faggioli
2017-11-10  8:41 ` Sergey Dyasli
2017-11-10  9:50   ` Dario Faggioli
2017-11-10 10:30   ` Jan Beulich
2017-11-10 14:46     ` Igor Druzhinin
2017-11-13  9:51       ` Jan Beulich
2017-11-21 13:22 ` Ping: " Jan Beulich
2017-11-21 14:07   ` Igor Druzhinin
2017-11-21 15:29     ` Jan Beulich
2017-11-21 16:00       ` Igor Druzhinin
2017-11-21 16:42       ` Dario Faggioli
2017-11-21 16:58         ` George Dunlap
2017-11-21 17:00       ` Sergey Dyasli
2017-11-21 17:26         ` Jan Beulich
2017-11-21 16:08   ` George Dunlap
2017-11-21 16:26 ` Igor Druzhinin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).