xen-devel.lists.xenproject.org archive mirror
* Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation
       [not found] ` <0a21157c2233ba7d0781bbf07866b3f2d7e7c25d.1480638597.git.luto@kernel.org>
@ 2016-12-02 11:44   ` Andrew Cooper
       [not found]   ` <a9c41f3a-f649-712e-21bb-a849b0a4de13@citrix.com>
  1 sibling, 0 replies; 8+ messages in thread
From: Andrew Cooper @ 2016-12-02 11:44 UTC (permalink / raw)
  To: Andy Lutomirski, x86
  Cc: Juergen Gross, One Thousand Gnomes, Peter Zijlstra, Brian Gerst,
	linux-kernel@vger.kernel.org, Matthew Whitehead, Borislav Petkov,
	Henrique de Moraes Holschuh, Boris Ostrovsky, Xen-devel List

On 02/12/16 00:35, Andy Lutomirski wrote:
> On Xen PV, CPUID is likely to trap, and Xen hypercalls aren't
> guaranteed to serialize.  (Even CPUID isn't *really* guaranteed to
> serialize on Xen PV, but, in practice, any trap it generates will
> serialize.)

Well, Xen will enable CPUID Faulting wherever it can, which is
realistically all IvyBridge hardware and newer.

All hypercalls are a privilege change to cpl0.  I'd hope this condition
is serialising, but I can't actually find any documentation proving or
disproving this.

>
> On my laptop, CPUID(eax=1, ecx=0) is ~83ns and IRET-to-self is
> ~110ns.  But Xen PV will trap CPUID if possible, so IRET-to-self
> should end up being a nice speedup.
>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Andy Lutomirski <luto@kernel.org>

CC'ing xen-devel and the Xen maintainers in Linux.

As this is the only email from this series in my inbox, I will say this
here, but it should really be against patch 6.

A write to %cr2 is apparently (http://sandpile.org/x86/coherent.htm) not
serialising on the 486, but I don't have a manual to hand to check.

~Andrew

> ---
>  arch/x86/xen/enlighten.c | 35 +++++++++++++++++++++++++++++++++++
>  1 file changed, 35 insertions(+)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index bdd855685403..1f765b41eee7 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -311,6 +311,39 @@ static __read_mostly unsigned int cpuid_leaf1_ecx_set_mask;
>  static __read_mostly unsigned int cpuid_leaf5_ecx_val;
>  static __read_mostly unsigned int cpuid_leaf5_edx_val;
>  
> +static void xen_sync_core(void)
> +{
> +	register void *__sp asm(_ASM_SP);
> +
> +#ifdef CONFIG_X86_32
> +	asm volatile (
> +		"pushl %%ss\n\t"
> +		"pushl %%esp\n\t"
> +		"addl $4, (%%esp)\n\t"
> +		"pushfl\n\t"
> +		"pushl %%cs\n\t"
> +		"pushl $1f\n\t"
> +		"iret\n\t"
> +		"1:"
> +		: "+r" (__sp) : : "cc");
> +#else
> +	unsigned long tmp;
> +
> +	asm volatile (
> +		"movq %%ss, %0\n\t"
> +		"pushq %0\n\t"
> +		"pushq %%rsp\n\t"
> +		"addq $8, (%%rsp)\n\t"
> +		"pushfq\n\t"
> +		"movq %%cs, %0\n\t"
> +		"pushq %0\n\t"
> +		"pushq $1f\n\t"
> +		"iretq\n\t"
> +		"1:"
> +		: "=r" (tmp), "+r" (__sp) : : "cc");
> +#endif
> +}
> +
>  static void xen_cpuid(unsigned int *ax, unsigned int *bx,
>  		      unsigned int *cx, unsigned int *dx)
>  {
> @@ -1289,6 +1322,8 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
>  
>  	.start_context_switch = paravirt_start_context_switch,
>  	.end_context_switch = xen_end_context_switch,
> +
> +	.sync_core = xen_sync_core,
>  };
>  
>  static void xen_reboot(int reason)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation
       [not found]   ` <a9c41f3a-f649-712e-21bb-a849b0a4de13@citrix.com>
@ 2016-12-02 17:07     ` Andy Lutomirski
       [not found]     ` <CALCETrV38BPzsOrhydgUqsQdQy9x2id2myQy+S3V3xUH9zJUdQ@mail.gmail.com>
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Andy Lutomirski @ 2016-12-02 17:07 UTC (permalink / raw)
  To: Andrew Cooper, Linus Torvalds
  Cc: Juergen Gross, One Thousand Gnomes, Peter Zijlstra, Brian Gerst,
	X86 ML, linux-kernel@vger.kernel.org, Matthew Whitehead,
	Borislav Petkov, Henrique de Moraes Holschuh, Boris Ostrovsky,
	Xen-devel List

On Dec 2, 2016 3:44 AM, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>
> On 02/12/16 00:35, Andy Lutomirski wrote:
> > On Xen PV, CPUID is likely to trap, and Xen hypercalls aren't
> > guaranteed to serialize.  (Even CPUID isn't *really* guaranteed to
> > serialize on Xen PV, but, in practice, any trap it generates will
> > serialize.)
>
> Well, Xen will enabled CPUID Faulting wherever it can, which is
> realistically all IvyBridge hardware and newer.
>
> All hypercalls are a privilege change to cpl0.  I'd hope this condition
> is serialising, but I can't actually find any documentation proving or
> disproving this.

I don't know for sure.  IRET is serializing, and if Xen returns using
IRET, we're fine.

>
> >
> > On my laptop, CPUID(eax=1, ecx=0) is ~83ns and IRET-to-self is
> > ~110ns.  But Xen PV will trap CPUID if possible, so IRET-to-self
> > should end up being a nice speedup.
> >
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > Signed-off-by: Andy Lutomirski <luto@kernel.org>
>
> CC'ing xen-devel and the Xen maintainers in Linux.
>
> As this is the only email from this series in my inbox, I will say this
> here, but it should really be against patch 6.
>
> A write to %cr2 is apparently (http://sandpile.org/x86/coherent.htm) not
> serialising on the 486, but I don't have a manual to hand to check.

I'll quote the (modern) SDM.  For self-modifying code: "The use of one
of these options is not required for programs intended to run on the
Pentium or Intel486 processors, but are recommended to ensure
compatibility with the P6 and more recent processor families."  For
cross-modifying code: "The use of this option is not required for
programs intended to run on the Intel486 processor, but is recommended
to ensure compatibility with the Pentium 4, Intel Xeon, P6 family, and
Pentium processors."  So I'm not sure there's a problem.

I can add an unconditional jump just to make sure.  It costs basically
nothing on modern CPUs.  (According to the table, CPUID isn't
serializing on the 486 either.)

--Andy


* Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation
       [not found]     ` <CALCETrV38BPzsOrhydgUqsQdQy9x2id2myQy+S3V3xUH9zJUdQ@mail.gmail.com>
@ 2016-12-02 17:16       ` Andrew Cooper
       [not found]       ` <e944fe00-e568-f258-6f42-5655b1424d7c@citrix.com>
  1 sibling, 0 replies; 8+ messages in thread
From: Andrew Cooper @ 2016-12-02 17:16 UTC (permalink / raw)
  To: Andy Lutomirski, Linus Torvalds
  Cc: Juergen Gross, One Thousand Gnomes, Peter Zijlstra, Brian Gerst,
	X86 ML, linux-kernel@vger.kernel.org, Matthew Whitehead,
	Borislav Petkov, Henrique de Moraes Holschuh, Boris Ostrovsky,
	Xen-devel List

On 02/12/16 17:07, Andy Lutomirski wrote:
> On Dec 2, 2016 3:44 AM, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>> On 02/12/16 00:35, Andy Lutomirski wrote:
>>> On Xen PV, CPUID is likely to trap, and Xen hypercalls aren't
>>> guaranteed to serialize.  (Even CPUID isn't *really* guaranteed to
>>> serialize on Xen PV, but, in practice, any trap it generates will
>>> serialize.)
>> Well, Xen will enabled CPUID Faulting wherever it can, which is
>> realistically all IvyBridge hardware and newer.
>>
>> All hypercalls are a privilege change to cpl0.  I'd hope this condition
>> is serialising, but I can't actually find any documentation proving or
>> disproving this.
> I don't know for sure.  IRET is serializing, and if Xen returns using
> IRET, we're fine.

All returns to a 64-bit PV guest at defined points (hypercall return,
exception entry, etc.) are done with SYSRET, not IRET.

Speaking of which, I still have a patch to remove
PARAVIRT_ADJUST_EXCEPTION_FRAME which I need to complete and send upstream.

>
>>> On my laptop, CPUID(eax=1, ecx=0) is ~83ns and IRET-to-self is
>>> ~110ns.  But Xen PV will trap CPUID if possible, so IRET-to-self
>>> should end up being a nice speedup.
>>>
>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>> CC'ing xen-devel and the Xen maintainers in Linux.
>>
>> As this is the only email from this series in my inbox, I will say this
>> here, but it should really be against patch 6.
>>
>> A write to %cr2 is apparently (http://sandpile.org/x86/coherent.htm) not
>> serialising on the 486, but I don't have a manual to hand to check.
> I'll quote the (modern) SDM.  For self-modifying code "The use of one
> of these options is not required for programs intended to run on the
> Pentium or Intel486 processors,
> but are recommended to ensure compatibility with the P6 and more
> recent processor families.".  For cross-modifying code "The use of
> this option is not required for programs intended to run on the
> Intel486 processor, but is recommended
> to ensure compatibility with the Pentium 4, Intel Xeon, P6 family, and
> Pentium processors."  So I'm not sure there's a problem.

Fair enough.  (Assuming similar properties hold for the older processors
of other vendors.)

~Andrew


* Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation
       [not found]       ` <e944fe00-e568-f258-6f42-5655b1424d7c@citrix.com>
@ 2016-12-02 17:23         ` Andy Lutomirski
       [not found]         ` <CALCETrXO5XsujwLaNTt_U7UF4MDDMwRDPCbGFLn4s7DyWEDtWQ@mail.gmail.com>
  1 sibling, 0 replies; 8+ messages in thread
From: Andy Lutomirski @ 2016-12-02 17:23 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Juergen Gross, One Thousand Gnomes, Peter Zijlstra, Brian Gerst,
	X86 ML, linux-kernel@vger.kernel.org, Matthew Whitehead,
	Borislav Petkov, Henrique de Moraes Holschuh, Boris Ostrovsky,
	Xen-devel List, Linus Torvalds

On Fri, Dec 2, 2016 at 9:16 AM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 02/12/16 17:07, Andy Lutomirski wrote:
>> On Dec 2, 2016 3:44 AM, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>>> On 02/12/16 00:35, Andy Lutomirski wrote:
>>>> On Xen PV, CPUID is likely to trap, and Xen hypercalls aren't
>>>> guaranteed to serialize.  (Even CPUID isn't *really* guaranteed to
>>>> serialize on Xen PV, but, in practice, any trap it generates will
>>>> serialize.)
>>> Well, Xen will enabled CPUID Faulting wherever it can, which is
>>> realistically all IvyBridge hardware and newer.
>>>
>>> All hypercalls are a privilege change to cpl0.  I'd hope this condition
>>> is serialising, but I can't actually find any documentation proving or
>>> disproving this.
>> I don't know for sure.  IRET is serializing, and if Xen returns using
>> IRET, we're fine.
>
> All returns to a 64bit PV guest at defined points (hypercall return,
> exception entry, etc) are from SYSRET, not IRET.

But returning from a CPUID fault isn't like this, right?  That should
be an IRET unless Xen does opportunistic SYSRET.

>
> Talking of, I still have a patch to remove
> PARAVIRT_ADJUST_EXCEPTION_FRAME which I need to complete and send upstream.
>
>>
>>>> On my laptop, CPUID(eax=1, ecx=0) is ~83ns and IRET-to-self is
>>>> ~110ns.  But Xen PV will trap CPUID if possible, so IRET-to-self
>>>> should end up being a nice speedup.
>>>>
>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>>> CC'ing xen-devel and the Xen maintainers in Linux.
>>>
>>> As this is the only email from this series in my inbox, I will say this
>>> here, but it should really be against patch 6.
>>>
>>> A write to %cr2 is apparently (http://sandpile.org/x86/coherent.htm) not
>>> serialising on the 486, but I don't have a manual to hand to check.
>> I'll quote the (modern) SDM.  For self-modifying code "The use of one
>> of these options is not required for programs intended to run on the
>> Pentium or Intel486 processors,
>> but are recommended to ensure compatibility with the P6 and more
>> recent processor families.".  For cross-modifying code "The use of
>> this option is not required for programs intended to run on the
>> Intel486 processor, but is recommended
>> to ensure compatibility with the Pentium 4, Intel Xeon, P6 family, and
>> Pentium processors."  So I'm not sure there's a problem.
>
> Fair enough.  (Assuming similar properties hold for the older processors
> of other vendors.)

No, you were right -- a different section of the SDM contradicts it:

"For Intel486 processors, a write to an instruction in the cache will
modify it in both the cache and memory, but if the instruction was
prefetched before the write, the old version of the instruction could
be the one executed. To prevent the old instruction from being
executed, flush the instruction prefetch unit by coding a jump
instruction immediately after any write that modifies an instruction."

--Andy


* Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation
       [not found]         ` <CALCETrXO5XsujwLaNTt_U7UF4MDDMwRDPCbGFLn4s7DyWEDtWQ@mail.gmail.com>
@ 2016-12-02 17:26           ` Andrew Cooper
  0 siblings, 0 replies; 8+ messages in thread
From: Andrew Cooper @ 2016-12-02 17:26 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Juergen Gross, One Thousand Gnomes, Peter Zijlstra, Brian Gerst,
	X86 ML, linux-kernel@vger.kernel.org, Matthew Whitehead,
	Borislav Petkov, Henrique de Moraes Holschuh, Boris Ostrovsky,
	Xen-devel List, Linus Torvalds

On 02/12/16 17:23, Andy Lutomirski wrote:
> On Fri, Dec 2, 2016 at 9:16 AM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> On 02/12/16 17:07, Andy Lutomirski wrote:
>>> On Dec 2, 2016 3:44 AM, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>>>> On 02/12/16 00:35, Andy Lutomirski wrote:
>>>>> On Xen PV, CPUID is likely to trap, and Xen hypercalls aren't
>>>>> guaranteed to serialize.  (Even CPUID isn't *really* guaranteed to
>>>>> serialize on Xen PV, but, in practice, any trap it generates will
>>>>> serialize.)
>>>> Well, Xen will enabled CPUID Faulting wherever it can, which is
>>>> realistically all IvyBridge hardware and newer.
>>>>
>>>> All hypercalls are a privilege change to cpl0.  I'd hope this condition
>>>> is serialising, but I can't actually find any documentation proving or
>>>> disproving this.
>>> I don't know for sure.  IRET is serializing, and if Xen returns using
>>> IRET, we're fine.
>> All returns to a 64bit PV guest at defined points (hypercall return,
>> exception entry, etc) are from SYSRET, not IRET.
> But CPUID faulting isn't like this, right?  Unless Xen does
> opportunistic SYSRET.

Correct.  Xen doesn't do opportunistic SYSRET.

>
>> Talking of, I still have a patch to remove
>> PARAVIRT_ADJUST_EXCEPTION_FRAME which I need to complete and send upstream.
>>
>>>>> On my laptop, CPUID(eax=1, ecx=0) is ~83ns and IRET-to-self is
>>>>> ~110ns.  But Xen PV will trap CPUID if possible, so IRET-to-self
>>>>> should end up being a nice speedup.
>>>>>
>>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>>>> CC'ing xen-devel and the Xen maintainers in Linux.
>>>>
>>>> As this is the only email from this series in my inbox, I will say this
>>>> here, but it should really be against patch 6.
>>>>
>>>> A write to %cr2 is apparently (http://sandpile.org/x86/coherent.htm) not
>>>> serialising on the 486, but I don't have a manual to hand to check.
>>> I'll quote the (modern) SDM.  For self-modifying code "The use of one
>>> of these options is not required for programs intended to run on the
>>> Pentium or Intel486 processors,
>>> but are recommended to ensure compatibility with the P6 and more
>>> recent processor families.".  For cross-modifying code "The use of
>>> this option is not required for programs intended to run on the
>>> Intel486 processor, but is recommended
>>> to ensure compatibility with the Pentium 4, Intel Xeon, P6 family, and
>>> Pentium processors."  So I'm not sure there's a problem.
>> Fair enough.  (Assuming similar properties hold for the older processors
>> of other vendors.)
> No, you were right -- a different section of the SDM contradicts it:
>
> For Intel486 processors, a write to an instruction in the cache will
> modify it in both the cache and memory, but if
> the instruction was prefetched before the write, the old version of
> the instruction could be the one executed. To
> prevent the old instruction from being executed, flush the instruction
> prefetch unit by coding a jump instruction
> immediately after any write that modifies an instruction.

:(

Presumably this means patching has been subtly broken forever on the 486?

~Andrew


* Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation
       [not found]   ` <a9c41f3a-f649-712e-21bb-a849b0a4de13@citrix.com>
  2016-12-02 17:07     ` Andy Lutomirski
       [not found]     ` <CALCETrV38BPzsOrhydgUqsQdQy9x2id2myQy+S3V3xUH9zJUdQ@mail.gmail.com>
@ 2016-12-02 18:50     ` Boris Ostrovsky
       [not found]     ` <402ae08c-22d3-bec7-6649-26632c941a29@oracle.com>
  2016-12-02 20:09     ` Boris Ostrovsky
  4 siblings, 0 replies; 8+ messages in thread
From: Boris Ostrovsky @ 2016-12-02 18:50 UTC (permalink / raw)
  To: Andrew Cooper, Andy Lutomirski, x86
  Cc: Juergen Gross, One Thousand Gnomes, Peter Zijlstra, Brian Gerst,
	linux-kernel@vger.kernel.org, Matthew Whitehead, Borislav Petkov,
	Henrique de Moraes Holschuh, Xen-devel List

On 12/02/2016 06:44 AM, Andrew Cooper wrote:
> On 02/12/16 00:35, Andy Lutomirski wrote:
>> On Xen PV, CPUID is likely to trap, and Xen hypercalls aren't
>> guaranteed to serialize.  (Even CPUID isn't *really* guaranteed to
>> serialize on Xen PV, but, in practice, any trap it generates will
>> serialize.)
> Well, Xen will enabled CPUID Faulting wherever it can, which is
> realistically all IvyBridge hardware and newer.
>
> All hypercalls are a privilege change to cpl0.  I'd hope this condition
> is serialising, but I can't actually find any documentation proving or
> disproving this.
>
>> On my laptop, CPUID(eax=1, ecx=0) is ~83ns and IRET-to-self is
>> ~110ns.  But Xen PV will trap CPUID if possible, so IRET-to-self
>> should end up being a nice speedup.
>>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> CC'ing xen-devel and the Xen maintainers in Linux.
>
> As this is the only email from this series in my inbox, I will say this
> here, but it should really be against patch 6.
>
> A write to %cr2 is apparently (http://sandpile.org/x86/coherent.htm) not
> serialising on the 486, but I don't have a manual to hand to check.
>
> ~Andrew
>
>> ---
>>  arch/x86/xen/enlighten.c | 35 +++++++++++++++++++++++++++++++++++
>>  1 file changed, 35 insertions(+)
>>
>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>> index bdd855685403..1f765b41eee7 100644
>> --- a/arch/x86/xen/enlighten.c
>> +++ b/arch/x86/xen/enlighten.c
>> @@ -311,6 +311,39 @@ static __read_mostly unsigned int cpuid_leaf1_ecx_set_mask;
>>  static __read_mostly unsigned int cpuid_leaf5_ecx_val;
>>  static __read_mostly unsigned int cpuid_leaf5_edx_val;
>>  
>> +static void xen_sync_core(void)
>> +{
>> +	register void *__sp asm(_ASM_SP);
>> +
>> +#ifdef CONFIG_X86_32
>> +	asm volatile (
>> +		"pushl %%ss\n\t"
>> +		"pushl %%esp\n\t"
>> +		"addl $4, (%%esp)\n\t"
>> +		"pushfl\n\t"
>> +		"pushl %%cs\n\t"
>> +		"pushl $1f\n\t"
>> +		"iret\n\t"
>> +		"1:"
>> +		: "+r" (__sp) : : "cc");

This breaks 32-bit PV guests.

Why are we pushing %ss? We are not changing privilege levels so why not
just flags, cs and eip (which, incidentally, does work)?
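
To make the frame-size point concrete, here is a small model in plain C
(a sketch, not kernel code; the names are made up for illustration).  It
encodes the architectural rule that a 32-bit protected-mode IRET to the
same privilege level pops only EIP, CS and EFLAGS, while an IRET to an
outer privilege level, or any IRETQ in 64-bit mode, pops five slots
including the stack pointer and SS.  Virtual-8086 returns are ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, used only for this illustration. */
enum cpu_mode { MODE_PROT32, MODE_LONG64 };

/* Number of stack slots an IRET consumes in the modelled cases. */
static int iret_slots_popped(enum cpu_mode mode, bool privilege_change)
{
    if (mode == MODE_LONG64)
        return 5;                     /* RIP, CS, RFLAGS, RSP, SS: always */
    return privilege_change ? 5 : 3;  /* same CPL: EIP, CS, EFLAGS only */
}

/* Slots left on the stack after IRET if the caller pushed slots_pushed. */
static int iret_leftover_slots(enum cpu_mode mode, bool privilege_change,
                               int slots_pushed)
{
    int popped = iret_slots_popped(mode, privilege_change);

    assert(slots_pushed >= popped);   /* model only covers over-pushing */
    return slots_pushed - popped;
}
```

With the five-slot frame from the patch, iret_leftover_slots(MODE_PROT32,
false, 5) comes out as 2: the pushed SS and ESP stay on the stack.  With
only flags, cs and eip pushed, nothing is left over.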

-boris

>> +#else
>> +	unsigned long tmp;
>> +
>> +	asm volatile (
>> +		"movq %%ss, %0\n\t"
>> +		"pushq %0\n\t"
>> +		"pushq %%rsp\n\t"
>> +		"addq $8, (%%rsp)\n\t"
>> +		"pushfq\n\t"
>> +		"movq %%cs, %0\n\t"
>> +		"pushq %0\n\t"
>> +		"pushq $1f\n\t"
>> +		"iretq\n\t"
>> +		"1:"
>> +		: "=r" (tmp), "+r" (__sp) : : "cc");
>> +#endif
>> +}
>> +
>>  static void xen_cpuid(unsigned int *ax, unsigned int *bx,
>>  		      unsigned int *cx, unsigned int *dx)
>>  {
>> @@ -1289,6 +1322,8 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
>>  
>>  	.start_context_switch = paravirt_start_context_switch,
>>  	.end_context_switch = xen_end_context_switch,
>> +
>> +	.sync_core = xen_sync_core,
>>  };
>>  
>>  static void xen_reboot(int reason)



* Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation
       [not found]     ` <402ae08c-22d3-bec7-6649-26632c941a29@oracle.com>
@ 2016-12-02 19:34       ` Andy Lutomirski
  0 siblings, 0 replies; 8+ messages in thread
From: Andy Lutomirski @ 2016-12-02 19:34 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Juergen Gross, One Thousand Gnomes, Peter Zijlstra, Andrew Cooper,
	X86 ML, linux-kernel@vger.kernel.org, Matthew Whitehead,
	Borislav Petkov, Henrique de Moraes Holschuh, Brian Gerst,
	Xen-devel List

On Dec 2, 2016 10:48 AM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>
> On 12/02/2016 06:44 AM, Andrew Cooper wrote:
> > On 02/12/16 00:35, Andy Lutomirski wrote:
> >> On Xen PV, CPUID is likely to trap, and Xen hypercalls aren't
> >> guaranteed to serialize.  (Even CPUID isn't *really* guaranteed to
> >> serialize on Xen PV, but, in practice, any trap it generates will
> >> serialize.)
> > Well, Xen will enabled CPUID Faulting wherever it can, which is
> > realistically all IvyBridge hardware and newer.
> >
> > All hypercalls are a privilege change to cpl0.  I'd hope this condition
> > is serialising, but I can't actually find any documentation proving or
> > disproving this.
> >
> >> On my laptop, CPUID(eax=1, ecx=0) is ~83ns and IRET-to-self is
> >> ~110ns.  But Xen PV will trap CPUID if possible, so IRET-to-self
> >> should end up being a nice speedup.
> >>
> >> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> >> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> > CC'ing xen-devel and the Xen maintainers in Linux.
> >
> > As this is the only email from this series in my inbox, I will say this
> > here, but it should really be against patch 6.
> >
> > A write to %cr2 is apparently (http://sandpile.org/x86/coherent.htm) not
> > serialising on the 486, but I don't have a manual to hand to check.
> >
> > ~Andrew
> >
> >> ---
> >>  arch/x86/xen/enlighten.c | 35 +++++++++++++++++++++++++++++++++++
> >>  1 file changed, 35 insertions(+)
> >>
> >> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> >> index bdd855685403..1f765b41eee7 100644
> >> --- a/arch/x86/xen/enlighten.c
> >> +++ b/arch/x86/xen/enlighten.c
> >> @@ -311,6 +311,39 @@ static __read_mostly unsigned int cpuid_leaf1_ecx_set_mask;
> >>  static __read_mostly unsigned int cpuid_leaf5_ecx_val;
> >>  static __read_mostly unsigned int cpuid_leaf5_edx_val;
> >>
> >> +static void xen_sync_core(void)
> >> +{
> >> +    register void *__sp asm(_ASM_SP);
> >> +
> >> +#ifdef CONFIG_X86_32
> >> +    asm volatile (
> >> +            "pushl %%ss\n\t"
> >> +            "pushl %%esp\n\t"
> >> +            "addl $4, (%%esp)\n\t"
> >> +            "pushfl\n\t"
> >> +            "pushl %%cs\n\t"
> >> +            "pushl $1f\n\t"
> >> +            "iret\n\t"
> >> +            "1:"
> >> +            : "+r" (__sp) : : "cc");
>
> This breaks 32-bit PV guests.
>
> Why are we pushing %ss? We are not changing privilege levels so why not
> just flags, cs and eip (which, incidentally, does work)?
>

Doh!  I carefully tested 64-bit on Xen and 32-bit in user mode.
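
For anyone who wants to repeat a user-mode check of the 64-bit sequence,
a minimal sketch follows.  It is not the kernel code from the patch: the
128-byte adjustment to step over the user-space red zone and the
RIP-relative load of the return label (so the file links as PIE) are my
assumptions for running outside the kernel, and the function names are
hypothetical.  On anything other than x86-64 it compiles to a no-op.

```c
#include <assert.h>

/*
 * User-space IRET-to-self sketch (illustrative only).
 * Returns 1 if the IRETQ path was executed, 0 on non-x86-64 builds.
 */
static int iret_to_self(void)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    unsigned long tmp;

    __asm__ volatile (
        "subq $128, %%rsp\n\t"     /* step over the user-space red zone */
        "mov %%ss, %0\n\t"
        "pushq %0\n\t"             /* SS */
        "pushq %%rsp\n\t"          /* RSP as it was before this push */
        "addq $8, (%%rsp)\n\t"     /* adjust to RSP before SS was pushed */
        "pushfq\n\t"               /* RFLAGS */
        "mov %%cs, %0\n\t"
        "pushq %0\n\t"             /* CS */
        "leaq 1f(%%rip), %0\n\t"   /* address of the local label below */
        "pushq %0\n\t"             /* RIP */
        "iretq\n\t"
        "1:\n\t"
        "addq $128, %%rsp"         /* undo the red-zone adjustment */
        : "=&r" (tmp) : : "cc", "memory");
    return 1;
#else
    return 0;
#endif
}

/* Calls iret_to_self() and checks that a stack local survives it. */
static int locals_survive_iret(void)
{
    volatile unsigned long canary = 0x1234abcdUL;

    (void)iret_to_self();
    return canary == 0x1234abcdUL;
}
```

A wrong frame size would typically show up here as a corrupted local or
a crash on return, which is why the check keeps a canary on the stack.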


* Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation
       [not found]   ` <a9c41f3a-f649-712e-21bb-a849b0a4de13@citrix.com>
                       ` (3 preceding siblings ...)
       [not found]     ` <402ae08c-22d3-bec7-6649-26632c941a29@oracle.com>
@ 2016-12-02 20:09     ` Boris Ostrovsky
  4 siblings, 0 replies; 8+ messages in thread
From: Boris Ostrovsky @ 2016-12-02 20:09 UTC (permalink / raw)
  To: Andrew Cooper, Andy Lutomirski, x86
  Cc: Juergen Gross, One Thousand Gnomes, Peter Zijlstra, Brian Gerst,
	linux-kernel@vger.kernel.org, Matthew Whitehead, Borislav Petkov,
	Henrique de Moraes Holschuh, Xen-devel List

On 12/02/2016 06:44 AM, Andrew Cooper wrote:
> On 02/12/16 00:35, Andy Lutomirski wrote:
>> On Xen PV, CPUID is likely to trap, and Xen hypercalls aren't
>> guaranteed to serialize.  (Even CPUID isn't *really* guaranteed to
>> serialize on Xen PV, but, in practice, any trap it generates will
>> serialize.)
> Well, Xen will enabled CPUID Faulting wherever it can, which is
> realistically all IvyBridge hardware and newer.
>
> All hypercalls are a privilege change to cpl0.  I'd hope this condition
> is serialising, but I can't actually find any documentation proving or
> disproving this.
>
>> On my laptop, CPUID(eax=1, ecx=0) is ~83ns and IRET-to-self is
>> ~110ns.  But Xen PV will trap CPUID if possible, so IRET-to-self
>> should end up being a nice speedup.
>>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>

Executing CPUID in an HVM guest is quite expensive since it causes a
VMEXIT.  (That should be true for any hypervisor, at least on Intel;
on AMD, intercepting CPUID is configurable.)
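
A rough way to see this cost from user space is to time CPUID in a loop,
natively and then inside a guest.  The sketch below is only that: the
helper names are hypothetical, it uses CLOCK_MONOTONIC rather than cycle
counters, and it makes no attempt to control for frequency scaling or
interrupts.  On non-x86 builds it returns -1.0.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L    /* for clock_gettime() */
#endif
#include <assert.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
/* Execute CPUID for the given leaf/subleaf; volatile so it is not elided. */
static inline void cpuid_leaf(unsigned int leaf, unsigned int subleaf,
                              unsigned int regs[4])
{
    __asm__ volatile ("cpuid"
                      : "=a" (regs[0]), "=b" (regs[1]),
                        "=c" (regs[2]), "=d" (regs[3])
                      : "a" (leaf), "c" (subleaf)
                      : "memory");
}
#endif

/* Average wall-clock nanoseconds per CPUID(eax=1, ecx=0), or -1.0. */
static double cpuid_avg_ns(int iters)
{
    assert(iters > 0);
#if defined(__x86_64__) || defined(__i386__)
    struct timespec t0, t1;
    unsigned int regs[4];
    int i;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < iters; i++)
        cpuid_leaf(1, 0, regs);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
            (double)(t1.tv_nsec - t0.tv_nsec)) / iters;
#else
    return -1.0;                   /* CPUID does not exist here */
#endif
}
```

Calling cpuid_avg_ns(100000) on bare metal and again in an HVM guest
should make the VMEXIT overhead visible as a much larger per-call figure.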

-boris


Thread overview: 8+ messages
     [not found] <cover.1480638597.git.luto@kernel.org>
     [not found] ` <0a21157c2233ba7d0781bbf07866b3f2d7e7c25d.1480638597.git.luto@kernel.org>
2016-12-02 11:44   ` [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation Andrew Cooper
     [not found]   ` <a9c41f3a-f649-712e-21bb-a849b0a4de13@citrix.com>
2016-12-02 17:07     ` Andy Lutomirski
     [not found]     ` <CALCETrV38BPzsOrhydgUqsQdQy9x2id2myQy+S3V3xUH9zJUdQ@mail.gmail.com>
2016-12-02 17:16       ` Andrew Cooper
     [not found]       ` <e944fe00-e568-f258-6f42-5655b1424d7c@citrix.com>
2016-12-02 17:23         ` Andy Lutomirski
     [not found]         ` <CALCETrXO5XsujwLaNTt_U7UF4MDDMwRDPCbGFLn4s7DyWEDtWQ@mail.gmail.com>
2016-12-02 17:26           ` Andrew Cooper
2016-12-02 18:50     ` Boris Ostrovsky
     [not found]     ` <402ae08c-22d3-bec7-6649-26632c941a29@oracle.com>
2016-12-02 19:34       ` Andy Lutomirski
2016-12-02 20:09     ` Boris Ostrovsky
