xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$
@ 2017-05-19 18:49 Andrew Cooper
  2017-05-19 18:49 ` [PATCH for-next 2/2] xen/x86: Drop sync_core() Andrew Cooper
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Andrew Cooper @ 2017-05-19 18:49 UTC (permalink / raw)
  To: Xen-devel; +Cc: Wei Liu, Jan Beulich

From: Borislav Petkov <bp@suse.de>

We use sync_core() in the alternatives code to stop speculative
execution of prefetched instructions because we are potentially changing
them and don't want to execute stale bytes.

What it does on most machines is call CPUID which is a serializing
instruction. And that's expensive.

However, the instruction cache is serialized when we're on the local CPU
and are changing the data through the same virtual address. So then, we
don't need the serializing CPUID but a simple control flow change. Last
being accomplished with a CALL/RET which the noinline causes.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
[Linux commit 34bfab0eaf0fb5c6fb14c6b4013b06cdc7984466]

Ported to Xen.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/alternative.c | 11 ++++++-----
 xen/arch/x86/livepatch.c   | 20 +++++++++++++++-----
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 6eaa10f..65062c2 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -128,13 +128,14 @@ void init_or_livepatch add_nops(void *insns, unsigned int len)
  *
  * You should run this with interrupts disabled or on code that is not
  * executing.
+ *
+ * "noinline" to cause control flow change and thus invalidate I$ and
+ * cause refetch after modification.
  */
-static void *init_or_livepatch text_poke(void *addr, const void *opcode, size_t len)
+static void *init_or_livepatch noinline
+text_poke(void *addr, const void *opcode, size_t len)
 {
-    memcpy(addr, opcode, len);
-    sync_core();
-
-    return addr;
+    return memcpy(addr, opcode, len);
 }
 
 /*
diff --git a/xen/arch/x86/livepatch.c b/xen/arch/x86/livepatch.c
index 9663ef6..dd50dd1 100644
--- a/xen/arch/x86/livepatch.c
+++ b/xen/arch/x86/livepatch.c
@@ -46,7 +46,11 @@ int arch_livepatch_verify_func(const struct livepatch_func *func)
     return 0;
 }
 
-void arch_livepatch_apply(struct livepatch_func *func)
+/*
+ * "noinline" to cause control flow change and thus invalidate I$ and
+ * cause refetch after modification.
+ */
+void noinline arch_livepatch_apply(struct livepatch_func *func)
 {
     uint8_t *old_ptr;
     uint8_t insn[sizeof(func->opaque)];
@@ -75,15 +79,21 @@ void arch_livepatch_apply(struct livepatch_func *func)
     memcpy(old_ptr, insn, len);
 }
 
-void arch_livepatch_revert(const struct livepatch_func *func)
+/*
+ * "noinline" to cause control flow change and thus invalidate I$ and
+ * cause refetch after modification.
+ */
+void noinline arch_livepatch_revert(const struct livepatch_func *func)
 {
     memcpy(func->old_addr, func->opaque, livepatch_insn_len(func));
 }
 
-/* Serialise the CPU pipeline. */
-void arch_livepatch_post_action(void)
+/*
+ * "noinline" to cause control flow change and thus invalidate I$ and
+ * cause refetch after modification.
+ */
+void noinline arch_livepatch_post_action(void)
 {
-    cpuid_eax(0);
 }
 
 static nmi_callback_t *saved_nmi_callback;
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH for-next 2/2] xen/x86: Drop sync_core()
  2017-05-19 18:49 [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$ Andrew Cooper
@ 2017-05-19 18:49 ` Andrew Cooper
  2017-05-22 13:39   ` Jan Beulich
  2017-05-20  0:29 ` [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$ Konrad Rzeszutek Wilk
  2017-05-22 13:38 ` Jan Beulich
  2 siblings, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2017-05-19 18:49 UTC (permalink / raw)
  To: Xen-devel; +Cc: Wei Liu, Jan Beulich

As identified in Linux c/s c198b121b1a1d "x86/asm: Rewrite sync_core() to use
IRET-to-self", sync_core() is only appropriate for two very specific usecases.

Xen doesn't have need of either of these usecases, so drop sync_core() to
avoid any misuse.

In the unlikely event that we do gain a legitimate use for sync_core(), it
should be reintroduced as a mov to %cr2 rather than cpuid, which has a lower
overhead.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/include/asm-x86/processor.h | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 1d1a4ff..6a335d3 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -396,17 +396,6 @@ static inline bool_t read_pkru_wd(uint32_t pkru, unsigned int pkey)
     outb((data), 0x23); \
 } while (0)
 
-/* Stop speculative execution */
-static inline void sync_core(void)
-{
-    int tmp;
-    asm volatile (
-        "cpuid"
-        : "=a" (tmp)
-        : "0" (1)
-        : "ebx","ecx","edx","memory" );
-}
-
 static always_inline void __monitor(const void *eax, unsigned long ecx,
                                     unsigned long edx)
 {
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$
  2017-05-19 18:49 [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$ Andrew Cooper
  2017-05-19 18:49 ` [PATCH for-next 2/2] xen/x86: Drop sync_core() Andrew Cooper
@ 2017-05-20  0:29 ` Konrad Rzeszutek Wilk
  2017-05-22 13:38 ` Jan Beulich
  2 siblings, 0 replies; 9+ messages in thread
From: Konrad Rzeszutek Wilk @ 2017-05-20  0:29 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Jan Beulich, Xen-devel

On Fri, May 19, 2017 at 07:49:06PM +0100, Andrew Cooper wrote:
> From: Borislav Petkov <bp@suse.de>
> 
> We use sync_core() in the alternatives code to stop speculative
> execution of prefetched instructions because we are potentially changing
> them and don't want to execute stale bytes.
> 
> What it does on most machines is call CPUID which is a serializing
> instruction. And that's expensive.
> 
> However, the instruction cache is serialized when we're on the local CPU
> and are changing the data through the same virtual address. So then, we
> don't need the serializing CPUID but a simple control flow change. Last
> being accomplished with a CALL/RET which the noinline causes.
> 
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Reviewed-by: Andy Lutomirski <luto@kernel.org>
> [Linux commit 34bfab0eaf0fb5c6fb14c6b4013b06cdc7984466]
> 
> Ported to Xen.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> ---
>  xen/arch/x86/alternative.c | 11 ++++++-----
>  xen/arch/x86/livepatch.c   | 20 +++++++++++++++-----

Weird that the CC list didn't pick me up.

But Acked-by on the livepatch part.

>  2 files changed, 21 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
> index 6eaa10f..65062c2 100644
> --- a/xen/arch/x86/alternative.c
> +++ b/xen/arch/x86/alternative.c
> @@ -128,13 +128,14 @@ void init_or_livepatch add_nops(void *insns, unsigned int len)
>   *
>   * You should run this with interrupts disabled or on code that is not
>   * executing.
> + *
> + * "noinline" to cause control flow change and thus invalidate I$ and
> + * cause refetch after modification.
>   */
> -static void *init_or_livepatch text_poke(void *addr, const void *opcode, size_t len)
> +static void *init_or_livepatch noinline
> +text_poke(void *addr, const void *opcode, size_t len)
>  {
> -    memcpy(addr, opcode, len);
> -    sync_core();
> -
> -    return addr;
> +    return memcpy(addr, opcode, len);
>  }
>  
>  /*
> diff --git a/xen/arch/x86/livepatch.c b/xen/arch/x86/livepatch.c
> index 9663ef6..dd50dd1 100644
> --- a/xen/arch/x86/livepatch.c
> +++ b/xen/arch/x86/livepatch.c
> @@ -46,7 +46,11 @@ int arch_livepatch_verify_func(const struct livepatch_func *func)
>      return 0;
>  }
>  
> -void arch_livepatch_apply(struct livepatch_func *func)
> +/*
> + * "noinline" to cause control flow change and thus invalidate I$ and
> + * cause refetch after modification.
> + */
> +void noinline arch_livepatch_apply(struct livepatch_func *func)
>  {
>      uint8_t *old_ptr;
>      uint8_t insn[sizeof(func->opaque)];
> @@ -75,15 +79,21 @@ void arch_livepatch_apply(struct livepatch_func *func)
>      memcpy(old_ptr, insn, len);
>  }
>  
> -void arch_livepatch_revert(const struct livepatch_func *func)
> +/*
> + * "noinline" to cause control flow change and thus invalidate I$ and
> + * cause refetch after modification.
> + */
> +void noinline arch_livepatch_revert(const struct livepatch_func *func)
>  {
>      memcpy(func->old_addr, func->opaque, livepatch_insn_len(func));
>  }
>  
> -/* Serialise the CPU pipeline. */
> -void arch_livepatch_post_action(void)
> +/*
> + * "noinline" to cause control flow change and thus invalidate I$ and
> + * cause refetch after modification.
> + */
> +void noinline arch_livepatch_post_action(void)
>  {
> -    cpuid_eax(0);
>  }
>  
>  static nmi_callback_t *saved_nmi_callback;
> -- 
> 2.1.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$
  2017-05-19 18:49 [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$ Andrew Cooper
  2017-05-19 18:49 ` [PATCH for-next 2/2] xen/x86: Drop sync_core() Andrew Cooper
  2017-05-20  0:29 ` [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$ Konrad Rzeszutek Wilk
@ 2017-05-22 13:38 ` Jan Beulich
  2017-05-22 13:51   ` Andrew Cooper
  2 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2017-05-22 13:38 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Xen-devel

>>> On 19.05.17 at 20:49, <andrew.cooper3@citrix.com> wrote:
> We use sync_core() in the alternatives code to stop speculative
> execution of prefetched instructions because we are potentially changing
> them and don't want to execute stale bytes.
> 
> What it does on most machines is call CPUID which is a serializing
> instruction. And that's expensive.
> 
> However, the instruction cache is serialized when we're on the local CPU
> and are changing the data through the same virtual address.

Do you have the background of this "same virtual address"
constraint? Caches are physically indexed, so I don't see the
connection. Yet if there is one, our stub generation in the
emulator may have an issue.

> So then, we
> don't need the serializing CPUID but a simple control flow change. Last
> being accomplished with a CALL/RET which the noinline causes.
> 
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Reviewed-by: Andy Lutomirski <luto@kernel.org>
> [Linux commit 34bfab0eaf0fb5c6fb14c6b4013b06cdc7984466]
> 
> Ported to Xen.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
with a question:

> --- a/xen/arch/x86/alternative.c
> +++ b/xen/arch/x86/alternative.c
> @@ -128,13 +128,14 @@ void init_or_livepatch add_nops(void *insns, unsigned int len)
>   *
>   * You should run this with interrupts disabled or on code that is not
>   * executing.
> + *
> + * "noinline" to cause control flow change and thus invalidate I$ and
> + * cause refetch after modification.
>   */
> -static void *init_or_livepatch text_poke(void *addr, const void *opcode, 
> size_t len)
> +static void *init_or_livepatch noinline
> +text_poke(void *addr, const void *opcode, size_t len)
>  {
> -    memcpy(addr, opcode, len);
> -    sync_core();
> -
> -    return addr;
> +    return memcpy(addr, opcode, len);
>  }

What if this is patching memcpy() itself?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH for-next 2/2] xen/x86: Drop sync_core()
  2017-05-19 18:49 ` [PATCH for-next 2/2] xen/x86: Drop sync_core() Andrew Cooper
@ 2017-05-22 13:39   ` Jan Beulich
  0 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2017-05-22 13:39 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Xen-devel

>>> On 19.05.17 at 20:49, <andrew.cooper3@citrix.com> wrote:
> As identified in Linux c/s c198b121b1a1d "x86/asm: Rewrite sync_core() to use
> IRET-to-self", sync_core() is only appropriate for two very specific usecases.
> 
> Xen doesn't have need of either of these usecases, so drop sync_core() to
> avoid any misuse.
> 
> In the unlikely event that we do gain a legitimate use for sync_core(), it
> should be reintroduced as a mov to %cr2 rather than cpuid, which has a lower
> overhead.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$
  2017-05-22 13:38 ` Jan Beulich
@ 2017-05-22 13:51   ` Andrew Cooper
  2017-05-22 14:17     ` Jan Beulich
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2017-05-22 13:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Xen-devel

On 22/05/17 14:38, Jan Beulich wrote:
>>>> On 19.05.17 at 20:49, <andrew.cooper3@citrix.com> wrote:
>> We use sync_core() in the alternatives code to stop speculative
>> execution of prefetched instructions because we are potentially changing
>> them and don't want to execute stale bytes.
>>
>> What it does on most machines is call CPUID which is a serializing
>> instruction. And that's expensive.
>>
>> However, the instruction cache is serialized when we're on the local CPU
>> and are changing the data through the same virtual address.
> Do you have the background of this "same virtual address"
> constraint?

There was a long LKML thread on the subject. 
https://lkml.org/lkml/2016/12/3/108

> Caches are physically indexed, so I don't see the
> connection. Yet if there is one, our stub generation in the
> emulator may have an issue.

I think https://lkml.org/lkml/2016/12/2/454 is probably the relevant
statement.

>
>> So then, we
>> don't need the serializing CPUID but a simple control flow change. Last
>> being accomplished with a CALL/RET which the noinline causes.
>>
>> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
>> Signed-off-by: Borislav Petkov <bp@suse.de>
>> Reviewed-by: Andy Lutomirski <luto@kernel.org>
>> [Linux commit 34bfab0eaf0fb5c6fb14c6b4013b06cdc7984466]
>>
>> Ported to Xen.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> with a question:
>
>> --- a/xen/arch/x86/alternative.c
>> +++ b/xen/arch/x86/alternative.c
>> @@ -128,13 +128,14 @@ void init_or_livepatch add_nops(void *insns, unsigned int len)
>>   *
>>   * You should run this with interrupts disabled or on code that is not
>>   * executing.
>> + *
>> + * "noinline" to cause control flow change and thus invalidate I$ and
>> + * cause refetch after modification.
>>   */
>> -static void *init_or_livepatch text_poke(void *addr, const void *opcode, 
>> size_t len)
>> +static void *init_or_livepatch noinline
>> +text_poke(void *addr, const void *opcode, size_t len)
>>  {
>> -    memcpy(addr, opcode, len);
>> -    sync_core();
>> -
>> -    return addr;
>> +    return memcpy(addr, opcode, len);
>>  }
> What if this is patching memcpy() itself?

That would be fine, because its still the same linear addresses, and
would still be synchronised.

In reality for livepatching, we only modify the 5 byte prefix at the
head of the functions, so this memcpy() wouldn't actually hit itself.

Of course, you could construct a patch which would cause memcpy() to
actually hit itself, but that isn't going to work at all, irrespective
of serialising instructions.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$
  2017-05-22 13:51   ` Andrew Cooper
@ 2017-05-22 14:17     ` Jan Beulich
  2017-05-22 14:26       ` Juergen Gross
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2017-05-22 14:17 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Xen-devel

>>> On 22.05.17 at 15:51, <andrew.cooper3@citrix.com> wrote:
> On 22/05/17 14:38, Jan Beulich wrote:
>>>>> On 19.05.17 at 20:49, <andrew.cooper3@citrix.com> wrote:
>>> We use sync_core() in the alternatives code to stop speculative
>>> execution of prefetched instructions because we are potentially changing
>>> them and don't want to execute stale bytes.
>>>
>>> What it does on most machines is call CPUID which is a serializing
>>> instruction. And that's expensive.
>>>
>>> However, the instruction cache is serialized when we're on the local CPU
>>> and are changing the data through the same virtual address.
>> Do you have the background of this "same virtual address"
>> constraint?
> 
> There was a long LKML thread on the subject. 
> https://lkml.org/lkml/2016/12/3/108 

Well, interesting reading (and at least part of it was Cc-ed to
xen-devel iirc), but none of it nor ...

>> Caches are physically indexed, so I don't see the
>> connection. Yet if there is one, our stub generation in the
>> emulator may have an issue.
> 
> I think https://lkml.org/lkml/2016/12/2/454 is probably the relevant
> statement.

... this one doesn't give any background at all of why the
virtual address would matter here. Searching the SDM I also can't
find any statement as to virtual or physical indexing being used
for any of the caches.

>>> --- a/xen/arch/x86/alternative.c
>>> +++ b/xen/arch/x86/alternative.c
>>> @@ -128,13 +128,14 @@ void init_or_livepatch add_nops(void *insns, unsigned int len)
>>>   *
>>>   * You should run this with interrupts disabled or on code that is not
>>>   * executing.
>>> + *
>>> + * "noinline" to cause control flow change and thus invalidate I$ and
>>> + * cause refetch after modification.
>>>   */
>>> -static void *init_or_livepatch text_poke(void *addr, const void *opcode, 
>>> size_t len)
>>> +static void *init_or_livepatch noinline
>>> +text_poke(void *addr, const void *opcode, size_t len)
>>>  {
>>> -    memcpy(addr, opcode, len);
>>> -    sync_core();
>>> -
>>> -    return addr;
>>> +    return memcpy(addr, opcode, len);
>>>  }
>> What if this is patching memcpy() itself?
> 
> That would be fine, because its still the same linear addresses, and
> would still be synchronised.
> 
> In reality for livepatching, we only modify the 5 byte prefix at the
> head of the functions, so this memcpy() wouldn't actually hit itself.
> 
> Of course, you could construct a patch which would cause memcpy() to
> actually hit itself, but that isn't going to work at all, irrespective
> of serialising instructions.

Hmm, my question was more from an alternative instruction
patching viewpoint.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$
  2017-05-22 14:17     ` Jan Beulich
@ 2017-05-22 14:26       ` Juergen Gross
  2017-05-22 15:29         ` Jan Beulich
  0 siblings, 1 reply; 9+ messages in thread
From: Juergen Gross @ 2017-05-22 14:26 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: Wei Liu, Xen-devel

On 22/05/17 16:17, Jan Beulich wrote:
>>>> On 22.05.17 at 15:51, <andrew.cooper3@citrix.com> wrote:
>> On 22/05/17 14:38, Jan Beulich wrote:
>>>>>> On 19.05.17 at 20:49, <andrew.cooper3@citrix.com> wrote:
>>>> We use sync_core() in the alternatives code to stop speculative
>>>> execution of prefetched instructions because we are potentially changing
>>>> them and don't want to execute stale bytes.
>>>>
>>>> What it does on most machines is call CPUID which is a serializing
>>>> instruction. And that's expensive.
>>>>
>>>> However, the instruction cache is serialized when we're on the local CPU
>>>> and are changing the data through the same virtual address.
>>> Do you have the background of this "same virtual address"
>>> constraint?
>>
>> There was a long LKML thread on the subject. 
>> https://lkml.org/lkml/2016/12/3/108 
> 
> Well, interesting reading (and at least part of it was Cc-ed to
> xen-devel iirc), but none of it nor ...
> 
>>> Caches are physically indexed, so I don't see the
>>> connection. Yet if there is one, our stub generation in the
>>> emulator may have an issue.
>>
>> I think https://lkml.org/lkml/2016/12/2/454 is probably the relevant
>> statement.
> 
> ... this one doesn't give any background at all of why the
> virtual address would matter here. Searching the SDM I also can't
> find any statement as to virtual or physical indexing being used
> for any of the caches.

SDM Vol. 3 chapter 11.6 (self modifying code):

A write to a memory location in a code segment that is currently cached
in the processor causes the associated
cache line (or lines) to be invalidated. This check is based on the
physical address of the instruction. In addition,
the P6 family and Pentium processors check whether a write to a code
segment may modify an instruction that has
been prefetched for execution. If the write affects a prefetched
instruction, the prefetch queue is invalidated. This
latter check is based on the linear address of the instruction.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$
  2017-05-22 14:26       ` Juergen Gross
@ 2017-05-22 15:29         ` Jan Beulich
  0 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2017-05-22 15:29 UTC (permalink / raw)
  To: Juergen Gross; +Cc: Andrew Cooper, Wei Liu, Xen-devel

>>> On 22.05.17 at 16:26, <jgross@suse.com> wrote:
> On 22/05/17 16:17, Jan Beulich wrote:
>>>>> On 22.05.17 at 15:51, <andrew.cooper3@citrix.com> wrote:
>>> On 22/05/17 14:38, Jan Beulich wrote:
>>>>>>> On 19.05.17 at 20:49, <andrew.cooper3@citrix.com> wrote:
>>>>> We use sync_core() in the alternatives code to stop speculative
>>>>> execution of prefetched instructions because we are potentially changing
>>>>> them and don't want to execute stale bytes.
>>>>>
>>>>> What it does on most machines is call CPUID which is a serializing
>>>>> instruction. And that's expensive.
>>>>>
>>>>> However, the instruction cache is serialized when we're on the local CPU
>>>>> and are changing the data through the same virtual address.
>>>> Do you have the background of this "same virtual address"
>>>> constraint?
>>>
>>> There was a long LKML thread on the subject. 
>>> https://lkml.org/lkml/2016/12/3/108 
>> 
>> Well, interesting reading (and at least part of it was Cc-ed to
>> xen-devel iirc), but none of it nor ...
>> 
>>>> Caches are physically indexed, so I don't see the
>>>> connection. Yet if there is one, our stub generation in the
>>>> emulator may have an issue.
>>>
>>> I think https://lkml.org/lkml/2016/12/2/454 is probably the relevant
>>> statement.
>> 
>> ... this one doesn't give any background at all of why the
>> virtual address would matter here. Searching the SDM I also can't
>> find any statement as to virtual or physical indexing being used
>> for any of the caches.
> 
> SDM Vol. 3 chapter 11.6 (self modifying code):
> 
> A write to a memory location in a code segment that is currently cached
> in the processor causes the associated
> cache line (or lines) to be invalidated. This check is based on the
> physical address of the instruction. In addition,
> the P6 family and Pentium processors check whether a write to a code
> segment may modify an instruction that has
> been prefetched for execution. If the write affects a prefetched
> instruction, the prefetch queue is invalidated. This
> latter check is based on the linear address of the instruction.

Right, but 64-bit processors aren't P6 family (at least per the usual
grouping, where Pentium 4, Xeon, Core, and Atom processors are
all sibling groups to "P6 family", even if the family numbers of all half
way recent processors have been 6, leaving Xeon Phi aside). (That
aside I'd of course expect physical vs virtual indexing information in
e.g. the table titled "Characteristics of the Caches, TLBs, Store
Buffer, and ...")

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-05-22 15:29 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-19 18:49 [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$ Andrew Cooper
2017-05-19 18:49 ` [PATCH for-next 2/2] xen/x86: Drop sync_core() Andrew Cooper
2017-05-22 13:39   ` Jan Beulich
2017-05-20  0:29 ` [PATCH for-next 1/2] xen/x86/alternatives: Do not use sync_core() to serialize I$ Konrad Rzeszutek Wilk
2017-05-22 13:38 ` Jan Beulich
2017-05-22 13:51   ` Andrew Cooper
2017-05-22 14:17     ` Jan Beulich
2017-05-22 14:26       ` Juergen Gross
2017-05-22 15:29         ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).