* [PATCH] x86/special_insn: reverse __force_order logic
@ 2020-09-01 16:18 Nadav Amit
2020-09-02 12:45 ` hpa
2020-09-02 12:54 ` peterz
0 siblings, 2 replies; 6+ messages in thread
From: Nadav Amit @ 2020-09-01 16:18 UTC (permalink / raw)
To: linux-kernel, Thomas Gleixner
Cc: Nadav Amit, Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin,
Peter Zijlstra, Andy Lutomirski, Kees Cook, stable
From: Nadav Amit <namit@vmware.com>
The __force_order logic seems to be inverted. __force_order is
supposedly used to manipulate the compiler into using its memory
dependency analysis to enforce ordering between CR writes and reads.
Therefore, the memory should behave like the CR itself: when the CR is
read, the memory should be "read" by the inline assembly, making
__force_order an input; when the CR is written, the memory should be
"written", making __force_order an output.

This change should make it possible to remove the "volatile" qualifier
from CR reads in a later patch.

While at it, remove the extra newline from the inline assembly, as it
only confuses GCC when it estimates the cost of the inline assembly.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@vmware.com>
---
Unless I misunderstand the logic, __force_order should also be used by
rdpkru() and wrpkru(), which currently have no dependency on
__force_order. I also did not understand why native_write_cr0() has a
R/W dependency on __force_order, or why native_write_cr4() no longer
has any dependency on __force_order at all.
---
arch/x86/include/asm/special_insns.h | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 5999b0b3dd4a..dff5e5b01a3c 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -24,32 +24,32 @@ void native_write_cr0(unsigned long val);
static inline unsigned long native_read_cr0(void)
{
unsigned long val;
- asm volatile("mov %%cr0,%0\n\t" : "=r" (val), "=m" (__force_order));
+ asm volatile("mov %%cr0,%0" : "=r" (val) : "m" (__force_order));
return val;
}
static __always_inline unsigned long native_read_cr2(void)
{
unsigned long val;
- asm volatile("mov %%cr2,%0\n\t" : "=r" (val), "=m" (__force_order));
+ asm volatile("mov %%cr2,%0" : "=r" (val) : "m" (__force_order));
return val;
}
static __always_inline void native_write_cr2(unsigned long val)
{
- asm volatile("mov %0,%%cr2": : "r" (val), "m" (__force_order));
+ asm volatile("mov %1,%%cr2" : "=m" (__force_order) : "r" (val));
}
static inline unsigned long __native_read_cr3(void)
{
unsigned long val;
- asm volatile("mov %%cr3,%0\n\t" : "=r" (val), "=m" (__force_order));
+ asm volatile("mov %%cr3,%0" : "=r" (val) : "m" (__force_order));
return val;
}
static inline void native_write_cr3(unsigned long val)
{
- asm volatile("mov %0,%%cr3": : "r" (val), "m" (__force_order));
+ asm volatile("mov %1,%%cr3" : "=m" (__force_order) : "r" (val));
}
static inline unsigned long native_read_cr4(void)
@@ -64,10 +64,10 @@ static inline unsigned long native_read_cr4(void)
asm volatile("1: mov %%cr4, %0\n"
"2:\n"
_ASM_EXTABLE(1b, 2b)
- : "=r" (val), "=m" (__force_order) : "0" (0));
+ : "=r" (val) : "m" (__force_order), "0" (0));
#else
/* CR4 always exists on x86_64. */
- asm volatile("mov %%cr4,%0\n\t" : "=r" (val), "=m" (__force_order));
+ asm volatile("mov %%cr4,%0" : "=r" (val) : "m" (__force_order));
#endif
return val;
}
--
2.25.1
* Re: [PATCH] x86/special_insn: reverse __force_order logic
2020-09-01 16:18 [PATCH] x86/special_insn: reverse __force_order logic Nadav Amit
@ 2020-09-02 12:45 ` hpa
2020-09-02 12:54 ` peterz
1 sibling, 0 replies; 6+ messages in thread
From: hpa @ 2020-09-02 12:45 UTC (permalink / raw)
To: Nadav Amit, linux-kernel, Thomas Gleixner
Cc: Nadav Amit, Ingo Molnar, Borislav Petkov, x86, Peter Zijlstra,
Andy Lutomirski, Kees Cook, stable
On September 1, 2020 9:18:57 AM PDT, Nadav Amit <nadav.amit@gmail.com> wrote:
>From: Nadav Amit <namit@vmware.com>
>
>The __force_order logic seems to be inverted. __force_order is
>supposedly used to manipulate the compiler to use its memory
>dependencies analysis to enforce orders between CR writes and reads.
>
> [...]
This seems reasonable to me, and I fully agree with you that the logic
seems to be just plain wrong (and has gone unnoticed since all volatile
operations are strictly ordered anyway), but you had better not remove
the volatile from the cr2 read... unlike the other CRs, that one is
written by hardware, and so it genuinely is volatile.
However, I do believe "=m" is at least theoretically wrong. You are only
writing *part* of the total state represented by this token (you can
think of it as equivalent to a bitop), so it should be "+m".
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
* Re: [PATCH] x86/special_insn: reverse __force_order logic
2020-09-01 16:18 [PATCH] x86/special_insn: reverse __force_order logic Nadav Amit
2020-09-02 12:45 ` hpa
@ 2020-09-02 12:54 ` peterz
2020-09-02 15:32 ` Nadav Amit
1 sibling, 1 reply; 6+ messages in thread
From: peterz @ 2020-09-02 12:54 UTC (permalink / raw)
To: Nadav Amit
Cc: linux-kernel, Thomas Gleixner, Nadav Amit, Ingo Molnar,
Borislav Petkov, x86, H. Peter Anvin, Andy Lutomirski, Kees Cook,
stable
On Tue, Sep 01, 2020 at 09:18:57AM -0700, Nadav Amit wrote:
> Unless I misunderstand the logic, __force_order should also be used by
> rdpkru() and wrpkru() which do not have dependency on __force_order. I
> also did not understand why native_write_cr0() has R/W dependency on
> __force_order, and why native_write_cr4() no longer has any dependency
> on __force_order.
There was a fairly large thread about this thing here:
https://lkml.kernel.org/r/20200527135329.1172644-1-arnd@arndb.de
I didn't keep up, but I think the general consensus was that it's
indeed a bit naff.
* Re: [PATCH] x86/special_insn: reverse __force_order logic
2020-09-02 12:54 ` peterz
@ 2020-09-02 15:32 ` Nadav Amit
2020-09-02 16:56 ` peterz
0 siblings, 1 reply; 6+ messages in thread
From: Nadav Amit @ 2020-09-02 15:32 UTC (permalink / raw)
To: Peter Zijlstra
Cc: LKML, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
H. Peter Anvin, Andy Lutomirski, Kees Cook,
stable@vger.kernel.org
> On Sep 2, 2020, at 5:54 AM, peterz@infradead.org wrote:
>
> On Tue, Sep 01, 2020 at 09:18:57AM -0700, Nadav Amit wrote:
>
>> Unless I misunderstand the logic, __force_order should also be used by
>> rdpkru() and wrpkru() which do not have dependency on __force_order. I
>> also did not understand why native_write_cr0() has R/W dependency on
>> __force_order, and why native_write_cr4() no longer has any dependency
>> on __force_order.
>
> There was a fairly large thread about this thing here:
>
> https://lkml.kernel.org/r/20200527135329.1172644-1-arnd@arndb.de
>
> I didn't keep up, but I think the general concensus was that it's
> indeed a bit naf.
Thanks for the pointer. I did not see the discussion, and
embarrassingly, I have also never figured out how to reply to lkml
emails without subscribing to lkml.
Anyhow, indeed, the patch that Arvind provided seems to address this
issue.
* Re: [PATCH] x86/special_insn: reverse __force_order logic
2020-09-02 15:32 ` Nadav Amit
@ 2020-09-02 16:56 ` peterz
2020-09-02 16:58 ` Nadav Amit
0 siblings, 1 reply; 6+ messages in thread
From: peterz @ 2020-09-02 16:56 UTC (permalink / raw)
To: Nadav Amit
Cc: LKML, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
H. Peter Anvin, Andy Lutomirski, Kees Cook,
stable@vger.kernel.org
On Wed, Sep 02, 2020 at 03:32:18PM +0000, Nadav Amit wrote:
> Thanks for pointer. I did not see the discussion, and embarrassingly, I have
> also never figured out how to reply on lkml emails without registering to
> lkml.
The lore.kernel.org thing I pointed you to allows you to download an
mbox with the email in (see the very bottom of the page). Use your
favorite MUA to open it and reply :-)
* Re: [PATCH] x86/special_insn: reverse __force_order logic
2020-09-02 16:56 ` peterz
@ 2020-09-02 16:58 ` Nadav Amit
0 siblings, 0 replies; 6+ messages in thread
From: Nadav Amit @ 2020-09-02 16:58 UTC (permalink / raw)
To: Peter Zijlstra
Cc: LKML, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
H. Peter Anvin, Andy Lutomirski, Kees Cook,
stable@vger.kernel.org
> On Sep 2, 2020, at 9:56 AM, peterz@infradead.org wrote:
>
> On Wed, Sep 02, 2020 at 03:32:18PM +0000, Nadav Amit wrote:
>
>> Thanks for pointer. I did not see the discussion, and embarrassingly, I have
>> also never figured out how to reply on lkml emails without registering to
>> lkml.
>
> The lore.kernel.org thing I pointed you to allows you to download an
> mbox with the email in (see the very bottom of the page). Use your
> favorite MUA to open it and reply :-)
Thanks!