xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/7] x86/alternatives: Support for automatic padding calculations
@ 2018-03-07 15:51 Andrew Cooper
  2018-03-07 15:51 ` [PATCH v3 1/7] x86/alt: Drop unused alternative infrastructure Andrew Cooper
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Andrew Cooper @ 2018-03-07 15:51 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

This is the end result of a lot of work I started during the Spectre/Meltdown
embargo window, and deferred because it was taking too long.  It finally
resolves the explict padding calculations for the SPEC_CTRL alternatives.

The only changes from v3 are in patches 5 and 7.

http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/alternatives-v3

Andrew Cooper (7):
  x86/alt: Drop unused alternative infrastructure
  x86/alt: Clean up struct alt_instr and its users
  x86/alt: Clean up the assembly used to generate alternatives
  x86/asm: Remove opencoded uses of altinstruction_entry
  x86/alt: Support for automatic padding calculations
  x86/alt: Drop explicit padding of origin sites
  x86/build: Use new .nops directive when available

 xen/arch/x86/Rules.mk                 |   8 ++
 xen/arch/x86/alternative.c            |  62 ++++++++++----
 xen/arch/x86/x86_64/compat/entry.S    |  26 ++----
 xen/arch/x86/x86_64/entry.S           |  20 +----
 xen/include/asm-x86/alternative-asm.h | 106 ++++++++++++++++-------
 xen/include/asm-x86/alternative.h     | 154 +++++++++++++++++++---------------
 xen/include/asm-x86/asm_defns.h       |  34 +++-----
 xen/include/asm-x86/nops.h            |   7 --
 xen/include/asm-x86/spec_ctrl_asm.h   |  19 ++---
 9 files changed, 250 insertions(+), 186 deletions(-)

-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 1/7] x86/alt: Drop unused alternative infrastructure
  2018-03-07 15:51 [PATCH v3 0/7] x86/alternatives: Support for automatic padding calculations Andrew Cooper
@ 2018-03-07 15:51 ` Andrew Cooper
  2018-03-07 15:51 ` [PATCH v3 2/7] x86/alt: Clean up struct alt_instr and its users Andrew Cooper
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2018-03-07 15:51 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

ALTERNATIVE_3 is more complicated than ALTERNATIVE_2 when it comes to
calculating extra padding length, and we have no need for the complexity.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v2:
 * Retain ASM_OUTPUT2()
---
 xen/include/asm-x86/alternative.h | 22 ----------------------
 1 file changed, 22 deletions(-)

diff --git a/xen/include/asm-x86/alternative.h b/xen/include/asm-x86/alternative.h
index ba537d6..325a29f 100644
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -65,11 +65,6 @@ extern void alternative_instructions(void);
 	ALTERNATIVE(oldinstr, newinstr1, feature1)			  \
 	ALTERNATIVE_N(newinstr2, feature2, 2)
 
-#define ALTERNATIVE_3(oldinstr, newinstr1, feature1, newinstr2, feature2, \
-		      newinstr3, feature3)				  \
-	ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
-	ALTERNATIVE_N(newinstr3, feature3, 3)
-
 /*
  * Alternative instructions for different CPU types or capabilities.
  *
@@ -118,23 +113,6 @@ extern void alternative_instructions(void);
 				   newinstr2, feature2)			\
 		     : output : input)
 
-/*
- * This is similar to alternative_io. But it has three features and
- * respective instructions.
- *
- * If CPU has feature3, newinstr3 is used.
- * Otherwise, if CPU has feature2, newinstr2 is used.
- * Otherwise, if CPU has feature1, newinstr1 is used.
- * Otherwise, oldinstr is used.
- */
-#define alternative_io_3(oldinstr, newinstr1, feature1, newinstr2,	\
-			 feature2, newinstr3, feature3, output,		\
-			 input...)					\
-	asm volatile(ALTERNATIVE_3(oldinstr, newinstr1, feature1,	\
-				   newinstr2, feature2, newinstr3,	\
-				   feature3)				\
-		     : output : input)
-
 /* Use this macro(s) if you need more than one output parameter. */
 #define ASM_OUTPUT2(a...) a
 
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 2/7] x86/alt: Clean up struct alt_instr and its users
  2018-03-07 15:51 [PATCH v3 0/7] x86/alternatives: Support for automatic padding calculations Andrew Cooper
  2018-03-07 15:51 ` [PATCH v3 1/7] x86/alt: Drop unused alternative infrastructure Andrew Cooper
@ 2018-03-07 15:51 ` Andrew Cooper
  2018-03-07 15:51 ` [PATCH v3 3/7] x86/alt: Clean up the assembly used to generate alternatives Andrew Cooper
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2018-03-07 15:51 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

 * Rename some fields for consistency and clarity, and use standard types.
 * Don't opencode the use of ALT_{ORIG,REPL}_PTR().

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v2:
 * Change more types to standard ones
---
 xen/arch/x86/alternative.c        | 24 ++++++++++++------------
 xen/include/asm-x86/alternative.h | 14 +++++++-------
 2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 5c8b6f6..51ca53e 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -163,8 +163,6 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
                                           const struct alt_instr *end)
 {
     const struct alt_instr *a;
-    u8 *instr, *replacement;
-    u8 insnbuf[MAX_PATCH_LEN];
 
     printk(KERN_INFO "alt table %p -> %p\n", start, end);
 
@@ -179,23 +177,25 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
      */
     for ( a = start; a < end; a++ )
     {
-        instr = (u8 *)&a->instr_offset + a->instr_offset;
-        replacement = (u8 *)&a->repl_offset + a->repl_offset;
-        BUG_ON(a->replacementlen > a->instrlen);
-        BUG_ON(a->instrlen > sizeof(insnbuf));
+        uint8_t *orig = ALT_ORIG_PTR(a);
+        uint8_t *repl = ALT_REPL_PTR(a);
+        uint8_t buf[MAX_PATCH_LEN];
+
+        BUG_ON(a->repl_len > a->orig_len);
+        BUG_ON(a->orig_len > sizeof(buf));
         BUG_ON(a->cpuid >= NCAPINTS * 32);
+
         if ( !boot_cpu_has(a->cpuid) )
             continue;
 
-        memcpy(insnbuf, replacement, a->replacementlen);
+        memcpy(buf, repl, a->repl_len);
 
         /* 0xe8/0xe9 are relative branches; fix the offset. */
-        if ( a->replacementlen >= 5 && (*insnbuf & 0xfe) == 0xe8 )
-            *(s32 *)(insnbuf + 1) += replacement - instr;
+        if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
+            *(int32_t *)(buf + 1) += repl - orig;
 
-        add_nops(insnbuf + a->replacementlen,
-                 a->instrlen - a->replacementlen);
-        text_poke(instr, insnbuf, a->instrlen);
+        add_nops(buf + a->repl_len, a->orig_len - a->repl_len);
+        text_poke(orig, buf, a->orig_len);
     }
 }
 
diff --git a/xen/include/asm-x86/alternative.h b/xen/include/asm-x86/alternative.h
index 325a29f..d9706fd 100644
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -9,15 +9,15 @@
 #include <xen/types.h>
 
 struct alt_instr {
-    s32 instr_offset;       /* original instruction */
-    s32 repl_offset;        /* offset to replacement instruction */
-    u16 cpuid;              /* cpuid bit set for replacement */
-    u8  instrlen;           /* length of original instruction */
-    u8  replacementlen;     /* length of new instruction, <= instrlen */
+    int32_t  orig_offset;   /* original instruction */
+    int32_t  repl_offset;   /* offset to replacement instruction */
+    uint16_t cpuid;         /* cpuid bit set for replacement */
+    uint8_t  orig_len;      /* length of original instruction */
+    uint8_t  repl_len;      /* length of new instruction, <= instrlen */
 };
 
-#define __ALT_PTR(a,f)      ((u8 *)((void *)&(a)->f + (a)->f))
-#define ALT_ORIG_PTR(a)     __ALT_PTR(a, instr_offset)
+#define __ALT_PTR(a,f)      ((uint8_t *)((void *)&(a)->f + (a)->f))
+#define ALT_ORIG_PTR(a)     __ALT_PTR(a, orig_offset)
 #define ALT_REPL_PTR(a)     __ALT_PTR(a, repl_offset)
 
 extern void add_nops(void *insns, unsigned int len);
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 3/7] x86/alt: Clean up the assembly used to generate alternatives
  2018-03-07 15:51 [PATCH v3 0/7] x86/alternatives: Support for automatic padding calculations Andrew Cooper
  2018-03-07 15:51 ` [PATCH v3 1/7] x86/alt: Drop unused alternative infrastructure Andrew Cooper
  2018-03-07 15:51 ` [PATCH v3 2/7] x86/alt: Clean up struct alt_instr and its users Andrew Cooper
@ 2018-03-07 15:51 ` Andrew Cooper
  2018-03-07 15:51 ` [PATCH v3 4/7] x86/asm: Remove opencoded uses of altinstruction_entry Andrew Cooper
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2018-03-07 15:51 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

 * On the C side, switch to using local lables rather than hardcoded numbers.
 * Rename parameters and lables to be consistent with alt_instr names, and
   consistent between the the C and asm versions.
 * On the asm side, factor some expressions out into macros to aid clarity.
 * Consistently declare section attributes.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v2:
 * repl_{s,e} => alt_repl_{s,e}
 * Use .LXEN%= prefix for the C lables
v3:
 * Factor our decl_orig() on the ASM side
---
 xen/include/asm-x86/alternative-asm.h | 60 +++++++++++++++++---------------
 xen/include/asm-x86/alternative.h     | 65 +++++++++++++++++++----------------
 2 files changed, 68 insertions(+), 57 deletions(-)

diff --git a/xen/include/asm-x86/alternative-asm.h b/xen/include/asm-x86/alternative-asm.h
index 6640e85..2af4f6b 100644
--- a/xen/include/asm-x86/alternative-asm.h
+++ b/xen/include/asm-x86/alternative-asm.h
@@ -9,60 +9,66 @@
  * enough information for the alternatives patching code to patch an
  * instruction. See apply_alternatives().
  */
-.macro altinstruction_entry orig alt feature orig_len alt_len
+.macro altinstruction_entry orig repl feature orig_len repl_len
     .long \orig - .
-    .long \alt - .
+    .long \repl - .
     .word \feature
     .byte \orig_len
-    .byte \alt_len
+    .byte \repl_len
 .endm
 
+#define decl_orig(insn)         .L\@_orig_s:      insn; .L\@_orig_e:
+#define orig_len               (.L\@_orig_e       -     .L\@_orig_s)
+
+#define decl_repl(insn, nr)     .L\@_repl_s\()nr: insn; .L\@_repl_e\()nr:
+#define repl_len(nr)           (.L\@_repl_e\()nr  -     .L\@_repl_s\()nr)
+
 .macro ALTERNATIVE oldinstr, newinstr, feature
-.Lold_start_\@:
-    \oldinstr
-.Lold_end_\@:
+    decl_orig(\oldinstr)
 
     .pushsection .altinstructions, "a", @progbits
-    altinstruction_entry .Lold_start_\@, .Lnew_start_\@, \feature, \
-        (.Lold_end_\@ - .Lold_start_\@), (.Lnew_end_\@ - .Lnew_start_\@)
+    altinstruction_entry .L\@_orig_s, .L\@_repl_s1, \feature, \
+        orig_len, repl_len(1)
 
     .section .discard, "a", @progbits
     /* Assembler-time check that \newinstr isn't longer than \oldinstr. */
-    .byte 0xff + (.Lnew_end_\@ - .Lnew_start_\@) - (.Lold_end_\@ - .Lold_start_\@)
+    .byte 0xff + repl_len(1) - orig_len
 
     .section .altinstr_replacement, "ax", @progbits
-.Lnew_start_\@:
-    \newinstr
-.Lnew_end_\@:
+
+    decl_repl(\newinstr, 1)
+
     .popsection
 .endm
 
 .macro ALTERNATIVE_2 oldinstr, newinstr1, feature1, newinstr2, feature2
-.Lold_start_\@:
-    \oldinstr
-.Lold_end_\@:
+    decl_orig(\oldinstr)
 
     .pushsection .altinstructions, "a", @progbits
-    altinstruction_entry .Lold_start_\@, .Lnew1_start_\@, \feature1, \
-        (.Lold_end_\@ - .Lold_start_\@), (.Lnew1_end_\@ - .Lnew1_start_\@)
-    altinstruction_entry .Lold_start_\@, .Lnew2_start_\@, \feature2, \
-        (.Lold_end_\@ - .Lold_start_\@), (.Lnew2_end_\@ - .Lnew2_start_\@)
+
+    altinstruction_entry .L\@_orig_s, .L\@_repl_s1, \feature1, \
+        orig_len, repl_len(1)
+    altinstruction_entry .L\@_orig_s, .L\@_repl_s2, \feature2, \
+        orig_len, repl_len(2)
 
     .section .discard, "a", @progbits
     /* Assembler-time check that \newinstr{1,2} aren't longer than \oldinstr. */
-    .byte 0xff + (.Lnew1_end_\@ - .Lnew1_start_\@) - (.Lold_end_\@ - .Lold_start_\@)
-    .byte 0xff + (.Lnew2_end_\@ - .Lnew2_start_\@) - (.Lold_end_\@ - .Lold_start_\@)
+    .byte 0xff + repl_len(1) - orig_len
+    .byte 0xff + repl_len(2) - orig_len
 
     .section .altinstr_replacement, "ax", @progbits
-.Lnew1_start_\@:
-    \newinstr1
-.Lnew1_end_\@:
-.Lnew2_start_\@:
-    \newinstr2
-.Lnew2_end_\@:
+
+    decl_repl(\newinstr1, 1)
+    decl_repl(\newinstr2, 2)
+
     .popsection
 .endm
 
+#undef repl_len
+#undef decl_repl
+#undef orig_len
+#undef decl_orig
+
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_ALTERNATIVE_ASM_H_ */
 
diff --git a/xen/include/asm-x86/alternative.h b/xen/include/asm-x86/alternative.h
index d9706fd..bcad3ee 100644
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -26,44 +26,49 @@ extern void apply_alternatives(const struct alt_instr *start,
                                const struct alt_instr *end);
 extern void alternative_instructions(void);
 
-#define OLDINSTR(oldinstr)      "661:\n\t" oldinstr "\n662:\n"
+#define OLDINSTR(oldinstr) ".LXEN%=_orig_s:\n\t" oldinstr "\n.LXEN%=_orig_e:\n"
 
-#define b_replacement(number)   "663"#number
-#define e_replacement(number)   "664"#number
+#define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
+#define alt_repl_s(num)    ".LXEN%=_repl_s"#num
+#define alt_repl_e(num)    ".LXEN%=_repl_e"#num
+#define alt_repl_len(num)  "(" alt_repl_e(num) " - " alt_repl_s(num) ")"
 
-#define alt_slen "662b-661b"
-#define alt_rlen(number) e_replacement(number)"f-"b_replacement(number)"f"
+#define ALTINSTR_ENTRY(feature, num)                                    \
+        " .long .LXEN%=_orig_s - .\n"             /* label           */ \
+        " .long " alt_repl_s(num)" - .\n"         /* new instruction */ \
+        " .word " __stringify(feature) "\n"       /* feature bit     */ \
+        " .byte " alt_orig_len "\n"               /* source len      */ \
+        " .byte " alt_repl_len(num) "\n"          /* replacement len */
 
-#define ALTINSTR_ENTRY(feature, number)                                       \
-        " .long 661b - .\n"                             /* label           */ \
-        " .long " b_replacement(number)"f - .\n"        /* new instruction */ \
-        " .word " __stringify(feature) "\n"             /* feature bit     */ \
-        " .byte " alt_slen "\n"                         /* source len      */ \
-        " .byte " alt_rlen(number) "\n"                 /* replacement len */
+#define DISCARD_ENTRY(num)                        /* repl <= orig */    \
+        " .byte 0xff + (" alt_repl_len(num) ") - (" alt_orig_len ")\n"
 
-#define DISCARD_ENTRY(number)                           /* rlen <= slen */    \
-        " .byte 0xff + (" alt_rlen(number) ") - (" alt_slen ")\n"
-
-#define ALTINSTR_REPLACEMENT(newinstr, feature, number) /* replacement */     \
-        b_replacement(number)":\n\t" newinstr "\n" e_replacement(number) ":\n\t"
-
-#define ALTERNATIVE_N(newinstr, feature, number)	\
-	".pushsection .altinstructions,\"a\"\n"		\
-	ALTINSTR_ENTRY(feature, number)			\
-	".section .discard,\"a\",@progbits\n"		\
-	DISCARD_ENTRY(number)				\
-	".section .altinstr_replacement, \"ax\"\n"	\
-	ALTINSTR_REPLACEMENT(newinstr, feature, number)	\
-	".popsection\n"
+#define ALTINSTR_REPLACEMENT(newinstr, num)       /* replacement */     \
+        alt_repl_s(num)":\n\t" newinstr "\n" alt_repl_e(num) ":\n\t"
 
 /* alternative assembly primitive: */
-#define ALTERNATIVE(oldinstr, newinstr, feature)			  \
-	OLDINSTR(oldinstr)						  \
-	ALTERNATIVE_N(newinstr, feature, 1)
+#define ALTERNATIVE(oldinstr, newinstr, feature)                        \
+        OLDINSTR(oldinstr)                                              \
+        ".pushsection .altinstructions, \"a\", @progbits\n"             \
+        ALTINSTR_ENTRY(feature, 1)                                      \
+        ".section .discard, \"a\", @progbits\n"                         \
+        DISCARD_ENTRY(1)                                                \
+        ".section .altinstr_replacement, \"ax\", @progbits\n"           \
+        ALTINSTR_REPLACEMENT(newinstr, 1)                               \
+        ".popsection\n"
 
 #define ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
-	ALTERNATIVE(oldinstr, newinstr1, feature1)			  \
-	ALTERNATIVE_N(newinstr2, feature2, 2)
+        OLDINSTR(oldinstr)                                              \
+        ".pushsection .altinstructions, \"a\", @progbits\n"             \
+        ALTINSTR_ENTRY(feature1, 1)                                     \
+        ALTINSTR_ENTRY(feature2, 2)                                     \
+        ".section .discard, \"a\", @progbits\n"                         \
+        DISCARD_ENTRY(1)                                                \
+        DISCARD_ENTRY(2)                                                \
+        ".section .altinstr_replacement, \"ax\", @progbits\n"           \
+        ALTINSTR_REPLACEMENT(newinstr1, 1)                              \
+        ALTINSTR_REPLACEMENT(newinstr2, 2)                              \
+        ".popsection\n"
 
 /*
  * Alternative instructions for different CPU types or capabilities.
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 4/7] x86/asm: Remove opencoded uses of altinstruction_entry
  2018-03-07 15:51 [PATCH v3 0/7] x86/alternatives: Support for automatic padding calculations Andrew Cooper
                   ` (2 preceding siblings ...)
  2018-03-07 15:51 ` [PATCH v3 3/7] x86/alt: Clean up the assembly used to generate alternatives Andrew Cooper
@ 2018-03-07 15:51 ` Andrew Cooper
  2018-03-07 15:51 ` [PATCH v3 5/7] x86/alt: Support for automatic padding calculations Andrew Cooper
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2018-03-07 15:51 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

With future changes, altinstruction_entry is going to become more complicated
to use.  Furthermore, there are already ALTERNATIVE* macros which can be used
to avoid opencoding the creation of replacement information.

For ASM_STAC, ASM_CLAC and CR4_PV32_RESTORE, this means the removal of all
hardocded label numbers.  For the cr4_pv32 alternatives, this means hardcoding
the extra space required in the original patch site, but the hardcoding will
be removed by a later patch.

No change to any functionality, but the handling of nops inside the original
patch sites are a bit different.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/x86_64/compat/entry.S | 26 +++++++++-----------------
 xen/arch/x86/x86_64/entry.S        | 20 +++-----------------
 xen/include/asm-x86/asm_defns.h    | 34 ++++++++++++----------------------
 3 files changed, 24 insertions(+), 56 deletions(-)

diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index bf3a7ac..84f5eb1 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -111,13 +111,10 @@ ENTRY(compat_restore_all_guest)
         ASSERT_INTERRUPTS_DISABLED
         mov   $~(X86_EFLAGS_IOPL|X86_EFLAGS_NT|X86_EFLAGS_VM),%r11d
         and   UREGS_eflags(%rsp),%r11d
-.Lcr4_orig:
-        .skip .Lcr4_alt_end - .Lcr4_alt, 0x90
-.Lcr4_orig_end:
-        .pushsection .altinstr_replacement, "ax"
-.Lcr4_alt:
+
+.macro alt_cr4_pv32
         testb $3,UREGS_cs(%rsp)
-        jpe   .Lcr4_alt_end
+        jpe   2f
         mov   CPUINFO_cr4-CPUINFO_guest_cpu_user_regs(%rsp), %rax
         and   $~XEN_CR4_PV32_BITS, %rax
 1:
@@ -135,17 +132,12 @@ ENTRY(compat_restore_all_guest)
          */
         cmp   %rax, CPUINFO_cr4-CPUINFO_guest_cpu_user_regs(%rsp)
         jne   1b
-.Lcr4_alt_end:
-        .section .altinstructions, "a"
-        altinstruction_entry .Lcr4_orig, .Lcr4_orig, X86_FEATURE_ALWAYS, \
-                             (.Lcr4_orig_end - .Lcr4_orig), 0
-        altinstruction_entry .Lcr4_orig, .Lcr4_alt, X86_FEATURE_XEN_SMEP, \
-                             (.Lcr4_orig_end - .Lcr4_orig), \
-                             (.Lcr4_alt_end - .Lcr4_alt)
-        altinstruction_entry .Lcr4_orig, .Lcr4_alt, X86_FEATURE_XEN_SMAP, \
-                             (.Lcr4_orig_end - .Lcr4_orig), \
-                             (.Lcr4_alt_end - .Lcr4_alt)
-        .popsection
+2:
+.endm
+	ALTERNATIVE_2 ".skip 45, 0x90", \
+            alt_cr4_pv32, X86_FEATURE_XEN_SMEP, \
+            alt_cr4_pv32, X86_FEATURE_XEN_SMAP
+
         or    $X86_EFLAGS_IF,%r11
         mov   %r11d,UREGS_eflags(%rsp)
 
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 6493796..207c1e0 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -601,23 +601,9 @@ handle_exception_saved:
         testb $X86_EFLAGS_IF>>8,UREGS_eflags+1(%rsp)
         jz    exception_with_ints_disabled
 
-.Lcr4_pv32_orig:
-        jmp   .Lcr4_pv32_done
-        .skip (.Lcr4_pv32_alt_end - .Lcr4_pv32_alt) - (. - .Lcr4_pv32_orig), 0xcc
-        .pushsection .altinstr_replacement, "ax"
-.Lcr4_pv32_alt:
-        mov   VCPU_domain(%rbx),%rax
-.Lcr4_pv32_alt_end:
-        .section .altinstructions, "a"
-        altinstruction_entry .Lcr4_pv32_orig, .Lcr4_pv32_alt, \
-                             X86_FEATURE_XEN_SMEP, \
-                             (.Lcr4_pv32_alt_end - .Lcr4_pv32_alt), \
-                             (.Lcr4_pv32_alt_end - .Lcr4_pv32_alt)
-        altinstruction_entry .Lcr4_pv32_orig, .Lcr4_pv32_alt, \
-                             X86_FEATURE_XEN_SMAP, \
-                             (.Lcr4_pv32_alt_end - .Lcr4_pv32_alt), \
-                             (.Lcr4_pv32_alt_end - .Lcr4_pv32_alt)
-        .popsection
+        ALTERNATIVE_2 "jmp .Lcr4_pv32_done; .skip 2, 0x90", \
+            __stringify(mov VCPU_domain(%rbx), %rax), X86_FEATURE_XEN_SMEP, \
+            __stringify(mov VCPU_domain(%rbx), %rax), X86_FEATURE_XEN_SMAP
 
         testb $3,UREGS_cs(%rsp)
         jz    .Lcr4_pv32_done
diff --git a/xen/include/asm-x86/asm_defns.h b/xen/include/asm-x86/asm_defns.h
index cc5ec65..f8128c0 100644
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -195,28 +195,18 @@ void ret_from_intr(void);
 #define __ASM_STAC      .byte 0x0f,0x01,0xcb
 
 #ifdef __ASSEMBLY__
-#define ASM_AC(op)                                                     \
-        661: ASM_NOP3;                                                 \
-        .pushsection .altinstr_replacement, "ax";                      \
-        662: __ASM_##op;                                               \
-        .popsection;                                                   \
-        .pushsection .altinstructions, "a";                            \
-        altinstruction_entry 661b, 661b, X86_FEATURE_ALWAYS, 3, 0;     \
-        altinstruction_entry 661b, 662b, X86_FEATURE_XEN_SMAP, 3, 3;       \
-        .popsection
-
-#define ASM_STAC ASM_AC(STAC)
-#define ASM_CLAC ASM_AC(CLAC)
-
-#define CR4_PV32_RESTORE                                           \
-        667: ASM_NOP5;                                             \
-        .pushsection .altinstr_replacement, "ax";                  \
-        668: call cr4_pv32_restore;                                \
-        .section .altinstructions, "a";                            \
-        altinstruction_entry 667b, 667b, X86_FEATURE_ALWAYS, 5, 0; \
-        altinstruction_entry 667b, 668b, X86_FEATURE_XEN_SMEP, 5, 5;   \
-        altinstruction_entry 667b, 668b, X86_FEATURE_XEN_SMAP, 5, 5;   \
-        .popsection
+#define ASM_STAC                                        \
+    ALTERNATIVE __stringify(ASM_NOP3),                  \
+        __stringify(__ASM_STAC), X86_FEATURE_XEN_SMAP
+
+#define ASM_CLAC                                        \
+    ALTERNATIVE __stringify(ASM_NOP3),                  \
+        __stringify(__ASM_CLAC), X86_FEATURE_XEN_SMAP
+
+#define CR4_PV32_RESTORE                                \
+    ALTERNATIVE_2 __stringify(ASM_NOP5),                \
+        "call cr4_pv32_restore", X86_FEATURE_XEN_SMEP,  \
+        "call cr4_pv32_restore", X86_FEATURE_XEN_SMAP
 
 #else
 static always_inline void clac(void)
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 5/7] x86/alt: Support for automatic padding calculations
  2018-03-07 15:51 [PATCH v3 0/7] x86/alternatives: Support for automatic padding calculations Andrew Cooper
                   ` (3 preceding siblings ...)
  2018-03-07 15:51 ` [PATCH v3 4/7] x86/asm: Remove opencoded uses of altinstruction_entry Andrew Cooper
@ 2018-03-07 15:51 ` Andrew Cooper
  2018-03-08 15:42   ` Jan Beulich
  2018-03-07 15:51 ` [PATCH v3 6/7] x86/alt: Drop explicit padding of origin sites Andrew Cooper
  2018-03-07 15:51 ` [PATCH v3 7/7] x86/build: Use new .nops directive when available Andrew Cooper
  6 siblings, 1 reply; 15+ messages in thread
From: Andrew Cooper @ 2018-03-07 15:51 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Jan Beulich

The correct amount of padding in an origin patch site can be calculated
automatically, based on the relative lengths of the replacements.

This requires a bit of trickery to calculate correctly, especially in the
ALTENRATIVE_2 case where a branchless max() calculation in needed.  The
calculation is further complicated because GAS's idea of true is -1 rather
than 1, which is why the extra negations are required.

Additionally, have apply_alternatives() attempt to optimise the padding nops.
This is complicated by the fact that we must not attempt to optimise nops over
an origin site which has already been modified.

To keep track of this, add a priv field to struct alt_instr, which gets
modified by apply_alternatives().  One extra requirement is that alt_instr's
referring to the same origin site must now be consecutive, but we already have
this property.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>

v2:
 * Fix build with Clang
v3:
 * s/GCC/GAS/ for the NEGATIVE_TRUE comments
 * Factor out OLDINSTR() generation on the C side
 * Introduce alt_instr->priv and use it to avoid repeatedly optimising the
   alternatives.
---
 xen/arch/x86/Rules.mk                 |  4 +++
 xen/arch/x86/alternative.c            | 45 +++++++++++++++++++++++-----
 xen/include/asm-x86/alternative-asm.h | 56 +++++++++++++++++++++++++++--------
 xen/include/asm-x86/alternative.h     | 50 ++++++++++++++++++++++++-------
 4 files changed, 124 insertions(+), 31 deletions(-)

diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
index a29ab22..ac585a3 100644
--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -25,6 +25,10 @@ $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1", \
                      '-D__OBJECT_LABEL__=$(subst $(BASEDIR)/,,$(CURDIR))/$$@')
 $(call as-option-add,CFLAGS,CC,"invpcid (%rax)$$(comma)%rax",-DHAVE_AS_INVPCID)
 
+# GAS's idea of true is -1.  Clang's idea is 1
+$(call as-option-add,CFLAGS,CC,\
+    ".if ((1 > 0) < 0); .error \"\";.endif",,-DHAVE_AS_NEGATIVE_TRUE)
+
 CFLAGS += -mno-red-zone -fpic -fno-asynchronous-unwind-tables
 
 # Xen doesn't use SSE interally.  If the compiler supports it, also skip the
diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 51ca53e..93adf56 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -159,10 +159,10 @@ text_poke(void *addr, const void *opcode, size_t len)
  * APs have less capabilities than the boot processor are not handled.
  * Tough. Make sure you disable such features by hand.
  */
-void init_or_livepatch apply_alternatives(const struct alt_instr *start,
-                                          const struct alt_instr *end)
+void init_or_livepatch apply_alternatives(struct alt_instr *start,
+                                          struct alt_instr *end)
 {
-    const struct alt_instr *a;
+    struct alt_instr *a, *base;
 
     printk(KERN_INFO "alt table %p -> %p\n", start, end);
 
@@ -175,18 +175,47 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
      * So be careful if you want to change the scan order to any other
      * order.
      */
-    for ( a = start; a < end; a++ )
+    for ( a = base = start; a < end; a++ )
     {
         uint8_t *orig = ALT_ORIG_PTR(a);
         uint8_t *repl = ALT_REPL_PTR(a);
         uint8_t buf[MAX_PATCH_LEN];
+        unsigned int total_len = a->orig_len + a->pad_len;
 
-        BUG_ON(a->repl_len > a->orig_len);
-        BUG_ON(a->orig_len > sizeof(buf));
+        BUG_ON(a->repl_len > total_len);
+        BUG_ON(total_len > sizeof(buf));
         BUG_ON(a->cpuid >= NCAPINTS * 32);
 
+        /*
+         * Detect sequences of alt_instr's patching the same origin site, and
+         * keep base pointing at the first alt_instr entry.  This is so we can
+         * refer to a single ->priv field for patching decisions.
+         *
+         * ->priv being nonzero means that the origin site has already been
+         * modified, and we shouldn't try to optimise the nops again.
+         */
+        if ( ALT_ORIG_PTR(base) != orig )
+            base = a;
+
+        /* If there is no replacement to make, see about optimising the nops. */
         if ( !boot_cpu_has(a->cpuid) )
+        {
+            /* Origin site site already touched?  Don't nop anything. */
+            if ( base->priv )
+                continue;
+
+            base->priv = 1;
+
+            /* Nothing useful to do? */
+            if ( a->pad_len <= 1 )
+                continue;
+
+            add_nops(buf, a->pad_len);
+            text_poke(orig + a->orig_len, buf, a->pad_len);
             continue;
+        }
+
+        base->priv = 1;
 
         memcpy(buf, repl, a->repl_len);
 
@@ -194,8 +223,8 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
         if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
             *(int32_t *)(buf + 1) += repl - orig;
 
-        add_nops(buf + a->repl_len, a->orig_len - a->repl_len);
-        text_poke(orig, buf, a->orig_len);
+        add_nops(buf + a->repl_len, total_len - a->repl_len);
+        text_poke(orig, buf, total_len);
     }
 }
 
diff --git a/xen/include/asm-x86/alternative-asm.h b/xen/include/asm-x86/alternative-asm.h
index 2af4f6b..0b61516 100644
--- a/xen/include/asm-x86/alternative-asm.h
+++ b/xen/include/asm-x86/alternative-asm.h
@@ -9,30 +9,53 @@
  * enough information for the alternatives patching code to patch an
  * instruction. See apply_alternatives().
  */
-.macro altinstruction_entry orig repl feature orig_len repl_len
+.macro altinstruction_entry orig repl feature orig_len repl_len pad_len
     .long \orig - .
     .long \repl - .
     .word \feature
     .byte \orig_len
     .byte \repl_len
+    .byte \pad_len
+    .byte 0 /* priv */
 .endm
 
-#define decl_orig(insn)         .L\@_orig_s:      insn; .L\@_orig_e:
+/* GAS's idea of true is -1, while Clang's idea is 1. */
+#ifdef HAVE_AS_NEGATIVE_TRUE
+# define as_true(x) (-(x))
+#else
+# define as_true(x) (x)
+#endif
+
+#define decl_orig(insn, padding)                  \
+ .L\@_orig_s: insn; .L\@_orig_e:                  \
+ .L\@_diff = padding;                             \
+ .skip as_true(.L\@_diff > 0) * .L\@_diff, 0x90;  \
+ .L\@_orig_p:
+
 #define orig_len               (.L\@_orig_e       -     .L\@_orig_s)
+#define pad_len                (.L\@_orig_p       -     .L\@_orig_e)
+#define total_len              (.L\@_orig_p       -     .L\@_orig_s)
 
 #define decl_repl(insn, nr)     .L\@_repl_s\()nr: insn; .L\@_repl_e\()nr:
 #define repl_len(nr)           (.L\@_repl_e\()nr  -     .L\@_repl_s\()nr)
 
+#define as_max(a, b)           ((a) ^ (((a) ^ (b)) & -as_true((a) < (b))))
+
 .macro ALTERNATIVE oldinstr, newinstr, feature
-    decl_orig(\oldinstr)
+    decl_orig(\oldinstr, repl_len(1) - orig_len)
 
     .pushsection .altinstructions, "a", @progbits
     altinstruction_entry .L\@_orig_s, .L\@_repl_s1, \feature, \
-        orig_len, repl_len(1)
+        orig_len, repl_len(1), pad_len
 
     .section .discard, "a", @progbits
-    /* Assembler-time check that \newinstr isn't longer than \oldinstr. */
-    .byte 0xff + repl_len(1) - orig_len
+    /*
+     * Assembler-time checks:
+     *   - total_len <= 255
+     *   - \newinstr <= total_len
+     */
+    .byte total_len
+    .byte 0xff + repl_len(1) - total_len
 
     .section .altinstr_replacement, "ax", @progbits
 
@@ -42,19 +65,24 @@
 .endm
 
 .macro ALTERNATIVE_2 oldinstr, newinstr1, feature1, newinstr2, feature2
-    decl_orig(\oldinstr)
+    decl_orig(\oldinstr, as_max(repl_len(1), repl_len(2)) - orig_len)
 
     .pushsection .altinstructions, "a", @progbits
 
     altinstruction_entry .L\@_orig_s, .L\@_repl_s1, \feature1, \
-        orig_len, repl_len(1)
+        orig_len, repl_len(1), pad_len
     altinstruction_entry .L\@_orig_s, .L\@_repl_s2, \feature2, \
-        orig_len, repl_len(2)
+        orig_len, repl_len(2), pad_len
 
     .section .discard, "a", @progbits
-    /* Assembler-time check that \newinstr{1,2} aren't longer than \oldinstr. */
-    .byte 0xff + repl_len(1) - orig_len
-    .byte 0xff + repl_len(2) - orig_len
+    /*
+     * Assembler-time checks:
+     *   - total_len <= 255
+     *   - \newinstr* <= total_len
+     */
+    .byte total_len
+    .byte 0xff + repl_len(1) - total_len
+    .byte 0xff + repl_len(2) - total_len
 
     .section .altinstr_replacement, "ax", @progbits
 
@@ -64,10 +92,14 @@
     .popsection
 .endm
 
+#undef as_max
 #undef repl_len
 #undef decl_repl
+#undef total_len
+#undef pad_len
 #undef orig_len
 #undef decl_orig
+#undef as_true
 
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_ALTERNATIVE_ASM_H_ */
diff --git a/xen/include/asm-x86/alternative.h b/xen/include/asm-x86/alternative.h
index bcad3ee..4803368 100644
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -8,12 +8,14 @@
 #include <xen/stringify.h>
 #include <xen/types.h>
 
-struct alt_instr {
+struct __packed alt_instr {
     int32_t  orig_offset;   /* original instruction */
     int32_t  repl_offset;   /* offset to replacement instruction */
     uint16_t cpuid;         /* cpuid bit set for replacement */
     uint8_t  orig_len;      /* length of original instruction */
-    uint8_t  repl_len;      /* length of new instruction, <= instrlen */
+    uint8_t  repl_len;      /* length of new instruction */
+    uint8_t  pad_len;       /* length of build-time padding */
+    uint8_t  priv;          /* Private, for use by apply_alternatives() */
 };
 
 #define __ALT_PTR(a,f)      ((uint8_t *)((void *)&(a)->f + (a)->f))
@@ -22,47 +24,73 @@ struct alt_instr {
 
 extern void add_nops(void *insns, unsigned int len);
 /* Similar to alternative_instructions except it can be run with IRQs enabled. */
-extern void apply_alternatives(const struct alt_instr *start,
-                               const struct alt_instr *end);
+extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
 extern void alternative_instructions(void);
 
-#define OLDINSTR(oldinstr) ".LXEN%=_orig_s:\n\t" oldinstr "\n.LXEN%=_orig_e:\n"
-
 #define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
+#define alt_pad_len        "(.LXEN%=_orig_p - .LXEN%=_orig_e)"
+#define alt_total_len      "(.LXEN%=_orig_p - .LXEN%=_orig_s)"
 #define alt_repl_s(num)    ".LXEN%=_repl_s"#num
 #define alt_repl_e(num)    ".LXEN%=_repl_e"#num
 #define alt_repl_len(num)  "(" alt_repl_e(num) " - " alt_repl_s(num) ")"
 
+/* GAS's idea of true is -1, while Clang's idea is 1. */
+#ifdef HAVE_AS_NEGATIVE_TRUE
+# define AS_TRUE "-"
+#else
+# define AS_TRUE ""
+#endif
+
+#define as_max(a, b) "(("a") ^ ((("a") ^ ("b")) & -("AS_TRUE"(("a") < ("b")))))"
+
+#define OLDINSTR(oldinstr, padding)                              \
+    ".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"      \
+    ".LXEN%=_diff = " padding "\n\t"                             \
+    ".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
+    ".LXEN%=_orig_p:\n\t"
+
+#define OLDINSTR_1(oldinstr, n1)                                 \
+    OLDINSTR(oldinstr, alt_repl_len(n1) "-" alt_orig_len)
+
+#define OLDINSTR_2(oldinstr, n1, n2)                             \
+    OLDINSTR(oldinstr,                                           \
+             as_max((alt_repl_len(n1),                           \
+                     alt_repl_len(n2)) "-" alt_orig_len))
+
 #define ALTINSTR_ENTRY(feature, num)                                    \
         " .long .LXEN%=_orig_s - .\n"             /* label           */ \
         " .long " alt_repl_s(num)" - .\n"         /* new instruction */ \
         " .word " __stringify(feature) "\n"       /* feature bit     */ \
         " .byte " alt_orig_len "\n"               /* source len      */ \
-        " .byte " alt_repl_len(num) "\n"          /* replacement len */
+        " .byte " alt_repl_len(num) "\n"          /* replacement len */ \
+        " .byte " alt_pad_len "\n"                /* padding len     */ \
+        " .byte 0\n"                              /* priv            */
 
-#define DISCARD_ENTRY(num)                        /* repl <= orig */    \
-        " .byte 0xff + (" alt_repl_len(num) ") - (" alt_orig_len ")\n"
+#define DISCARD_ENTRY(num)                        /* repl <= total */   \
+        " .byte 0xff + (" alt_repl_len(num) ") - (" alt_total_len ")\n"
 
 #define ALTINSTR_REPLACEMENT(newinstr, num)       /* replacement */     \
         alt_repl_s(num)":\n\t" newinstr "\n" alt_repl_e(num) ":\n\t"
 
 /* alternative assembly primitive: */
 #define ALTERNATIVE(oldinstr, newinstr, feature)                        \
-        OLDINSTR(oldinstr)                                              \
+        OLDINSTR_1(oldinstr, 1)                                         \
         ".pushsection .altinstructions, \"a\", @progbits\n"             \
         ALTINSTR_ENTRY(feature, 1)                                      \
         ".section .discard, \"a\", @progbits\n"                         \
+        ".byte " alt_total_len "\n" /* total_len <= 255 */              \
         DISCARD_ENTRY(1)                                                \
         ".section .altinstr_replacement, \"ax\", @progbits\n"           \
         ALTINSTR_REPLACEMENT(newinstr, 1)                               \
         ".popsection\n"
 
 #define ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
-        OLDINSTR(oldinstr)                                              \
+        OLDINSTR_2(oldinstr, 1, 2)                                      \
         ".pushsection .altinstructions, \"a\", @progbits\n"             \
         ALTINSTR_ENTRY(feature1, 1)                                     \
         ALTINSTR_ENTRY(feature2, 2)                                     \
         ".section .discard, \"a\", @progbits\n"                         \
+        ".byte " alt_total_len "\n" /* total_len <= 255 */              \
         DISCARD_ENTRY(1)                                                \
         DISCARD_ENTRY(2)                                                \
         ".section .altinstr_replacement, \"ax\", @progbits\n"           \
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 6/7] x86/alt: Drop explicit padding of origin sites
  2018-03-07 15:51 [PATCH v3 0/7] x86/alternatives: Support for automatic padding calculations Andrew Cooper
                   ` (4 preceding siblings ...)
  2018-03-07 15:51 ` [PATCH v3 5/7] x86/alt: Support for automatic padding calculations Andrew Cooper
@ 2018-03-07 15:51 ` Andrew Cooper
  2018-03-07 15:51 ` [PATCH v3 7/7] x86/build: Use new .nops directive when available Andrew Cooper
  6 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2018-03-07 15:51 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

Now that the alternatives infrastructure can calculate the required padding
automatically, there is no need to hard code it.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/x86_64/compat/entry.S  |  2 +-
 xen/arch/x86/x86_64/entry.S         |  2 +-
 xen/include/asm-x86/nops.h          |  7 -------
 xen/include/asm-x86/spec_ctrl_asm.h | 19 ++++++++-----------
 4 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index 84f5eb1..f52bffc 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -134,7 +134,7 @@ ENTRY(compat_restore_all_guest)
         jne   1b
 2:
 .endm
-	ALTERNATIVE_2 ".skip 45, 0x90", \
+	ALTERNATIVE_2 "", \
             alt_cr4_pv32, X86_FEATURE_XEN_SMEP, \
             alt_cr4_pv32, X86_FEATURE_XEN_SMAP
 
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 207c1e0..3824867 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -601,7 +601,7 @@ handle_exception_saved:
         testb $X86_EFLAGS_IF>>8,UREGS_eflags+1(%rsp)
         jz    exception_with_ints_disabled
 
-        ALTERNATIVE_2 "jmp .Lcr4_pv32_done; .skip 2, 0x90", \
+        ALTERNATIVE_2 "jmp .Lcr4_pv32_done", \
             __stringify(mov VCPU_domain(%rbx), %rax), X86_FEATURE_XEN_SMEP, \
             __stringify(mov VCPU_domain(%rbx), %rax), X86_FEATURE_XEN_SMAP
 
diff --git a/xen/include/asm-x86/nops.h b/xen/include/asm-x86/nops.h
index 61319cc..1a46b97 100644
--- a/xen/include/asm-x86/nops.h
+++ b/xen/include/asm-x86/nops.h
@@ -65,13 +65,6 @@
 #define ASM_NOP8 _ASM_MK_NOP(P6_NOP8)
 #define ASM_NOP9 _ASM_MK_NOP(P6_NOP9)
 
-#define ASM_NOP17 ASM_NOP8; ASM_NOP7; ASM_NOP2
-#define ASM_NOP21 ASM_NOP8; ASM_NOP8; ASM_NOP5
-#define ASM_NOP24 ASM_NOP8; ASM_NOP8; ASM_NOP8
-#define ASM_NOP29 ASM_NOP8; ASM_NOP8; ASM_NOP8; ASM_NOP5
-#define ASM_NOP32 ASM_NOP8; ASM_NOP8; ASM_NOP8; ASM_NOP8
-#define ASM_NOP40 ASM_NOP8; ASM_NOP8; ASM_NOP8; ASM_NOP8; ASM_NOP8
-
 #define ASM_NOP_MAX 9
 
 #endif /* __X86_ASM_NOPS_H__ */
diff --git a/xen/include/asm-x86/spec_ctrl_asm.h b/xen/include/asm-x86/spec_ctrl_asm.h
index 1f2b6f3..1623fc0 100644
--- a/xen/include/asm-x86/spec_ctrl_asm.h
+++ b/xen/include/asm-x86/spec_ctrl_asm.h
@@ -216,9 +216,8 @@
 
 /* Use after a VMEXIT from an HVM guest. */
 #define SPEC_CTRL_ENTRY_FROM_VMEXIT                                     \
-    ALTERNATIVE __stringify(ASM_NOP40),                                 \
-        DO_OVERWRITE_RSB, X86_FEATURE_RSB_VMEXIT;                       \
-    ALTERNATIVE_2 __stringify(ASM_NOP32),                               \
+    ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_RSB_VMEXIT;           \
+    ALTERNATIVE_2 "",                                                   \
         __stringify(DO_SPEC_CTRL_ENTRY_FROM_VMEXIT                      \
                     ibrs_val=SPEC_CTRL_IBRS),                           \
         X86_FEATURE_XEN_IBRS_SET,                                       \
@@ -228,9 +227,8 @@
 
 /* Use after an entry from PV context (syscall/sysenter/int80/int82/etc). */
 #define SPEC_CTRL_ENTRY_FROM_PV                                         \
-    ALTERNATIVE __stringify(ASM_NOP40),                                 \
-        DO_OVERWRITE_RSB, X86_FEATURE_RSB_NATIVE;                       \
-    ALTERNATIVE_2 __stringify(ASM_NOP21),                               \
+    ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_RSB_NATIVE;           \
+    ALTERNATIVE_2 "",                                                   \
         __stringify(DO_SPEC_CTRL_ENTRY maybexen=0                       \
                     ibrs_val=SPEC_CTRL_IBRS),                           \
         X86_FEATURE_XEN_IBRS_SET,                                       \
@@ -239,9 +237,8 @@
 
 /* Use in interrupt/exception context.  May interrupt Xen or PV context. */
 #define SPEC_CTRL_ENTRY_FROM_INTR                                       \
-    ALTERNATIVE __stringify(ASM_NOP40),                                 \
-        DO_OVERWRITE_RSB, X86_FEATURE_RSB_NATIVE;                       \
-    ALTERNATIVE_2 __stringify(ASM_NOP29),                               \
+    ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_RSB_NATIVE;           \
+    ALTERNATIVE_2 "",                                                   \
         __stringify(DO_SPEC_CTRL_ENTRY maybexen=1                       \
                     ibrs_val=SPEC_CTRL_IBRS),                           \
         X86_FEATURE_XEN_IBRS_SET,                                       \
@@ -250,13 +247,13 @@
 
 /* Use when exiting to Xen context. */
 #define SPEC_CTRL_EXIT_TO_XEN                                           \
-    ALTERNATIVE_2 __stringify(ASM_NOP17),                               \
+    ALTERNATIVE_2 "",                                                   \
         DO_SPEC_CTRL_EXIT_TO_XEN, X86_FEATURE_XEN_IBRS_SET,             \
         DO_SPEC_CTRL_EXIT_TO_XEN, X86_FEATURE_XEN_IBRS_CLEAR
 
 /* Use when exiting to guest context. */
 #define SPEC_CTRL_EXIT_TO_GUEST                                         \
-    ALTERNATIVE_2 __stringify(ASM_NOP24),                               \
+    ALTERNATIVE_2 "",                                                   \
         DO_SPEC_CTRL_EXIT_TO_GUEST, X86_FEATURE_XEN_IBRS_SET,           \
         DO_SPEC_CTRL_EXIT_TO_GUEST, X86_FEATURE_XEN_IBRS_CLEAR
 
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 7/7] x86/build: Use new .nops directive when available
  2018-03-07 15:51 [PATCH v3 0/7] x86/alternatives: Support for automatic padding calculations Andrew Cooper
                   ` (5 preceding siblings ...)
  2018-03-07 15:51 ` [PATCH v3 6/7] x86/alt: Drop explicit padding of origin sites Andrew Cooper
@ 2018-03-07 15:51 ` Andrew Cooper
  2018-03-08 15:53   ` Jan Beulich
  6 siblings, 1 reply; 15+ messages in thread
From: Andrew Cooper @ 2018-03-07 15:51 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Jan Beulich

Newer versions of binutils are capable of emitting an exact number bytes worth
of optimised nops, which are P6 nops.  Use this in preference to .skip when
available, and skip optimising nops entirely when they correct for the running
hardware.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>

v3:
 * Rewrite toolstack P6 nops when K8 nops are the correct ones for the system.
---
 xen/arch/x86/Rules.mk                 |  4 ++++
 xen/arch/x86/alternative.c            |  3 ++-
 xen/include/asm-x86/alternative-asm.h | 12 +++++++++++-
 xen/include/asm-x86/alternative.h     | 13 +++++++++++--
 4 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
index ac585a3..c84ed20 100644
--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -29,6 +29,10 @@ $(call as-option-add,CFLAGS,CC,"invpcid (%rax)$$(comma)%rax",-DHAVE_AS_INVPCID)
 $(call as-option-add,CFLAGS,CC,\
     ".if ((1 > 0) < 0); .error \"\";.endif",,-DHAVE_AS_NEGATIVE_TRUE)
 
+# Check to see whether the assmbler supports the .nop directive.
+$(call as-option-add,CFLAGS,CC,\
+    ".L1: .L2: .nops (.L2 - .L1)$$(comma)9",-DHAVE_AS_NOP_DIRECTIVE)
+
 CFLAGS += -mno-red-zone -fpic -fno-asynchronous-unwind-tables
 
 # Xen doesn't use SSE interally.  If the compiler supports it, also skip the
diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 93adf56..dc3ef24 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -207,7 +207,8 @@ void init_or_livepatch apply_alternatives(struct alt_instr *start,
             base->priv = 1;
 
             /* Nothing useful to do? */
-            if ( a->pad_len <= 1 )
+            if ( (TOOLCHAIN_P6_NOPS && ideal_nops == p6_nops) ||
+                 a->pad_len <= 1 )
                 continue;
 
             add_nops(buf, a->pad_len);
diff --git a/xen/include/asm-x86/alternative-asm.h b/xen/include/asm-x86/alternative-asm.h
index 0b61516..0d6fb4b 100644
--- a/xen/include/asm-x86/alternative-asm.h
+++ b/xen/include/asm-x86/alternative-asm.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_X86_ALTERNATIVE_ASM_H_
 #define _ASM_X86_ALTERNATIVE_ASM_H_
 
+#include <asm/nops.h>
+
 #ifdef __ASSEMBLY__
 
 /*
@@ -19,6 +21,14 @@
     .byte 0 /* priv */
 .endm
 
+.macro mknops nr_bytes
+#ifdef HAVE_AS_NOP_DIRECTIVE
+    .nops \nr_bytes, ASM_NOP_MAX
+#else
+    .skip \nr_bytes, 0x90
+#endif
+.endm
+
 /* GAS's idea of true is -1, while Clang's idea is 1. */
 #ifdef HAVE_AS_NEGATIVE_TRUE
 # define as_true(x) (-(x))
@@ -29,7 +39,7 @@
 #define decl_orig(insn, padding)                  \
  .L\@_orig_s: insn; .L\@_orig_e:                  \
  .L\@_diff = padding;                             \
- .skip as_true(.L\@_diff > 0) * .L\@_diff, 0x90;  \
+ mknops (as_true(.L\@_diff > 0) * .L\@_diff);     \
  .L\@_orig_p:
 
 #define orig_len               (.L\@_orig_e       -     .L\@_orig_s)
diff --git a/xen/include/asm-x86/alternative.h b/xen/include/asm-x86/alternative.h
index 4803368..7a88c43 100644
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -2,7 +2,6 @@
 #define __X86_ALTERNATIVE_H__
 
 #include <asm/alternative-asm.h>
-#include <asm/nops.h>
 
 #ifndef __ASSEMBLY__
 #include <xen/stringify.h>
@@ -27,6 +26,16 @@ extern void add_nops(void *insns, unsigned int len);
 extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
 extern void alternative_instructions(void);
 
+asm ( ".macro mknops nr_bytes\n\t"
+#ifdef HAVE_AS_NOP_DIRECTIVE
+# define TOOLCHAIN_P6_NOPS 1
+      ".nops \\nr_bytes, " __stringify(ASM_NOP_MAX) "\n\t"
+#else
+# define TOOLCHAIN_P6_NOPS 0
+      ".skip \\nr_bytes, 0x90\n\t"
+#endif
+      ".endm\n\t" );
+
 #define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
 #define alt_pad_len        "(.LXEN%=_orig_p - .LXEN%=_orig_e)"
 #define alt_total_len      "(.LXEN%=_orig_p - .LXEN%=_orig_s)"
@@ -46,7 +55,7 @@ extern void alternative_instructions(void);
 #define OLDINSTR(oldinstr, padding)                              \
     ".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"      \
     ".LXEN%=_diff = " padding "\n\t"                             \
-    ".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
+    "mknops ("AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff)\n\t"    \
     ".LXEN%=_orig_p:\n\t"
 
 #define OLDINSTR_1(oldinstr, n1)                                 \
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 5/7] x86/alt: Support for automatic padding calculations
  2018-03-07 15:51 ` [PATCH v3 5/7] x86/alt: Support for automatic padding calculations Andrew Cooper
@ 2018-03-08 15:42   ` Jan Beulich
  2018-03-08 16:23     ` Andrew Cooper
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2018-03-08 15:42 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Wei Liu, Roger Pau Monné

>>> On 07.03.18 at 16:51, <andrew.cooper3@citrix.com> wrote:
> @@ -175,18 +175,47 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
>       * So be careful if you want to change the scan order to any other
>       * order.
>       */
> -    for ( a = start; a < end; a++ )
> +    for ( a = base = start; a < end; a++ )
>      {
>          uint8_t *orig = ALT_ORIG_PTR(a);
>          uint8_t *repl = ALT_REPL_PTR(a);
>          uint8_t buf[MAX_PATCH_LEN];
> +        unsigned int total_len = a->orig_len + a->pad_len;
>  
> -        BUG_ON(a->repl_len > a->orig_len);
> -        BUG_ON(a->orig_len > sizeof(buf));
> +        BUG_ON(a->repl_len > total_len);
> +        BUG_ON(total_len > sizeof(buf));
>          BUG_ON(a->cpuid >= NCAPINTS * 32);
>  
> +        /*
> +         * Detect sequences of alt_instr's patching the same origin site, and
> +         * keep base pointing at the first alt_instr entry.  This is so we can
> +         * refer to a single ->priv field for patching decisions.
> +         *
> +         * ->priv being nonzero means that the origin site has already been
> +         * modified, and we shouldn't try to optimise the nops again.
> +         */
> +        if ( ALT_ORIG_PTR(base) != orig )
> +            base = a;

I don't understand why you need the new "priv" field - have a
boolean local variable which you reset instead of base here, and
which you check/set instead of base->priv below.

Everything else looks fine to me.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 7/7] x86/build: Use new .nops directive when available
  2018-03-07 15:51 ` [PATCH v3 7/7] x86/build: Use new .nops directive when available Andrew Cooper
@ 2018-03-08 15:53   ` Jan Beulich
  2018-03-08 16:48     ` Andrew Cooper
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2018-03-08 15:53 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Wei Liu, Roger Pau Monné

>>> On 07.03.18 at 16:51, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/Rules.mk
> +++ b/xen/arch/x86/Rules.mk
> @@ -29,6 +29,10 @@ $(call as-option-add,CFLAGS,CC,"invpcid 
> (%rax)$$(comma)%rax",-DHAVE_AS_INVPCID)
>  $(call as-option-add,CFLAGS,CC,\
>      ".if ((1 > 0) < 0); .error \"\";.endif",,-DHAVE_AS_NEGATIVE_TRUE)
>  
> +# Check to see whether the assmbler supports the .nop directive.
> +$(call as-option-add,CFLAGS,CC,\
> +    ".L1: .L2: .nops (.L2 - .L1)$$(comma)9",-DHAVE_AS_NOP_DIRECTIVE)

The construct is right, but the comment still has the old directive name.

> --- a/xen/arch/x86/alternative.c
> +++ b/xen/arch/x86/alternative.c
> @@ -207,7 +207,8 @@ void init_or_livepatch apply_alternatives(struct alt_instr *start,
>              base->priv = 1;
>  
>              /* Nothing useful to do? */
> -            if ( a->pad_len <= 1 )
> +            if ( (TOOLCHAIN_P6_NOPS && ideal_nops == p6_nops) ||
> +                 a->pad_len <= 1 )
>                  continue;

I'm sorry, but no - we can't assume all gas versions going forward
will continue to produce the NOPs we want. They may change at
any time, and we may change our mind at any time. If anything
you'd need to actively check that what their .nops produces
matches what our table has.

Additionally such skipping on the vast majority of hardware is
prone to leave bugs undiscovered for quite some time.

Anyway - I continue to fail to see the value of this patch with the
immediately preceding one already doing all we need.

An alternative might be to have a Kconfig option to suppress the
NOP optimization altogether, and rely on what the tool chain
produces.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 5/7] x86/alt: Support for automatic padding calculations
  2018-03-08 15:42   ` Jan Beulich
@ 2018-03-08 16:23     ` Andrew Cooper
  2018-03-08 16:49       ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Cooper @ 2018-03-08 16:23 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel, Wei Liu, Roger Pau Monné

On 08/03/18 15:42, Jan Beulich wrote:
>>>> On 07.03.18 at 16:51, <andrew.cooper3@citrix.com> wrote:
>> @@ -175,18 +175,47 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
>>       * So be careful if you want to change the scan order to any other
>>       * order.
>>       */
>> -    for ( a = start; a < end; a++ )
>> +    for ( a = base = start; a < end; a++ )
>>      {
>>          uint8_t *orig = ALT_ORIG_PTR(a);
>>          uint8_t *repl = ALT_REPL_PTR(a);
>>          uint8_t buf[MAX_PATCH_LEN];
>> +        unsigned int total_len = a->orig_len + a->pad_len;
>>  
>> -        BUG_ON(a->repl_len > a->orig_len);
>> -        BUG_ON(a->orig_len > sizeof(buf));
>> +        BUG_ON(a->repl_len > total_len);
>> +        BUG_ON(total_len > sizeof(buf));
>>          BUG_ON(a->cpuid >= NCAPINTS * 32);
>>  
>> +        /*
>> +         * Detect sequences of alt_instr's patching the same origin site, and
>> +         * keep base pointing at the first alt_instr entry.  This is so we can
>> +         * refer to a single ->priv field for patching decisions.
>> +         *
>> +         * ->priv being nonzero means that the origin site has already been
>> +         * modified, and we shouldn't try to optimise the nops again.
>> +         */
>> +        if ( ALT_ORIG_PTR(base) != orig )
>> +            base = a;
> I don't understand why you need the new "priv" field - have a
> boolean local variable which you reset instead of base here, and
> which you check/set instead of base->priv below.

That can break in a "corrupted instruction stream" kind of way if we
perform two passes over the same set of alt_instr's, e.g. after loading
microcode, finding some new features, and rerunning alternatives.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 7/7] x86/build: Use new .nops directive when available
  2018-03-08 15:53   ` Jan Beulich
@ 2018-03-08 16:48     ` Andrew Cooper
  2018-03-08 16:56       ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Cooper @ 2018-03-08 16:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel, Wei Liu, Roger Pau Monné

On 08/03/18 15:53, Jan Beulich wrote:
>>>> On 07.03.18 at 16:51, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/Rules.mk
>> +++ b/xen/arch/x86/Rules.mk
>> @@ -29,6 +29,10 @@ $(call as-option-add,CFLAGS,CC,"invpcid 
>> (%rax)$$(comma)%rax",-DHAVE_AS_INVPCID)
>>  $(call as-option-add,CFLAGS,CC,\
>>      ".if ((1 > 0) < 0); .error \"\";.endif",,-DHAVE_AS_NEGATIVE_TRUE)
>>  
>> +# Check to see whether the assmbler supports the .nop directive.
>> +$(call as-option-add,CFLAGS,CC,\
>> +    ".L1: .L2: .nops (.L2 - .L1)$$(comma)9",-DHAVE_AS_NOP_DIRECTIVE)
> The construct is right, but the comment still has the old directive name.
>
>> --- a/xen/arch/x86/alternative.c
>> +++ b/xen/arch/x86/alternative.c
>> @@ -207,7 +207,8 @@ void init_or_livepatch apply_alternatives(struct alt_instr *start,
>>              base->priv = 1;
>>  
>>              /* Nothing useful to do? */
>> -            if ( a->pad_len <= 1 )
>> +            if ( (TOOLCHAIN_P6_NOPS && ideal_nops == p6_nops) ||
>> +                 a->pad_len <= 1 )
>>                  continue;
> I'm sorry, but no - we can't assume all gas versions going forward
> will continue to produce the NOPs we want. They may change at
> any time, and we may change our mind at any time. If anything
> you'd need to actively check that what their .nops produces
> matches what our table has.

Hmm perhaps, but the chances of either of these actually happening are
extremely low.

> Additionally such skipping on the vast majority of hardware is
> prone to leave bugs undiscovered for quite some time.

Such as?  This statement isn't exactly helpful, and

> Anyway - I continue to fail to see the value of this patch with the
> immediately preceding one already doing all we need.

The purpose, as explained before, is to avoid patching whenever possible.

"Whenever possible" is every time we fail a feature check, and the
toolchain puts out the correct nops (which, for a capable toolchain, is
overwhelmingly likely given our 64bit-ness), and has a direct effect on
our boot time safety.

> An alternative might be to have a Kconfig option to suppress the
> NOP optimization altogether, and rely on what the tool chain
> produces.

How and why would a user wish to change this option?  I don't think
anyone would find it useful.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 5/7] x86/alt: Support for automatic padding calculations
  2018-03-08 16:23     ` Andrew Cooper
@ 2018-03-08 16:49       ` Jan Beulich
  2018-03-08 16:52         ` Andrew Cooper
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2018-03-08 16:49 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Wei Liu, Roger Pau Monné

>>> On 08.03.18 at 17:23, <andrew.cooper3@citrix.com> wrote:
> On 08/03/18 15:42, Jan Beulich wrote:
>>>>> On 07.03.18 at 16:51, <andrew.cooper3@citrix.com> wrote:
>>> @@ -175,18 +175,47 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
>>>       * So be careful if you want to change the scan order to any other
>>>       * order.
>>>       */
>>> -    for ( a = start; a < end; a++ )
>>> +    for ( a = base = start; a < end; a++ )
>>>      {
>>>          uint8_t *orig = ALT_ORIG_PTR(a);
>>>          uint8_t *repl = ALT_REPL_PTR(a);
>>>          uint8_t buf[MAX_PATCH_LEN];
>>> +        unsigned int total_len = a->orig_len + a->pad_len;
>>>  
>>> -        BUG_ON(a->repl_len > a->orig_len);
>>> -        BUG_ON(a->orig_len > sizeof(buf));
>>> +        BUG_ON(a->repl_len > total_len);
>>> +        BUG_ON(total_len > sizeof(buf));
>>>          BUG_ON(a->cpuid >= NCAPINTS * 32);
>>>  
>>> +        /*
>>> +         * Detect sequences of alt_instr's patching the same origin site, and
>>> +         * keep base pointing at the first alt_instr entry.  This is so we can
>>> +         * refer to a single ->priv field for patching decisions.
>>> +         *
>>> +         * ->priv being nonzero means that the origin site has already been
>>> +         * modified, and we shouldn't try to optimise the nops again.
>>> +         */
>>> +        if ( ALT_ORIG_PTR(base) != orig )
>>> +            base = a;
>> I don't understand why you need the new "priv" field - have a
>> boolean local variable which you reset instead of base here, and
>> which you check/set instead of base->priv below.
> 
> That can break in a "corrupted instruction stream" kind of way if we
> perform two passes over the same set of alt_instr's, e.g. after loading
> microcode, finding some new features, and rerunning alternatives.

Well, we don't do anything like that today (which first of all would
require all that stuff to come out of .init.*). But yes, if you want
the code be prepared for that, fine with me:

Reviewed-by: Jan Beulich <jbeulich@suse.com>

But it would be nice if you called out that extra aspect in the
description.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 5/7] x86/alt: Support for automatic padding calculations
  2018-03-08 16:49       ` Jan Beulich
@ 2018-03-08 16:52         ` Andrew Cooper
  0 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2018-03-08 16:52 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel, Wei Liu, Roger Pau Monné

On 08/03/18 16:49, Jan Beulich wrote:
>>>> On 08.03.18 at 17:23, <andrew.cooper3@citrix.com> wrote:
>> On 08/03/18 15:42, Jan Beulich wrote:
>>>>>> On 07.03.18 at 16:51, <andrew.cooper3@citrix.com> wrote:
>>>> @@ -175,18 +175,47 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
>>>>       * So be careful if you want to change the scan order to any other
>>>>       * order.
>>>>       */
>>>> -    for ( a = start; a < end; a++ )
>>>> +    for ( a = base = start; a < end; a++ )
>>>>      {
>>>>          uint8_t *orig = ALT_ORIG_PTR(a);
>>>>          uint8_t *repl = ALT_REPL_PTR(a);
>>>>          uint8_t buf[MAX_PATCH_LEN];
>>>> +        unsigned int total_len = a->orig_len + a->pad_len;
>>>>  
>>>> -        BUG_ON(a->repl_len > a->orig_len);
>>>> -        BUG_ON(a->orig_len > sizeof(buf));
>>>> +        BUG_ON(a->repl_len > total_len);
>>>> +        BUG_ON(total_len > sizeof(buf));
>>>>          BUG_ON(a->cpuid >= NCAPINTS * 32);
>>>>  
>>>> +        /*
>>>> +         * Detect sequences of alt_instr's patching the same origin site, and
>>>> +         * keep base pointing at the first alt_instr entry.  This is so we can
>>>> +         * refer to a single ->priv field for patching decisions.
>>>> +         *
>>>> +         * ->priv being nonzero means that the origin site has already been
>>>> +         * modified, and we shouldn't try to optimise the nops again.
>>>> +         */
>>>> +        if ( ALT_ORIG_PTR(base) != orig )
>>>> +            base = a;
>>> I don't understand why you need the new "priv" field - have a
>>> boolean local variable which you reset instead of base here, and
>>> which you check/set instead of base->priv below.
>> That can break in a "corrupted instruction stream" kind of way if we
>> perform two passes over the same set of alt_instr's, e.g. after loading
>> microcode, finding some new features, and rerunning alternatives.
> Well, we don't do anything like that today (which first of all would
> require all that stuff to come out of .init.*). But yes, if you want
> the code be prepared for that, fine with me:
>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> But it would be nice if you called out that extra aspect in the
> description.

Sure - I'll update this comment in the code.

FWIW, I chose this specific example because alternatives handling for
livepatching Spectre mitigations was a particularly sore area for some
people.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 7/7] x86/build: Use new .nops directive when available
  2018-03-08 16:48     ` Andrew Cooper
@ 2018-03-08 16:56       ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2018-03-08 16:56 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Wei Liu, Roger Pau Monné

>>> On 08.03.18 at 17:48, <andrew.cooper3@citrix.com> wrote:
> On 08/03/18 15:53, Jan Beulich wrote:
>>>>> On 07.03.18 at 16:51, <andrew.cooper3@citrix.com> wrote:
>>> --- a/xen/arch/x86/Rules.mk
>>> +++ b/xen/arch/x86/Rules.mk
>>> @@ -29,6 +29,10 @@ $(call as-option-add,CFLAGS,CC,"invpcid 
>>> (%rax)$$(comma)%rax",-DHAVE_AS_INVPCID)
>>>  $(call as-option-add,CFLAGS,CC,\
>>>      ".if ((1 > 0) < 0); .error \"\";.endif",,-DHAVE_AS_NEGATIVE_TRUE)
>>>  
>>> +# Check to see whether the assmbler supports the .nop directive.
>>> +$(call as-option-add,CFLAGS,CC,\
>>> +    ".L1: .L2: .nops (.L2 - .L1)$$(comma)9",-DHAVE_AS_NOP_DIRECTIVE)
>> The construct is right, but the comment still has the old directive name.
>>
>>> --- a/xen/arch/x86/alternative.c
>>> +++ b/xen/arch/x86/alternative.c
>>> @@ -207,7 +207,8 @@ void init_or_livepatch apply_alternatives(struct alt_instr *start,
>>>              base->priv = 1;
>>>  
>>>              /* Nothing useful to do? */
>>> -            if ( a->pad_len <= 1 )
>>> +            if ( (TOOLCHAIN_P6_NOPS && ideal_nops == p6_nops) ||
>>> +                 a->pad_len <= 1 )
>>>                  continue;
>> I'm sorry, but no - we can't assume all gas versions going forward
>> will continue to produce the NOPs we want. They may change at
>> any time, and we may change our mind at any time. If anything
>> you'd need to actively check that what their .nops produces
>> matches what our table has.
> 
> Hmm perhaps, but the chances of either of these actually happening are
> extremely low.
> 
>> Additionally such skipping on the vast majority of hardware is
>> prone to leave bugs undiscovered for quite some time.
> 
> Such as?  This statement isn't exactly helpful, and

Well, the code past the "continue" will effectively only get
compile tested for extended periods of time. Whatever bug it
is that might creep in there. (But then again your sentence
looks unfinished.)

>> Anyway - I continue to fail to see the value of this patch with the
>> immediately preceding one already doing all we need.
> 
> The purpose, as explained before, is to avoid patching whenever possible.
> 
> "Whenever possible" is every time we fail a feature check, and the
> toolchain puts out the correct nops (which, for a capable toolchain, is
> overwhelmingly likely given our 64bit-ness), and has a direct effect on
> our boot time safety.

If our patching has a bug, we'll have to fix it. Whereas (as said
before) the view on what is "correct" may vary over time.

>> An alternative might be to have a Kconfig option to suppress the
>> NOP optimization altogether, and rely on what the tool chain
>> produces.
> 
> How and why would a user wish to change this option?  I don't think
> anyone would find it useful.

If you find such an option useless, then why is this patch useful? I,
for one, would leave the option off, making sure runtime NOP
replacement happens at all times.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-03-08 16:56 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-03-07 15:51 [PATCH v3 0/7] x86/alternatives: Support for automatic padding calculations Andrew Cooper
2018-03-07 15:51 ` [PATCH v3 1/7] x86/alt: Drop unused alternative infrastructure Andrew Cooper
2018-03-07 15:51 ` [PATCH v3 2/7] x86/alt: Clean up struct alt_instr and its users Andrew Cooper
2018-03-07 15:51 ` [PATCH v3 3/7] x86/alt: Clean up the assembly used to generate alternatives Andrew Cooper
2018-03-07 15:51 ` [PATCH v3 4/7] x86/asm: Remove opencoded uses of altinstruction_entry Andrew Cooper
2018-03-07 15:51 ` [PATCH v3 5/7] x86/alt: Support for automatic padding calculations Andrew Cooper
2018-03-08 15:42   ` Jan Beulich
2018-03-08 16:23     ` Andrew Cooper
2018-03-08 16:49       ` Jan Beulich
2018-03-08 16:52         ` Andrew Cooper
2018-03-07 15:51 ` [PATCH v3 6/7] x86/alt: Drop explicit padding of origin sites Andrew Cooper
2018-03-07 15:51 ` [PATCH v3 7/7] x86/build: Use new .nops directive when available Andrew Cooper
2018-03-08 15:53   ` Jan Beulich
2018-03-08 16:48     ` Andrew Cooper
2018-03-08 16:56       ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).