xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] x86/alternatives: Support for automatic padding calculations
@ 2018-02-26 11:24 Andrew Cooper
  2018-02-26 11:24 ` [PATCH v2] x86/alt: Drop unused alternative infrastructure Andrew Cooper
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Andrew Cooper @ 2018-02-26 11:24 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

This is the end result of a lot of work I started during the Spectre/Meltdown
embargo window, and deferred because it was taking too long.  It finally
resolves the explict padding calculations for the SPEC_CTRL alternatives.

This series depends on Roger's "x86/clang: allow integrated assembler usage"
and has been tested with a mainline version of clang.

http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/alternatives-v2

Andrew Cooper (7):
  x86/alt: Drop unused alternative infrastructure
  x86/alt: Clean up struct alt_instr and its users
  x86/alt: Clean up the assembly used to generate alternatives
  x86/asm: Remove opencoded uses of altinstruction_entry
  x86/alt: Support for automatic padding calculations
  x86/alt: Drop explicit padding of origin sites
  x86/build: Use new .nop directive when available

 xen/arch/x86/Rules.mk                 |   8 ++
 xen/arch/x86/alternative.c            |  48 ++++++++---
 xen/arch/x86/x86_64/compat/entry.S    |  26 +++---
 xen/arch/x86/x86_64/entry.S           |  20 +----
 xen/include/asm-x86/alternative-asm.h | 113 +++++++++++++++++++-------
 xen/include/asm-x86/alternative.h     | 148 +++++++++++++++++++---------------
 xen/include/asm-x86/asm_defns.h       |  32 +++-----
 xen/include/asm-x86/nops.h            |   7 --
 xen/include/asm-x86/spec_ctrl_asm.h   |  19 ++---
 9 files changed, 244 insertions(+), 177 deletions(-)

-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2] x86/alt: Drop unused alternative infrastructure
  2018-02-26 11:24 [PATCH v2] x86/alternatives: Support for automatic padding calculations Andrew Cooper
@ 2018-02-26 11:24 ` Andrew Cooper
  2018-02-26 11:24 ` [PATCH v2] x86/alt: Clean up struct alt_instr and its users Andrew Cooper
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2018-02-26 11:24 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

ALTERNATIVE_3 is more complicated than ALTERNATIVE_2 when it comes to
calculating extra padding length, and we have no need for the complexity.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v2:
 * Retain ASM_OUTPUT2()
---
 xen/include/asm-x86/alternative.h | 22 ----------------------
 1 file changed, 22 deletions(-)

diff --git a/xen/include/asm-x86/alternative.h b/xen/include/asm-x86/alternative.h
index ba537d6..325a29f 100644
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -65,11 +65,6 @@ extern void alternative_instructions(void);
 	ALTERNATIVE(oldinstr, newinstr1, feature1)			  \
 	ALTERNATIVE_N(newinstr2, feature2, 2)
 
-#define ALTERNATIVE_3(oldinstr, newinstr1, feature1, newinstr2, feature2, \
-		      newinstr3, feature3)				  \
-	ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
-	ALTERNATIVE_N(newinstr3, feature3, 3)
-
 /*
  * Alternative instructions for different CPU types or capabilities.
  *
@@ -118,23 +113,6 @@ extern void alternative_instructions(void);
 				   newinstr2, feature2)			\
 		     : output : input)
 
-/*
- * This is similar to alternative_io. But it has three features and
- * respective instructions.
- *
- * If CPU has feature3, newinstr3 is used.
- * Otherwise, if CPU has feature2, newinstr2 is used.
- * Otherwise, if CPU has feature1, newinstr1 is used.
- * Otherwise, oldinstr is used.
- */
-#define alternative_io_3(oldinstr, newinstr1, feature1, newinstr2,	\
-			 feature2, newinstr3, feature3, output,		\
-			 input...)					\
-	asm volatile(ALTERNATIVE_3(oldinstr, newinstr1, feature1,	\
-				   newinstr2, feature2, newinstr3,	\
-				   feature3)				\
-		     : output : input)
-
 /* Use this macro(s) if you need more than one output parameter. */
 #define ASM_OUTPUT2(a...) a
 
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2] x86/alt: Clean up struct alt_instr and its users
  2018-02-26 11:24 [PATCH v2] x86/alternatives: Support for automatic padding calculations Andrew Cooper
  2018-02-26 11:24 ` [PATCH v2] x86/alt: Drop unused alternative infrastructure Andrew Cooper
@ 2018-02-26 11:24 ` Andrew Cooper
  2018-02-26 11:24 ` [PATCH v2] x86/alt: Clean up the assembly used to generate alternatives Andrew Cooper
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2018-02-26 11:24 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

 * Rename some fields for consistency and clarity, and use standard types.
 * Don't opencode the use of ALT_{ORIG,REPL}_PTR().

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v2:
 * Change more types to standard ones
---
 xen/arch/x86/alternative.c        | 24 ++++++++++++------------
 xen/include/asm-x86/alternative.h | 14 +++++++-------
 2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 5c8b6f6..51ca53e 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -163,8 +163,6 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
                                           const struct alt_instr *end)
 {
     const struct alt_instr *a;
-    u8 *instr, *replacement;
-    u8 insnbuf[MAX_PATCH_LEN];
 
     printk(KERN_INFO "alt table %p -> %p\n", start, end);
 
@@ -179,23 +177,25 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
      */
     for ( a = start; a < end; a++ )
     {
-        instr = (u8 *)&a->instr_offset + a->instr_offset;
-        replacement = (u8 *)&a->repl_offset + a->repl_offset;
-        BUG_ON(a->replacementlen > a->instrlen);
-        BUG_ON(a->instrlen > sizeof(insnbuf));
+        uint8_t *orig = ALT_ORIG_PTR(a);
+        uint8_t *repl = ALT_REPL_PTR(a);
+        uint8_t buf[MAX_PATCH_LEN];
+
+        BUG_ON(a->repl_len > a->orig_len);
+        BUG_ON(a->orig_len > sizeof(buf));
         BUG_ON(a->cpuid >= NCAPINTS * 32);
+
         if ( !boot_cpu_has(a->cpuid) )
             continue;
 
-        memcpy(insnbuf, replacement, a->replacementlen);
+        memcpy(buf, repl, a->repl_len);
 
         /* 0xe8/0xe9 are relative branches; fix the offset. */
-        if ( a->replacementlen >= 5 && (*insnbuf & 0xfe) == 0xe8 )
-            *(s32 *)(insnbuf + 1) += replacement - instr;
+        if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
+            *(int32_t *)(buf + 1) += repl - orig;
 
-        add_nops(insnbuf + a->replacementlen,
-                 a->instrlen - a->replacementlen);
-        text_poke(instr, insnbuf, a->instrlen);
+        add_nops(buf + a->repl_len, a->orig_len - a->repl_len);
+        text_poke(orig, buf, a->orig_len);
     }
 }
 
diff --git a/xen/include/asm-x86/alternative.h b/xen/include/asm-x86/alternative.h
index 325a29f..d9706fd 100644
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -9,15 +9,15 @@
 #include <xen/types.h>
 
 struct alt_instr {
-    s32 instr_offset;       /* original instruction */
-    s32 repl_offset;        /* offset to replacement instruction */
-    u16 cpuid;              /* cpuid bit set for replacement */
-    u8  instrlen;           /* length of original instruction */
-    u8  replacementlen;     /* length of new instruction, <= instrlen */
+    int32_t  orig_offset;   /* original instruction */
+    int32_t  repl_offset;   /* offset to replacement instruction */
+    uint16_t cpuid;         /* cpuid bit set for replacement */
+    uint8_t  orig_len;      /* length of original instruction */
+    uint8_t  repl_len;      /* length of new instruction, <= instrlen */
 };
 
-#define __ALT_PTR(a,f)      ((u8 *)((void *)&(a)->f + (a)->f))
-#define ALT_ORIG_PTR(a)     __ALT_PTR(a, instr_offset)
+#define __ALT_PTR(a,f)      ((uint8_t *)((void *)&(a)->f + (a)->f))
+#define ALT_ORIG_PTR(a)     __ALT_PTR(a, orig_offset)
 #define ALT_REPL_PTR(a)     __ALT_PTR(a, repl_offset)
 
 extern void add_nops(void *insns, unsigned int len);
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2] x86/alt: Clean up the assembly used to generate alternatives
  2018-02-26 11:24 [PATCH v2] x86/alternatives: Support for automatic padding calculations Andrew Cooper
  2018-02-26 11:24 ` [PATCH v2] x86/alt: Drop unused alternative infrastructure Andrew Cooper
  2018-02-26 11:24 ` [PATCH v2] x86/alt: Clean up struct alt_instr and its users Andrew Cooper
@ 2018-02-26 11:24 ` Andrew Cooper
  2018-02-26 11:24 ` [PATCH v2] x86/asm: Remove opencoded uses of altinstruction_entry Andrew Cooper
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2018-02-26 11:24 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Jan Beulich

 * On the C side, switch to using local lables rather than hardcoded numbers.
 * Rename parameters and lables to be consistent with alt_instr names, and
   consistent between the the C and asm versions.
 * On the asm side, factor some expressions out into macros to aid clarity.
 * Consistently declare section attributes.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>

v2:
 * repl_{s,e} => alt_repl_{s,e}
 * Use .LXEN%= prefix for the C lables
---
 xen/include/asm-x86/alternative-asm.h | 57 ++++++++++++++++--------------
 xen/include/asm-x86/alternative.h     | 65 +++++++++++++++++++----------------
 2 files changed, 67 insertions(+), 55 deletions(-)

diff --git a/xen/include/asm-x86/alternative-asm.h b/xen/include/asm-x86/alternative-asm.h
index 6640e85..150bd1a 100644
--- a/xen/include/asm-x86/alternative-asm.h
+++ b/xen/include/asm-x86/alternative-asm.h
@@ -9,60 +9,67 @@
  * enough information for the alternatives patching code to patch an
  * instruction. See apply_alternatives().
  */
-.macro altinstruction_entry orig alt feature orig_len alt_len
+.macro altinstruction_entry orig repl feature orig_len repl_len
     .long \orig - .
-    .long \alt - .
+    .long \repl - .
     .word \feature
     .byte \orig_len
-    .byte \alt_len
+    .byte \repl_len
 .endm
 
+#define orig_len               (.L\@_orig_e       -     .L\@_orig_s)
+#define repl_len(nr)           (.L\@_repl_e\()nr  -     .L\@_repl_s\()nr)
+#define decl_repl(insn, nr)     .L\@_repl_s\()nr: insn; .L\@_repl_e\()nr:
+
 .macro ALTERNATIVE oldinstr, newinstr, feature
-.Lold_start_\@:
+.L\@_orig_s:
     \oldinstr
-.Lold_end_\@:
+.L\@_orig_e:
 
     .pushsection .altinstructions, "a", @progbits
-    altinstruction_entry .Lold_start_\@, .Lnew_start_\@, \feature, \
-        (.Lold_end_\@ - .Lold_start_\@), (.Lnew_end_\@ - .Lnew_start_\@)
+    altinstruction_entry .L\@_orig_s, .L\@_repl_s1, \feature, \
+        orig_len, repl_len(1)
 
     .section .discard, "a", @progbits
     /* Assembler-time check that \newinstr isn't longer than \oldinstr. */
-    .byte 0xff + (.Lnew_end_\@ - .Lnew_start_\@) - (.Lold_end_\@ - .Lold_start_\@)
+    .byte 0xff + repl_len(1) - orig_len
 
     .section .altinstr_replacement, "ax", @progbits
-.Lnew_start_\@:
-    \newinstr
-.Lnew_end_\@:
+
+    decl_repl(\newinstr, 1)
+
     .popsection
 .endm
 
 .macro ALTERNATIVE_2 oldinstr, newinstr1, feature1, newinstr2, feature2
-.Lold_start_\@:
+.L\@_orig_s:
     \oldinstr
-.Lold_end_\@:
+.L\@_orig_e:
 
     .pushsection .altinstructions, "a", @progbits
-    altinstruction_entry .Lold_start_\@, .Lnew1_start_\@, \feature1, \
-        (.Lold_end_\@ - .Lold_start_\@), (.Lnew1_end_\@ - .Lnew1_start_\@)
-    altinstruction_entry .Lold_start_\@, .Lnew2_start_\@, \feature2, \
-        (.Lold_end_\@ - .Lold_start_\@), (.Lnew2_end_\@ - .Lnew2_start_\@)
+
+    altinstruction_entry .L\@_orig_s, .L\@_repl_s1, \feature1, \
+        orig_len, repl_len(1)
+    altinstruction_entry .L\@_orig_s, .L\@_repl_s2, \feature2, \
+        orig_len, repl_len(2)
 
     .section .discard, "a", @progbits
     /* Assembler-time check that \newinstr{1,2} aren't longer than \oldinstr. */
-    .byte 0xff + (.Lnew1_end_\@ - .Lnew1_start_\@) - (.Lold_end_\@ - .Lold_start_\@)
-    .byte 0xff + (.Lnew2_end_\@ - .Lnew2_start_\@) - (.Lold_end_\@ - .Lold_start_\@)
+    .byte 0xff + repl_len(1) - orig_len
+    .byte 0xff + repl_len(2) - orig_len
 
     .section .altinstr_replacement, "ax", @progbits
-.Lnew1_start_\@:
-    \newinstr1
-.Lnew1_end_\@:
-.Lnew2_start_\@:
-    \newinstr2
-.Lnew2_end_\@:
+
+    decl_repl(\newinstr1, 1)
+    decl_repl(\newinstr2, 2)
+
     .popsection
 .endm
 
+#undef decl_repl
+#undef repl_len
+#undef orig_len
+
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_ALTERNATIVE_ASM_H_ */
 
diff --git a/xen/include/asm-x86/alternative.h b/xen/include/asm-x86/alternative.h
index d9706fd..bcad3ee 100644
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -26,44 +26,49 @@ extern void apply_alternatives(const struct alt_instr *start,
                                const struct alt_instr *end);
 extern void alternative_instructions(void);
 
-#define OLDINSTR(oldinstr)      "661:\n\t" oldinstr "\n662:\n"
+#define OLDINSTR(oldinstr) ".LXEN%=_orig_s:\n\t" oldinstr "\n.LXEN%=_orig_e:\n"
 
-#define b_replacement(number)   "663"#number
-#define e_replacement(number)   "664"#number
+#define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
+#define alt_repl_s(num)    ".LXEN%=_repl_s"#num
+#define alt_repl_e(num)    ".LXEN%=_repl_e"#num
+#define alt_repl_len(num)  "(" alt_repl_e(num) " - " alt_repl_s(num) ")"
 
-#define alt_slen "662b-661b"
-#define alt_rlen(number) e_replacement(number)"f-"b_replacement(number)"f"
+#define ALTINSTR_ENTRY(feature, num)                                    \
+        " .long .LXEN%=_orig_s - .\n"             /* label           */ \
+        " .long " alt_repl_s(num)" - .\n"         /* new instruction */ \
+        " .word " __stringify(feature) "\n"       /* feature bit     */ \
+        " .byte " alt_orig_len "\n"               /* source len      */ \
+        " .byte " alt_repl_len(num) "\n"          /* replacement len */
 
-#define ALTINSTR_ENTRY(feature, number)                                       \
-        " .long 661b - .\n"                             /* label           */ \
-        " .long " b_replacement(number)"f - .\n"        /* new instruction */ \
-        " .word " __stringify(feature) "\n"             /* feature bit     */ \
-        " .byte " alt_slen "\n"                         /* source len      */ \
-        " .byte " alt_rlen(number) "\n"                 /* replacement len */
+#define DISCARD_ENTRY(num)                        /* repl <= orig */    \
+        " .byte 0xff + (" alt_repl_len(num) ") - (" alt_orig_len ")\n"
 
-#define DISCARD_ENTRY(number)                           /* rlen <= slen */    \
-        " .byte 0xff + (" alt_rlen(number) ") - (" alt_slen ")\n"
-
-#define ALTINSTR_REPLACEMENT(newinstr, feature, number) /* replacement */     \
-        b_replacement(number)":\n\t" newinstr "\n" e_replacement(number) ":\n\t"
-
-#define ALTERNATIVE_N(newinstr, feature, number)	\
-	".pushsection .altinstructions,\"a\"\n"		\
-	ALTINSTR_ENTRY(feature, number)			\
-	".section .discard,\"a\",@progbits\n"		\
-	DISCARD_ENTRY(number)				\
-	".section .altinstr_replacement, \"ax\"\n"	\
-	ALTINSTR_REPLACEMENT(newinstr, feature, number)	\
-	".popsection\n"
+#define ALTINSTR_REPLACEMENT(newinstr, num)       /* replacement */     \
+        alt_repl_s(num)":\n\t" newinstr "\n" alt_repl_e(num) ":\n\t"
 
 /* alternative assembly primitive: */
-#define ALTERNATIVE(oldinstr, newinstr, feature)			  \
-	OLDINSTR(oldinstr)						  \
-	ALTERNATIVE_N(newinstr, feature, 1)
+#define ALTERNATIVE(oldinstr, newinstr, feature)                        \
+        OLDINSTR(oldinstr)                                              \
+        ".pushsection .altinstructions, \"a\", @progbits\n"             \
+        ALTINSTR_ENTRY(feature, 1)                                      \
+        ".section .discard, \"a\", @progbits\n"                         \
+        DISCARD_ENTRY(1)                                                \
+        ".section .altinstr_replacement, \"ax\", @progbits\n"           \
+        ALTINSTR_REPLACEMENT(newinstr, 1)                               \
+        ".popsection\n"
 
 #define ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
-	ALTERNATIVE(oldinstr, newinstr1, feature1)			  \
-	ALTERNATIVE_N(newinstr2, feature2, 2)
+        OLDINSTR(oldinstr)                                              \
+        ".pushsection .altinstructions, \"a\", @progbits\n"             \
+        ALTINSTR_ENTRY(feature1, 1)                                     \
+        ALTINSTR_ENTRY(feature2, 2)                                     \
+        ".section .discard, \"a\", @progbits\n"                         \
+        DISCARD_ENTRY(1)                                                \
+        DISCARD_ENTRY(2)                                                \
+        ".section .altinstr_replacement, \"ax\", @progbits\n"           \
+        ALTINSTR_REPLACEMENT(newinstr1, 1)                              \
+        ALTINSTR_REPLACEMENT(newinstr2, 2)                              \
+        ".popsection\n"
 
 /*
  * Alternative instructions for different CPU types or capabilities.
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2] x86/asm: Remove opencoded uses of altinstruction_entry
  2018-02-26 11:24 [PATCH v2] x86/alternatives: Support for automatic padding calculations Andrew Cooper
                   ` (2 preceding siblings ...)
  2018-02-26 11:24 ` [PATCH v2] x86/alt: Clean up the assembly used to generate alternatives Andrew Cooper
@ 2018-02-26 11:24 ` Andrew Cooper
  2018-02-26 11:24 ` [PATCH v2] x86/alt: Support for automatic padding calculations Andrew Cooper
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2018-02-26 11:24 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

With future changes, altinstruction_entry is going to become more complicated
to use.  Furthermore, there are already ALTERNATIVE* macros which can be used
to avoid opencoding the creation of replacement information.

For ASM_STAC, ASM_CLAC and CR4_PV32_RESTORE, this means the removal of all
hardocded label numbers.  For the cr4_pv32 alternatives, this means hardcoding
the extra space required in the original patch site, but the hardcoding will
be removed by a later patch.

No change to any functionality, but the handling of nops inside the original
patch sites are a bit different.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/x86_64/compat/entry.S | 26 +++++++++-----------------
 xen/arch/x86/x86_64/entry.S        | 20 +++-----------------
 xen/include/asm-x86/asm_defns.h    | 32 +++++++++++---------------------
 3 files changed, 23 insertions(+), 55 deletions(-)

diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index 458d810..8aba269 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -111,13 +111,10 @@ ENTRY(compat_restore_all_guest)
         ASSERT_INTERRUPTS_DISABLED
         mov   $~(X86_EFLAGS_IOPL|X86_EFLAGS_NT|X86_EFLAGS_VM),%r11d
         and   UREGS_eflags(%rsp),%r11d
-.Lcr4_orig:
-        .skip .Lcr4_alt_end - .Lcr4_alt, 0x90
-.Lcr4_orig_end:
-        .pushsection .altinstr_replacement, "ax"
-.Lcr4_alt:
+
+.macro alt_cr4_pv32
         testb $3,UREGS_cs(%rsp)
-        jpe   .Lcr4_alt_end
+        jpe   2f
         mov   CPUINFO_cr4-CPUINFO_guest_cpu_user_regs(%rsp), %rax
         and   $~XEN_CR4_PV32_BITS, %rax
 1:
@@ -135,17 +132,12 @@ ENTRY(compat_restore_all_guest)
          */
         cmp   %rax, CPUINFO_cr4-CPUINFO_guest_cpu_user_regs(%rsp)
         jne   1b
-.Lcr4_alt_end:
-        .section .altinstructions, "a"
-        altinstruction_entry .Lcr4_orig, .Lcr4_orig, X86_FEATURE_ALWAYS, \
-                             (.Lcr4_orig_end - .Lcr4_orig), 0
-        altinstruction_entry .Lcr4_orig, .Lcr4_alt, X86_FEATURE_XEN_SMEP, \
-                             (.Lcr4_orig_end - .Lcr4_orig), \
-                             (.Lcr4_alt_end - .Lcr4_alt)
-        altinstruction_entry .Lcr4_orig, .Lcr4_alt, X86_FEATURE_XEN_SMAP, \
-                             (.Lcr4_orig_end - .Lcr4_orig), \
-                             (.Lcr4_alt_end - .Lcr4_alt)
-        .popsection
+2:
+.endm
+	ALTERNATIVE_2 ".skip 45, 0x90", \
+            alt_cr4_pv32, X86_FEATURE_XEN_SMEP, \
+            alt_cr4_pv32, X86_FEATURE_XEN_SMAP
+
         or    $X86_EFLAGS_IF,%r11
         mov   %r11d,UREGS_eflags(%rsp)
 
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 941f06f..e939f20 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -564,23 +564,9 @@ handle_exception_saved:
         testb $X86_EFLAGS_IF>>8,UREGS_eflags+1(%rsp)
         jz    exception_with_ints_disabled
 
-.Lcr4_pv32_orig:
-        jmp   .Lcr4_pv32_done
-        .skip (.Lcr4_pv32_alt_end - .Lcr4_pv32_alt) - (. - .Lcr4_pv32_orig), 0xcc
-        .pushsection .altinstr_replacement, "ax"
-.Lcr4_pv32_alt:
-        mov   VCPU_domain(%rbx),%rax
-.Lcr4_pv32_alt_end:
-        .section .altinstructions, "a"
-        altinstruction_entry .Lcr4_pv32_orig, .Lcr4_pv32_alt, \
-                             X86_FEATURE_XEN_SMEP, \
-                             (.Lcr4_pv32_alt_end - .Lcr4_pv32_alt), \
-                             (.Lcr4_pv32_alt_end - .Lcr4_pv32_alt)
-        altinstruction_entry .Lcr4_pv32_orig, .Lcr4_pv32_alt, \
-                             X86_FEATURE_XEN_SMAP, \
-                             (.Lcr4_pv32_alt_end - .Lcr4_pv32_alt), \
-                             (.Lcr4_pv32_alt_end - .Lcr4_pv32_alt)
-        .popsection
+        ALTERNATIVE_2 "jmp .Lcr4_pv32_done; .skip 2, 0x90", \
+            __stringify(mov VCPU_domain(%rbx), %rax), X86_FEATURE_XEN_SMEP, \
+            __stringify(mov VCPU_domain(%rbx), %rax), X86_FEATURE_XEN_SMAP
 
         testb $3,UREGS_cs(%rsp)
         jz    .Lcr4_pv32_done
diff --git a/xen/include/asm-x86/asm_defns.h b/xen/include/asm-x86/asm_defns.h
index ebd2c88..a484265 100644
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -195,18 +195,13 @@ void ret_from_intr(void);
 #define __ASM_STAC      .byte 0x0f,0x01,0xcb
 
 #ifdef __ASSEMBLY__
-#define ASM_AC(op)                                                     \
-        661: ASM_NOP3;                                                 \
-        .pushsection .altinstr_replacement, "ax";                      \
-        662: __ASM_##op;                                               \
-        .popsection;                                                   \
-        .pushsection .altinstructions, "a";                            \
-        altinstruction_entry 661b, 661b, X86_FEATURE_ALWAYS, 3, 0;     \
-        altinstruction_entry 661b, 662b, X86_FEATURE_XEN_SMAP, 3, 3;       \
-        .popsection
-
-#define ASM_STAC ASM_AC(STAC)
-#define ASM_CLAC ASM_AC(CLAC)
+#define ASM_STAC                                        \
+    ALTERNATIVE __stringify(ASM_NOP3),                  \
+        __stringify(__ASM_STAC), X86_FEATURE_XEN_SMAP
+
+#define ASM_CLAC                                        \
+    ALTERNATIVE __stringify(ASM_NOP3),                  \
+        __stringify(__ASM_CLAC), X86_FEATURE_XEN_SMAP
 
 .macro write_cr3 val:req, tmp1:req, tmp2:req
         mov   %cr4, %\tmp1
@@ -217,15 +212,10 @@ void ret_from_intr(void);
         mov   %\tmp2, %cr4
 .endm
 
-#define CR4_PV32_RESTORE                                           \
-        667: ASM_NOP5;                                             \
-        .pushsection .altinstr_replacement, "ax";                  \
-        668: call cr4_pv32_restore;                                \
-        .section .altinstructions, "a";                            \
-        altinstruction_entry 667b, 667b, X86_FEATURE_ALWAYS, 5, 0; \
-        altinstruction_entry 667b, 668b, X86_FEATURE_XEN_SMEP, 5, 5;   \
-        altinstruction_entry 667b, 668b, X86_FEATURE_XEN_SMAP, 5, 5;   \
-        .popsection
+#define CR4_PV32_RESTORE                                \
+    ALTERNATIVE_2 __stringify(ASM_NOP5),                \
+        "call cr4_pv32_restore", X86_FEATURE_XEN_SMEP,  \
+        "call cr4_pv32_restore", X86_FEATURE_XEN_SMAP
 
 #else
 static always_inline void clac(void)
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2] x86/alt: Support for automatic padding calculations
  2018-02-26 11:24 [PATCH v2] x86/alternatives: Support for automatic padding calculations Andrew Cooper
                   ` (3 preceding siblings ...)
  2018-02-26 11:24 ` [PATCH v2] x86/asm: Remove opencoded uses of altinstruction_entry Andrew Cooper
@ 2018-02-26 11:24 ` Andrew Cooper
  2018-02-26 12:21   ` Roger Pau Monné
  2018-02-26 11:24 ` [PATCH v2] x86/alt: Drop explicit padding of origin sites Andrew Cooper
  2018-02-26 11:24 ` [PATCH v2] x86/build: Use new .nop directive when available Andrew Cooper
  6 siblings, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2018-02-26 11:24 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Jan Beulich

The correct amount of padding in an origin patch site can be calculated
automatically, based on the relative lengths of the replacements.

This requires a bit of trickery to calculate correctly, especially in the
ALTENRATIVE_2 case where a branchless max() calculation in needed.  The
calculation is further complicated because GAS's idea of true is -1 rather
than 1, which is why the extra negations are required.

Additionally, have apply_alternatives() attempt to optimise the padding nops.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>

v2: Fix build with Clang.
---
 xen/arch/x86/Rules.mk                 |  4 +++
 xen/arch/x86/alternative.c            | 32 ++++++++++++++++---
 xen/include/asm-x86/alternative-asm.h | 60 +++++++++++++++++++++++++++++------
 xen/include/asm-x86/alternative.h     | 46 +++++++++++++++++++++------
 4 files changed, 120 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
index 9897dea..e169d67 100644
--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -24,6 +24,10 @@ $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1", \
                      -U__OBJECT_LABEL__ -DHAVE_GAS_QUOTED_SYM \
                      '-D__OBJECT_LABEL__=$(subst $(BASEDIR)/,,$(CURDIR))/$$@')
 
+# GCC's idea of true is -1.  Clang's idea is 1
+$(call as-option-add,CFLAGS,CC,\
+    ".if ((1 > 0) < 0); .error \"\";.endif",,-DHAVE_AS_NEGATIVE_TRUE)
+
 CFLAGS += -mno-red-zone -fpic -fno-asynchronous-unwind-tables
 
 # Xen doesn't use SSE interally.  If the compiler supports it, also skip the
diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 51ca53e..e24db84 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -180,13 +180,37 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
         uint8_t *orig = ALT_ORIG_PTR(a);
         uint8_t *repl = ALT_REPL_PTR(a);
         uint8_t buf[MAX_PATCH_LEN];
+        unsigned int total_len = a->orig_len + a->pad_len;
 
-        BUG_ON(a->repl_len > a->orig_len);
-        BUG_ON(a->orig_len > sizeof(buf));
+        BUG_ON(a->repl_len > total_len);
+        BUG_ON(total_len > sizeof(buf));
         BUG_ON(a->cpuid >= NCAPINTS * 32);
 
+        /* No replacement to make, but try to optimise any padding. */
         if ( !boot_cpu_has(a->cpuid) )
+        {
+            unsigned int i;
+
+            if ( a->pad_len <= 1 )
+                continue;
+
+            /* Search the padding area for any byte which isn't a nop. */
+            for ( i = a->orig_len; i < total_len; ++i )
+                if ( orig[i] != 0x90 )
+                    break;
+
+            /*
+             * Only make any changes if all padding bytes are unoptimised
+             * nops.  With multiple alternatives over the same origin site, we
+             * may have already made a replacement, or optimised the nops.
+             */
+            if ( i != total_len )
+                continue;
+
+            add_nops(buf, a->pad_len);
+            text_poke(orig + a->orig_len, buf, a->pad_len);
             continue;
+        }
 
         memcpy(buf, repl, a->repl_len);
 
@@ -194,8 +218,8 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
         if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
             *(int32_t *)(buf + 1) += repl - orig;
 
-        add_nops(buf + a->repl_len, a->orig_len - a->repl_len);
-        text_poke(orig, buf, a->orig_len);
+        add_nops(buf + a->repl_len, total_len - a->repl_len);
+        text_poke(orig, buf, total_len);
     }
 }
 
diff --git a/xen/include/asm-x86/alternative-asm.h b/xen/include/asm-x86/alternative-asm.h
index 150bd1a..25f79fe 100644
--- a/xen/include/asm-x86/alternative-asm.h
+++ b/xen/include/asm-x86/alternative-asm.h
@@ -9,30 +9,55 @@
  * enough information for the alternatives patching code to patch an
  * instruction. See apply_alternatives().
  */
-.macro altinstruction_entry orig repl feature orig_len repl_len
+.macro altinstruction_entry orig repl feature orig_len repl_len pad_len
     .long \orig - .
     .long \repl - .
     .word \feature
     .byte \orig_len
     .byte \repl_len
+    .byte \pad_len
 .endm
 
 #define orig_len               (.L\@_orig_e       -     .L\@_orig_s)
+#define pad_len                (.L\@_orig_p       -     .L\@_orig_e)
+#define total_len              (.L\@_orig_p       -     .L\@_orig_s)
 #define repl_len(nr)           (.L\@_repl_e\()nr  -     .L\@_repl_s\()nr)
 #define decl_repl(insn, nr)     .L\@_repl_s\()nr: insn; .L\@_repl_e\()nr:
 
+/* GCC's idea of true is -1, while Clang's idea is 1. */
+#ifdef HAVE_AS_NEGATIVE_TRUE
+# define as_true(x) (-(x))
+#else
+# define as_true(x) (x)
+#endif
+
+#define as_max(a, b)           ((a) ^ (((a) ^ (b)) & -as_true((a) < (b))))
+
 .macro ALTERNATIVE oldinstr, newinstr, feature
 .L\@_orig_s:
     \oldinstr
 .L\@_orig_e:
+    /*
+     * Calculate the difference in size between the replacement and original
+     * instructions, to derive how much padding to introduce.
+     */
+    .L\@_diff = repl_len(1) - orig_len
+
+    .skip as_true(.L\@_diff > 0) * .L\@_diff, 0x90
+.L\@_orig_p:
 
     .pushsection .altinstructions, "a", @progbits
     altinstruction_entry .L\@_orig_s, .L\@_repl_s1, \feature, \
-        orig_len, repl_len(1)
+        orig_len, repl_len(1), pad_len
 
     .section .discard, "a", @progbits
-    /* Assembler-time check that \newinstr isn't longer than \oldinstr. */
-    .byte 0xff + repl_len(1) - orig_len
+    /*
+     * Assembler-time checks:
+     *   - total_len <= 255
+     *   - \newinstr <= total_len
+     */
+    .byte total_len
+    .byte 0xff + repl_len(1) - total_len
 
     .section .altinstr_replacement, "ax", @progbits
 
@@ -45,18 +70,31 @@
 .L\@_orig_s:
     \oldinstr
 .L\@_orig_e:
+    /*
+     * Calculate the difference in size between the largest replacement and
+     * the original instructions, to derive how much padding to introduce.
+     */
+    .L\@_diff = as_max(repl_len(1), repl_len(2)) - orig_len
+
+     .skip as_true(.L\@_diff > 0) * .L\@_diff, 0x90
+.L\@_orig_p:
 
     .pushsection .altinstructions, "a", @progbits
 
     altinstruction_entry .L\@_orig_s, .L\@_repl_s1, \feature1, \
-        orig_len, repl_len(1)
+        orig_len, repl_len(1), pad_len
     altinstruction_entry .L\@_orig_s, .L\@_repl_s2, \feature2, \
-        orig_len, repl_len(2)
+        orig_len, repl_len(2), pad_len
 
     .section .discard, "a", @progbits
-    /* Assembler-time check that \newinstr{1,2} aren't longer than \oldinstr. */
-    .byte 0xff + repl_len(1) - orig_len
-    .byte 0xff + repl_len(2) - orig_len
+    /*
+     * Assembler-time checks:
+     *   - total_len <= 255
+     *   - \newinstr* <= total_len
+     */
+    .byte total_len
+    .byte 0xff + repl_len(1) - total_len
+    .byte 0xff + repl_len(2) - total_len
 
     .section .altinstr_replacement, "ax", @progbits
 
@@ -66,8 +104,12 @@
     .popsection
 .endm
 
+#undef as_max
+#undef as_true
 #undef decl_repl
 #undef repl_len
+#undef total_len
+#undef pad_len
 #undef orig_len
 
 #endif /* __ASSEMBLY__ */
diff --git a/xen/include/asm-x86/alternative.h b/xen/include/asm-x86/alternative.h
index bcad3ee..d53cea0 100644
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -8,12 +8,13 @@
 #include <xen/stringify.h>
 #include <xen/types.h>
 
-struct alt_instr {
+struct __packed alt_instr {
     int32_t  orig_offset;   /* original instruction */
     int32_t  repl_offset;   /* offset to replacement instruction */
     uint16_t cpuid;         /* cpuid bit set for replacement */
     uint8_t  orig_len;      /* length of original instruction */
-    uint8_t  repl_len;      /* length of new instruction, <= instrlen */
+    uint8_t  repl_len;      /* length of new instruction */
+    uint8_t  pad_len;       /* length of build-time padding */
 };
 
 #define __ALT_PTR(a,f)      ((uint8_t *)((void *)&(a)->f + (a)->f))
@@ -26,43 +27,70 @@ extern void apply_alternatives(const struct alt_instr *start,
                                const struct alt_instr *end);
 extern void alternative_instructions(void);
 
-#define OLDINSTR(oldinstr) ".LXEN%=_orig_s:\n\t" oldinstr "\n.LXEN%=_orig_e:\n"
-
 #define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
+#define alt_pad_len        "(.LXEN%=_orig_p - .LXEN%=_orig_e)"
+#define alt_total_len      "(.LXEN%=_orig_p - .LXEN%=_orig_s)"
 #define alt_repl_s(num)    ".LXEN%=_repl_s"#num
 #define alt_repl_e(num)    ".LXEN%=_repl_e"#num
 #define alt_repl_len(num)  "(" alt_repl_e(num) " - " alt_repl_s(num) ")"
 
+/* GCC's idea of true is -1, while Clang's idea is 1. */
+#ifdef HAVE_AS_NEGATIVE_TRUE
+# define AS_TRUE "-"
+#else
+# define AS_TRUE ""
+#endif
+
+#define as_max(a, b) "(("a") ^ ((("a") ^ ("b")) & -("AS_TRUE"(("a") < ("b")))))"
+
+#define OLDINSTR_1(oldinstr, n1)                                 \
+    ".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"      \
+    ".LXEN%=_diff = "alt_repl_len(n1)"-"alt_orig_len"\n\t"       \
+    ".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
+    ".LXEN%=_orig_p:\n\t"
+
+#define ALT_PADDING_LEN(n1, n2) \
+    as_max((alt_repl_len(n1), alt_repl_len(n2))"-"alt_orig_len
+
+#define OLDINSTR_2(oldinstr, n1, n2)                             \
+    ".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"      \
+    ".LXEN%=_diff = "ALT_PADDING_LEN(n1, n2)"\n\t"               \
+    ".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
+    ".LXEN%=_orig_p:\n\t"
+
 #define ALTINSTR_ENTRY(feature, num)                                    \
         " .long .LXEN%=_orig_s - .\n"             /* label           */ \
         " .long " alt_repl_s(num)" - .\n"         /* new instruction */ \
         " .word " __stringify(feature) "\n"       /* feature bit     */ \
         " .byte " alt_orig_len "\n"               /* source len      */ \
-        " .byte " alt_repl_len(num) "\n"          /* replacement len */
+        " .byte " alt_repl_len(num) "\n"          /* replacement len */ \
+        " .byte " alt_pad_len "\n"                /* padding len     */
 
-#define DISCARD_ENTRY(num)                        /* repl <= orig */    \
-        " .byte 0xff + (" alt_repl_len(num) ") - (" alt_orig_len ")\n"
+#define DISCARD_ENTRY(num)                        /* repl <= total */   \
+        " .byte 0xff + (" alt_repl_len(num) ") - (" alt_total_len ")\n"
 
 #define ALTINSTR_REPLACEMENT(newinstr, num)       /* replacement */     \
         alt_repl_s(num)":\n\t" newinstr "\n" alt_repl_e(num) ":\n\t"
 
 /* alternative assembly primitive: */
 #define ALTERNATIVE(oldinstr, newinstr, feature)                        \
-        OLDINSTR(oldinstr)                                              \
+        OLDINSTR_1(oldinstr, 1)                                         \
         ".pushsection .altinstructions, \"a\", @progbits\n"             \
         ALTINSTR_ENTRY(feature, 1)                                      \
         ".section .discard, \"a\", @progbits\n"                         \
+        ".byte " alt_total_len "\n" /* total_len <= 255 */              \
         DISCARD_ENTRY(1)                                                \
         ".section .altinstr_replacement, \"ax\", @progbits\n"           \
         ALTINSTR_REPLACEMENT(newinstr, 1)                               \
         ".popsection\n"
 
 #define ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
-        OLDINSTR(oldinstr)                                              \
+        OLDINSTR_2(oldinstr, 1, 2)                                      \
         ".pushsection .altinstructions, \"a\", @progbits\n"             \
         ALTINSTR_ENTRY(feature1, 1)                                     \
         ALTINSTR_ENTRY(feature2, 2)                                     \
         ".section .discard, \"a\", @progbits\n"                         \
+        ".byte " alt_total_len "\n" /* total_len <= 255 */              \
         DISCARD_ENTRY(1)                                                \
         DISCARD_ENTRY(2)                                                \
         ".section .altinstr_replacement, \"ax\", @progbits\n"           \
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2] x86/alt: Drop explicit padding of origin sites
  2018-02-26 11:24 [PATCH v2] x86/alternatives: Support for automatic padding calculations Andrew Cooper
                   ` (4 preceding siblings ...)
  2018-02-26 11:24 ` [PATCH v2] x86/alt: Support for automatic padding calculations Andrew Cooper
@ 2018-02-26 11:24 ` Andrew Cooper
  2018-02-26 11:24 ` [PATCH v2] x86/build: Use new .nop directive when available Andrew Cooper
  6 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2018-02-26 11:24 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

Now that the alternatives infrastructure can calculate the required padding
automatically, there is no need to hard code it.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/x86_64/compat/entry.S  |  2 +-
 xen/arch/x86/x86_64/entry.S         |  2 +-
 xen/include/asm-x86/nops.h          |  7 -------
 xen/include/asm-x86/spec_ctrl_asm.h | 19 ++++++++-----------
 4 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index 8aba269..f650610 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -134,7 +134,7 @@ ENTRY(compat_restore_all_guest)
         jne   1b
 2:
 .endm
-	ALTERNATIVE_2 ".skip 45, 0x90", \
+	ALTERNATIVE_2 "", \
             alt_cr4_pv32, X86_FEATURE_XEN_SMEP, \
             alt_cr4_pv32, X86_FEATURE_XEN_SMAP
 
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index e939f20..cc94333 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -564,7 +564,7 @@ handle_exception_saved:
         testb $X86_EFLAGS_IF>>8,UREGS_eflags+1(%rsp)
         jz    exception_with_ints_disabled
 
-        ALTERNATIVE_2 "jmp .Lcr4_pv32_done; .skip 2, 0x90", \
+        ALTERNATIVE_2 "jmp .Lcr4_pv32_done", \
             __stringify(mov VCPU_domain(%rbx), %rax), X86_FEATURE_XEN_SMEP, \
             __stringify(mov VCPU_domain(%rbx), %rax), X86_FEATURE_XEN_SMAP
 
diff --git a/xen/include/asm-x86/nops.h b/xen/include/asm-x86/nops.h
index 61319cc..1a46b97 100644
--- a/xen/include/asm-x86/nops.h
+++ b/xen/include/asm-x86/nops.h
@@ -65,13 +65,6 @@
 #define ASM_NOP8 _ASM_MK_NOP(P6_NOP8)
 #define ASM_NOP9 _ASM_MK_NOP(P6_NOP9)
 
-#define ASM_NOP17 ASM_NOP8; ASM_NOP7; ASM_NOP2
-#define ASM_NOP21 ASM_NOP8; ASM_NOP8; ASM_NOP5
-#define ASM_NOP24 ASM_NOP8; ASM_NOP8; ASM_NOP8
-#define ASM_NOP29 ASM_NOP8; ASM_NOP8; ASM_NOP8; ASM_NOP5
-#define ASM_NOP32 ASM_NOP8; ASM_NOP8; ASM_NOP8; ASM_NOP8
-#define ASM_NOP40 ASM_NOP8; ASM_NOP8; ASM_NOP8; ASM_NOP8; ASM_NOP8
-
 #define ASM_NOP_MAX 9
 
 #endif /* __X86_ASM_NOPS_H__ */
diff --git a/xen/include/asm-x86/spec_ctrl_asm.h b/xen/include/asm-x86/spec_ctrl_asm.h
index 1f2b6f3..1623fc0 100644
--- a/xen/include/asm-x86/spec_ctrl_asm.h
+++ b/xen/include/asm-x86/spec_ctrl_asm.h
@@ -216,9 +216,8 @@
 
 /* Use after a VMEXIT from an HVM guest. */
 #define SPEC_CTRL_ENTRY_FROM_VMEXIT                                     \
-    ALTERNATIVE __stringify(ASM_NOP40),                                 \
-        DO_OVERWRITE_RSB, X86_FEATURE_RSB_VMEXIT;                       \
-    ALTERNATIVE_2 __stringify(ASM_NOP32),                               \
+    ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_RSB_VMEXIT;           \
+    ALTERNATIVE_2 "",                                                   \
         __stringify(DO_SPEC_CTRL_ENTRY_FROM_VMEXIT                      \
                     ibrs_val=SPEC_CTRL_IBRS),                           \
         X86_FEATURE_XEN_IBRS_SET,                                       \
@@ -228,9 +227,8 @@
 
 /* Use after an entry from PV context (syscall/sysenter/int80/int82/etc). */
 #define SPEC_CTRL_ENTRY_FROM_PV                                         \
-    ALTERNATIVE __stringify(ASM_NOP40),                                 \
-        DO_OVERWRITE_RSB, X86_FEATURE_RSB_NATIVE;                       \
-    ALTERNATIVE_2 __stringify(ASM_NOP21),                               \
+    ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_RSB_NATIVE;           \
+    ALTERNATIVE_2 "",                                                   \
         __stringify(DO_SPEC_CTRL_ENTRY maybexen=0                       \
                     ibrs_val=SPEC_CTRL_IBRS),                           \
         X86_FEATURE_XEN_IBRS_SET,                                       \
@@ -239,9 +237,8 @@
 
 /* Use in interrupt/exception context.  May interrupt Xen or PV context. */
 #define SPEC_CTRL_ENTRY_FROM_INTR                                       \
-    ALTERNATIVE __stringify(ASM_NOP40),                                 \
-        DO_OVERWRITE_RSB, X86_FEATURE_RSB_NATIVE;                       \
-    ALTERNATIVE_2 __stringify(ASM_NOP29),                               \
+    ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_RSB_NATIVE;           \
+    ALTERNATIVE_2 "",                                                   \
         __stringify(DO_SPEC_CTRL_ENTRY maybexen=1                       \
                     ibrs_val=SPEC_CTRL_IBRS),                           \
         X86_FEATURE_XEN_IBRS_SET,                                       \
@@ -250,13 +247,13 @@
 
 /* Use when exiting to Xen context. */
 #define SPEC_CTRL_EXIT_TO_XEN                                           \
-    ALTERNATIVE_2 __stringify(ASM_NOP17),                               \
+    ALTERNATIVE_2 "",                                                   \
         DO_SPEC_CTRL_EXIT_TO_XEN, X86_FEATURE_XEN_IBRS_SET,             \
         DO_SPEC_CTRL_EXIT_TO_XEN, X86_FEATURE_XEN_IBRS_CLEAR
 
 /* Use when exiting to guest context. */
 #define SPEC_CTRL_EXIT_TO_GUEST                                         \
-    ALTERNATIVE_2 __stringify(ASM_NOP24),                               \
+    ALTERNATIVE_2 "",                                                   \
         DO_SPEC_CTRL_EXIT_TO_GUEST, X86_FEATURE_XEN_IBRS_SET,           \
         DO_SPEC_CTRL_EXIT_TO_GUEST, X86_FEATURE_XEN_IBRS_CLEAR
 
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2] x86/build: Use new .nop directive when available
  2018-02-26 11:24 [PATCH v2] x86/alternatives: Support for automatic padding calculations Andrew Cooper
                   ` (5 preceding siblings ...)
  2018-02-26 11:24 ` [PATCH v2] x86/alt: Drop explicit padding of origin sites Andrew Cooper
@ 2018-02-26 11:24 ` Andrew Cooper
  6 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2018-02-26 11:24 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Jan Beulich

Newer versions of binutils are capable of emitting an exact number bytes worth
of optimised nops.  Use this in preference to .skip when available.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>

RFC until support is actually committed to binutils mainline.
---
 xen/arch/x86/Rules.mk                 |  4 ++++
 xen/include/asm-x86/alternative-asm.h | 14 ++++++++++++--
 xen/include/asm-x86/alternative.h     | 13 ++++++++++---
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
index e169d67..bf5047f 100644
--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -28,6 +28,10 @@ $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1", \
 $(call as-option-add,CFLAGS,CC,\
     ".if ((1 > 0) < 0); .error \"\";.endif",,-DHAVE_AS_NEGATIVE_TRUE)
 
+# Check to see whether the assmbler supports the .nop directive.
+$(call as-option-add,CFLAGS,CC,\
+    ".L1: .L2: .nop (.L2 - .L1)$$(comma)9",-DHAVE_AS_NOP_DIRECTIVE)
+
 CFLAGS += -mno-red-zone -fpic -fno-asynchronous-unwind-tables
 
 # Xen doesn't use SSE interally.  If the compiler supports it, also skip the
diff --git a/xen/include/asm-x86/alternative-asm.h b/xen/include/asm-x86/alternative-asm.h
index 25f79fe..9e46bed 100644
--- a/xen/include/asm-x86/alternative-asm.h
+++ b/xen/include/asm-x86/alternative-asm.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_X86_ALTERNATIVE_ASM_H_
 #define _ASM_X86_ALTERNATIVE_ASM_H_
 
+#include <asm/nops.h>
+
 #ifdef __ASSEMBLY__
 
 /*
@@ -18,6 +20,14 @@
     .byte \pad_len
 .endm
 
+.macro mknops nr_bytes
+#ifdef HAVE_AS_NOP_DIRECTIVE
+    .nop \nr_bytes, ASM_NOP_MAX
+#else
+    .skip \nr_bytes, 0x90
+#endif
+.endm
+
 #define orig_len               (.L\@_orig_e       -     .L\@_orig_s)
 #define pad_len                (.L\@_orig_p       -     .L\@_orig_e)
 #define total_len              (.L\@_orig_p       -     .L\@_orig_s)
@@ -43,7 +53,7 @@
      */
     .L\@_diff = repl_len(1) - orig_len
 
-    .skip as_true(.L\@_diff > 0) * .L\@_diff, 0x90
+     mknops (as_true(.L\@_diff > 0) * .L\@_diff)
 .L\@_orig_p:
 
     .pushsection .altinstructions, "a", @progbits
@@ -76,7 +86,7 @@
      */
     .L\@_diff = as_max(repl_len(1), repl_len(2)) - orig_len
 
-     .skip as_true(.L\@_diff > 0) * .L\@_diff, 0x90
+     mknops (as_true(.L\@_diff > 0) * .L\@_diff)
 .L\@_orig_p:
 
     .pushsection .altinstructions, "a", @progbits
diff --git a/xen/include/asm-x86/alternative.h b/xen/include/asm-x86/alternative.h
index d53cea0..51538b6 100644
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -2,7 +2,6 @@
 #define __X86_ALTERNATIVE_H__
 
 #include <asm/alternative-asm.h>
-#include <asm/nops.h>
 
 #ifndef __ASSEMBLY__
 #include <xen/stringify.h>
@@ -27,6 +26,14 @@ extern void apply_alternatives(const struct alt_instr *start,
                                const struct alt_instr *end);
 extern void alternative_instructions(void);
 
+asm ( ".macro mknops nr_bytes\n\t"
+#ifdef HAVE_AS_NOP_DIRECTIVE
+      ".nop \\nr_bytes, " __stringify(ASM_NOP_MAX) "\n\t"
+#else
+      ".skip \\nr_bytes, 0x90\n\t"
+#endif
+      ".endm\n\t" );
+
 #define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
 #define alt_pad_len        "(.LXEN%=_orig_p - .LXEN%=_orig_e)"
 #define alt_total_len      "(.LXEN%=_orig_p - .LXEN%=_orig_s)"
@@ -46,7 +53,7 @@ extern void alternative_instructions(void);
 #define OLDINSTR_1(oldinstr, n1)                                 \
     ".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"      \
     ".LXEN%=_diff = "alt_repl_len(n1)"-"alt_orig_len"\n\t"       \
-    ".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
+    "mknops ("AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff)\n\t"    \
     ".LXEN%=_orig_p:\n\t"
 
 #define ALT_PADDING_LEN(n1, n2) \
@@ -55,7 +62,7 @@ extern void alternative_instructions(void);
 #define OLDINSTR_2(oldinstr, n1, n2)                             \
     ".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"      \
     ".LXEN%=_diff = "ALT_PADDING_LEN(n1, n2)"\n\t"               \
-    ".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
+    "mknops ("AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff)\n\t"    \
     ".LXEN%=_orig_p:\n\t"
 
 #define ALTINSTR_ENTRY(feature, num)                                    \
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] x86/alt: Support for automatic padding calculations
  2018-02-26 11:24 ` [PATCH v2] x86/alt: Support for automatic padding calculations Andrew Cooper
@ 2018-02-26 12:21   ` Roger Pau Monné
  0 siblings, 0 replies; 9+ messages in thread
From: Roger Pau Monné @ 2018-02-26 12:21 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Jan Beulich, Xen-devel

On Mon, Feb 26, 2018 at 11:24:55AM +0000, Andrew Cooper wrote:
> The correct amount of padding in an origin patch site can be calculated
> automatically, based on the relative lengths of the replacements.
> 
> This requires a bit of trickery to calculate correctly, especially in the
> ALTENRATIVE_2 case where a branchless max() calculation in needed.  The
> calculation is further complicated because GAS's idea of true is -1 rather
> than 1, which is why the extra negations are required.
> 
> Additionally, have apply_alternatives() attempt to optimise the padding nops.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> 
> v2: Fix build with Clang.
> ---
>  xen/arch/x86/Rules.mk                 |  4 +++
>  xen/arch/x86/alternative.c            | 32 ++++++++++++++++---
>  xen/include/asm-x86/alternative-asm.h | 60 +++++++++++++++++++++++++++++------
>  xen/include/asm-x86/alternative.h     | 46 +++++++++++++++++++++------
>  4 files changed, 120 insertions(+), 22 deletions(-)
> 
> diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
> index 9897dea..e169d67 100644
> --- a/xen/arch/x86/Rules.mk
> +++ b/xen/arch/x86/Rules.mk
> @@ -24,6 +24,10 @@ $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1", \
>                       -U__OBJECT_LABEL__ -DHAVE_GAS_QUOTED_SYM \
>                       '-D__OBJECT_LABEL__=$(subst $(BASEDIR)/,,$(CURDIR))/$$@')
>  
> +# GCC's idea of true is -1.  Clang's idea is 1

Nit: that should be GNU as rather than GCC itself?

>  #define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
> +#define alt_pad_len        "(.LXEN%=_orig_p - .LXEN%=_orig_e)"
> +#define alt_total_len      "(.LXEN%=_orig_p - .LXEN%=_orig_s)"
>  #define alt_repl_s(num)    ".LXEN%=_repl_s"#num
>  #define alt_repl_e(num)    ".LXEN%=_repl_e"#num
>  #define alt_repl_len(num)  "(" alt_repl_e(num) " - " alt_repl_s(num) ")"
>  
> +/* GCC's idea of true is -1, while Clang's idea is 1. */
> +#ifdef HAVE_AS_NEGATIVE_TRUE
> +# define AS_TRUE "-"
> +#else
> +# define AS_TRUE ""
> +#endif
> +
> +#define as_max(a, b) "(("a") ^ ((("a") ^ ("b")) & -("AS_TRUE"(("a") < ("b")))))"
> +
> +#define OLDINSTR_1(oldinstr, n1)                                 \
> +    ".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"      \
> +    ".LXEN%=_diff = "alt_repl_len(n1)"-"alt_orig_len"\n\t"       \
> +    ".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
> +    ".LXEN%=_orig_p:\n\t"
> +
> +#define ALT_PADDING_LEN(n1, n2) \
> +    as_max((alt_repl_len(n1), alt_repl_len(n2))"-"alt_orig_len
> +
> +#define OLDINSTR_2(oldinstr, n1, n2)                             \
> +    ".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"      \
> +    ".LXEN%=_diff = "ALT_PADDING_LEN(n1, n2)"\n\t"               \
> +    ".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
> +    ".LXEN%=_orig_p:\n\t"

OLDINSTR_1 is mostly the same as OLDINSTR_2, I wonder whether:

#define OLDINSTR(oldinstr, pad)                                  \
    ".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"      \
    ".LXEN%=_diff = " pad "\n\t"                                 \
    ".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
    ".LXEN%=_orig_p:\n\t"

and then:

#define OLDINSTR_1(oldinstr, n1) \
    OLDINSTR(oldinstr, alt_repl_len(n1)"-"alt_orig_len)
#define OLDINSTR_2(oldinstr, n1, n2) \
    OLDINSTR(oldinstr, ALT_PADDING_LEN(n1, n2))

Wouldn't work? That seems to avoid some code repetition.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-02-26 12:21 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-02-26 11:24 [PATCH v2] x86/alternatives: Support for automatic padding calculations Andrew Cooper
2018-02-26 11:24 ` [PATCH v2] x86/alt: Drop unused alternative infrastructure Andrew Cooper
2018-02-26 11:24 ` [PATCH v2] x86/alt: Clean up struct alt_instr and its users Andrew Cooper
2018-02-26 11:24 ` [PATCH v2] x86/alt: Clean up the assembly used to generate alternatives Andrew Cooper
2018-02-26 11:24 ` [PATCH v2] x86/asm: Remove opencoded uses of altinstruction_entry Andrew Cooper
2018-02-26 11:24 ` [PATCH v2] x86/alt: Support for automatic padding calculations Andrew Cooper
2018-02-26 12:21   ` Roger Pau Monné
2018-02-26 11:24 ` [PATCH v2] x86/alt: Drop explicit padding of origin sites Andrew Cooper
2018-02-26 11:24 ` [PATCH v2] x86/build: Use new .nop directive when available Andrew Cooper

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).