All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86: combine memmove FSRM and ERMS alternatives
@ 2023-01-13 20:34 Daniel Verkamp
  2023-01-14  6:55 ` Greg KH
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Daniel Verkamp @ 2023-01-13 20:34 UTC (permalink / raw)
  To: x86, Tony Luck, Borislav Petkov; +Cc: Jiri Slaby, stable, Daniel Verkamp

The x86-64 memmove code has two ALTERNATIVE statements in a row, one to
handle FSRM ("Fast Short REP MOVSB"), and one to handle ERMS ("Enhanced
REP MOVSB"). If either of these features is present, the goal is to jump
directly to a REP MOVSB; otherwise, some setup code that handles short
lengths is executed. The first comparison of a sequence of specific
small sizes is included in the first ALTERNATIVE, so it will be replaced
by NOPs if FSRM is set, and then (assuming ERMS is also set) execution
will fall through to the JMP to a REP MOVSB in the next ALTERNATIVE.

The two ALTERNATIVE invocations can be combined into a single instance
of ALTERNATIVE_2 to simplify and slightly shorten the code. If either
FSRM or ERMS is set, the first instruction in the memmove_begin_forward
path will be replaced with a jump to the REP MOVSB.

This also prevents a problem when FSRM is set but ERMS is not; in this
case, the previous code would have replaced both ALTERNATIVEs with NOPs
and skipped the first check for sizes less than 0x20 bytes. This
combination of CPU features is arguably a firmware bug, but this patch
makes the function robust against this badness.

Fixes: f444a5ff95dc ("x86/cpufeatures: Add support for fast short REP; MOVSB")
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
---
 arch/x86/lib/memmove_64.S | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S
index 724bbf83eb5b..1fc36dbd3bdc 100644
--- a/arch/x86/lib/memmove_64.S
+++ b/arch/x86/lib/memmove_64.S
@@ -38,8 +38,10 @@ SYM_FUNC_START(__memmove)
 
 	/* FSRM implies ERMS => no length checks, do the copy directly */
 .Lmemmove_begin_forward:
-	ALTERNATIVE "cmp $0x20, %rdx; jb 1f", "", X86_FEATURE_FSRM
-	ALTERNATIVE "", "jmp .Lmemmove_erms", X86_FEATURE_ERMS
+	ALTERNATIVE_2 \
+		"cmp $0x20, %rdx; jb 1f", \
+		"jmp .Lmemmove_erms", X86_FEATURE_FSRM, \
+		"jmp .Lmemmove_erms", X86_FEATURE_ERMS
 
 	/*
 	 * movsq instruction have many startup latency
-- 
2.39.0.314.g84b9a713c41-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread
* [PATCH] x86: combine memmove FSRM and ERMS alternatives
@ 2023-01-13 20:32 Daniel Verkamp
  2023-01-14  6:55 ` Greg KH
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Verkamp @ 2023-01-13 20:32 UTC (permalink / raw)
  To: Tony Luck, Borislav Petkov; +Cc: Jiri Slaby, stable, Daniel Verkamp

The x86-64 memmove code has two ALTERNATIVE statements in a row, one to
handle FSRM ("Fast Short REP MOVSB"), and one to handle ERMS ("Enhanced
REP MOVSB"). If either of these features is present, the goal is to jump
directly to a REP MOVSB; otherwise, some setup code that handles short
lengths is executed. The first comparison of a sequence of specific
small sizes is included in the first ALTERNATIVE, so it will be replaced
by NOPs if FSRM is set, and then (assuming ERMS is also set) execution
will fall through to the JMP to a REP MOVSB in the next ALTERNATIVE.

The two ALTERNATIVE invocations can be combined into a single instance
of ALTERNATIVE_2 to simplify and slightly shorten the code. If either
FSRM or ERMS is set, the first instruction in the memmove_begin_forward
path will be replaced with a jump to the REP MOVSB.

This also prevents a problem when FSRM is set but ERMS is not; in this
case, the previous code would have replaced both ALTERNATIVEs with NOPs
and skipped the first check for sizes less than 0x20 bytes. This
combination of CPU features is arguably a firmware bug, but this patch
makes the function robust against this badness.

Fixes: f444a5ff95dc ("x86/cpufeatures: Add support for fast short REP; MOVSB")
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
---
 arch/x86/lib/memmove_64.S | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S
index 724bbf83eb5b..1fc36dbd3bdc 100644
--- a/arch/x86/lib/memmove_64.S
+++ b/arch/x86/lib/memmove_64.S
@@ -38,8 +38,10 @@ SYM_FUNC_START(__memmove)
 
 	/* FSRM implies ERMS => no length checks, do the copy directly */
 .Lmemmove_begin_forward:
-	ALTERNATIVE "cmp $0x20, %rdx; jb 1f", "", X86_FEATURE_FSRM
-	ALTERNATIVE "", "jmp .Lmemmove_erms", X86_FEATURE_ERMS
+	ALTERNATIVE_2 \
+		"cmp $0x20, %rdx; jb 1f", \
+		"jmp .Lmemmove_erms", X86_FEATURE_FSRM, \
+		"jmp .Lmemmove_erms", X86_FEATURE_ERMS
 
 	/*
 	 * movsq instruction have many startup latency
-- 
2.39.0.314.g84b9a713c41-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-01-15 23:49 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-01-13 20:34 [PATCH] x86: combine memmove FSRM and ERMS alternatives Daniel Verkamp
2023-01-14  6:55 ` Greg KH
2023-01-14  9:24 ` Ingo Molnar
2023-01-14 10:42 ` Borislav Petkov
2023-01-14 16:17   ` Borislav Petkov
2023-01-14 20:49     ` Borislav Petkov
2023-01-15 23:49       ` Borislav Petkov
  -- strict thread matches above, loose matches on Subject: below --
2023-01-13 20:32 Daniel Verkamp
2023-01-14  6:55 ` Greg KH

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.