public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] x86/umip: Fix UMIP insn decoder false positives
@ 2025-08-08 17:23 Sean Christopherson
  2025-08-08 17:23 ` [PATCH 1/3] x86/umip: Check that the instruction opcode is at least two bytes Sean Christopherson
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Sean Christopherson @ 2025-08-08 17:23 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
  Cc: linux-kernel, Dan Snyder, Sean Christopherson

Fix two false positives scenarios where the UMIP #GP logic will incorrectly
trigger emulation, e.g. due to a partially decoded instruction, or on
instructions like VMLAUNCH that usurp the register form of '0f 01'.

Tested with the hack-a-test patch at the end, but I haven't done any testing
using a real userspace (neither positive nor negative testing).

Sean Christopherson (3):
  x86/umip: Check that the instruction opcode is at least two bytes
  x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT
    aliases)
  *** DO NOT MERGE *** x86/umip: Lazy person's KUnit test for UMIP
    emulation

 arch/x86/kernel/umip.c | 71 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 69 insertions(+), 2 deletions(-)


base-commit: ce0b5eedcb753697d43f61dd2e27d68eb5d3150f
-- 
2.50.1.703.g449372360f-goog


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/3] x86/umip: Check that the instruction opcode is at least two bytes
  2025-08-08 17:23 [PATCH 0/3] x86/umip: Fix UMIP insn decoder false positives Sean Christopherson
@ 2025-08-08 17:23 ` Sean Christopherson
  2025-08-14 14:42   ` Peter Zijlstra
                     ` (2 more replies)
  2025-08-08 17:23 ` [PATCH 2/3] x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT aliases) Sean Christopherson
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 13+ messages in thread
From: Sean Christopherson @ 2025-08-08 17:23 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
  Cc: linux-kernel, Dan Snyder, Sean Christopherson

When checking for a potential UMIP violation on #GP, verify the decoder
found at least two opcode bytes to avoid false positives when the kernel
encounters an unknown instruction that starts with 0f.  Because the array
of opcode.bytes is zero-initialized by insn_init(), peeking at bytes[1]
will misinterpret garbage as a potential SLDT or STR instruction, and can
incorrectly trigger emulation.

E.g. if a vpalignr instruction

   62 83 c5 05 0f 08 ff     vpalignr xmm17{k5},xmm23,XMMWORD PTR [r8],0xff

hits a #GP, the kernel emulates it as STR and squashes the #GP (and
corrupts the userspace code stream).

Arguably the check should look for exactly two bytes, but no three byte
opcodes use '0f 00 xx' or '0f 01 xx' as an escape, i.e. it should be
impossible to get a false positive if the first two opcode bytes match
'0f 00' or '0f 01'.  Go with a more conservative check with respect to the
existing code to minimize the chances of breaking userspace, e.g. due to
decoder weirdness.

Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions")
Reported-by: Dan Snyder <dansnyder@google.com>
Analyzed-by; Nick Bray <ncbray@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kernel/umip.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 5a4b21389b1d..406ac01ce16d 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -156,8 +156,8 @@ static int identify_insn(struct insn *insn)
 	if (!insn->modrm.nbytes)
 		return -EINVAL;
 
-	/* All the instructions of interest start with 0x0f. */
-	if (insn->opcode.bytes[0] != 0xf)
+	/* The instructions of interest have 2-byte opcodes: 0F 00 or 0F 01. */
+	if (insn->opcode.nbytes < 2 || insn->opcode.bytes[0] != 0xf)
 		return -EINVAL;
 
 	if (insn->opcode.bytes[1] == 0x1) {
-- 
2.50.1.703.g449372360f-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3] x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT aliases)
  2025-08-08 17:23 [PATCH 0/3] x86/umip: Fix UMIP insn decoder false positives Sean Christopherson
  2025-08-08 17:23 ` [PATCH 1/3] x86/umip: Check that the instruction opcode is at least two bytes Sean Christopherson
@ 2025-08-08 17:23 ` Sean Christopherson
  2025-08-14 14:42   ` Peter Zijlstra
  2025-09-19 19:55   ` [tip: x86/cpu] " tip-bot2 for Sean Christopherson
  2025-08-08 17:23 ` [PATCH 3/3] *** DO NOT MERGE *** x86/umip: Lazy person's KUnit test for UMIP emulation Sean Christopherson
  2025-09-19 14:31 ` [PATCH 0/3] x86/umip: Fix UMIP insn decoder false positives Sean Christopherson
  3 siblings, 2 replies; 13+ messages in thread
From: Sean Christopherson @ 2025-08-08 17:23 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
  Cc: linux-kernel, Dan Snyder, Sean Christopherson

Filter out the register forms of 0F 01 when determining whether or not to
emulate in response to a potential UMIP violation #GP, as SGDT and SIDT
only accept memory operands.  The register variants of 0F 01 are used to
encode instructions for things like VMX and SGX, i.e. not checking the Mod
field would cause the kernel incorrectly emulate on #GP, e.g. due to a CPL
violation on VMLAUNCH.

Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kernel/umip.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 406ac01ce16d..d432f3824f0c 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -163,8 +163,19 @@ static int identify_insn(struct insn *insn)
 	if (insn->opcode.bytes[1] == 0x1) {
 		switch (X86_MODRM_REG(insn->modrm.value)) {
 		case 0:
+			/* The reg form of 0F 01 /0 encodes VMX instructions. */
+			if (X86_MODRM_MOD(insn->modrm.value) == 3)
+				return -EINVAL;
+
 			return UMIP_INST_SGDT;
 		case 1:
+			/*
+			 * The reg form of 0F 01 /1 encodes MONITOR/MWAIT,
+			 * STAC/CLAC, and ENCLS.
+			 */
+			if (X86_MODRM_MOD(insn->modrm.value) == 3)
+				return -EINVAL;
+
 			return UMIP_INST_SIDT;
 		case 4:
 			return UMIP_INST_SMSW;
-- 
2.50.1.703.g449372360f-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/3] *** DO NOT MERGE *** x86/umip: Lazy person's KUnit test for UMIP emulation
  2025-08-08 17:23 [PATCH 0/3] x86/umip: Fix UMIP insn decoder false positives Sean Christopherson
  2025-08-08 17:23 ` [PATCH 1/3] x86/umip: Check that the instruction opcode is at least two bytes Sean Christopherson
  2025-08-08 17:23 ` [PATCH 2/3] x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT aliases) Sean Christopherson
@ 2025-08-08 17:23 ` Sean Christopherson
  2025-09-19 14:31 ` [PATCH 0/3] x86/umip: Fix UMIP insn decoder false positives Sean Christopherson
  3 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2025-08-08 17:23 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
  Cc: linux-kernel, Dan Snyder, Sean Christopherson

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kernel/umip.c | 56 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index d432f3824f0c..26621d5ea308 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -194,6 +194,62 @@ static int identify_insn(struct insn *insn)
 	}
 }
 
+static __init int umip_identify_insn(const unsigned char *buf)
+{
+	struct insn insn = {
+		.addr_bytes = 8,
+		.opnd_bytes = 4,
+	};
+	int r;
+
+	insn_init(&insn, buf, MAX_INSN_SIZE, 1);
+
+	r = insn_get_length(&insn);
+	if (r) {
+		pr_warn("insn_get_length returned '%d'\n", r);
+		return r;
+	}
+
+	return identify_insn(&insn);
+}
+
+static __init int umip_insn_test(void)
+{
+	unsigned char vpalignr[MAX_INSN_SIZE] = { 0x62, 0x83, 0xc5, 0x05, 0x0f, 0x08, 0xff };
+	unsigned char insn_0f00[MAX_INSN_SIZE] = { 0x0f, 0x00 };
+	unsigned char insn_0f01[MAX_INSN_SIZE] = { 0x0f, 0x01 };
+	int r, i;
+
+	r = umip_identify_insn(vpalignr);
+	WARN_ON(r != -EINVAL);
+
+	for (i = 0; i <= 0xff; i++) {
+		insn_0f00[2] = i;
+		r = umip_identify_insn(insn_0f00);
+		if (X86_MODRM_REG(i) > 1)
+			WARN_ON(r != -EINVAL);
+		else if (X86_MODRM_REG(i) == 0)
+			WARN_ON(r != UMIP_INST_SLDT);
+		else
+			WARN_ON(r != UMIP_INST_STR);
+
+		insn_0f01[2] = i;
+		r = umip_identify_insn(insn_0f01);
+		if (X86_MODRM_REG(i) == 2 || X86_MODRM_REG(i) == 3 || X86_MODRM_REG(i) > 4)
+			WARN_ON(r != -EINVAL);
+		else if (X86_MODRM_REG(i) < 2 && i >= 0xc0)
+			WARN_ON(r != -EINVAL);
+		else if (X86_MODRM_REG(i) == 0)
+			WARN_ON(r != UMIP_INST_SGDT);
+		else if (X86_MODRM_REG(i) == 1)
+			WARN_ON(r != UMIP_INST_SIDT);
+		else
+			WARN_ON(r != UMIP_INST_SMSW);
+	}
+	return 0;
+}
+subsys_initcall(umip_insn_test);
+
 /**
  * emulate_umip_insn() - Emulate UMIP instructions and return dummy values
  * @insn:	Instruction structure with operands
-- 
2.50.1.703.g449372360f-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] x86/umip: Check that the instruction opcode is at least two bytes
  2025-08-08 17:23 ` [PATCH 1/3] x86/umip: Check that the instruction opcode is at least two bytes Sean Christopherson
@ 2025-08-14 14:42   ` Peter Zijlstra
  2025-09-19 18:16   ` Borislav Petkov
  2025-09-19 19:55   ` [tip: x86/cpu] " tip-bot2 for Sean Christopherson
  2 siblings, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2025-08-14 14:42 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	linux-kernel, Dan Snyder

On Fri, Aug 08, 2025 at 10:23:56AM -0700, Sean Christopherson wrote:
> When checking for a potential UMIP violation on #GP, verify the decoder
> found at least two opcode bytes to avoid false positives when the kernel
> encounters an unknown instruction that starts with 0f.  Because the array
> of opcode.bytes is zero-initialized by insn_init(), peeking at bytes[1]
> will misinterpret garbage as a potential SLDT or STR instruction, and can
> incorrectly trigger emulation.
> 
> E.g. if a vpalignr instruction
> 
>    62 83 c5 05 0f 08 ff     vpalignr xmm17{k5},xmm23,XMMWORD PTR [r8],0xff
> 
> hits a #GP, the kernel emulates it as STR and squashes the #GP (and
> corrupts the userspace code stream).
> 
> Arguably the check should look for exactly two bytes, but no three byte
> opcodes use '0f 00 xx' or '0f 01 xx' as an escape, i.e. it should be
> impossible to get a false positive if the first two opcode bytes match
> '0f 00' or '0f 01'.  Go with a more conservative check with respect to the
> existing code to minimize the chances of breaking userspace, e.g. due to
> decoder weirdness.
> 
> Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions")
> Reported-by: Dan Snyder <dansnyder@google.com>
> Analyzed-by; Nick Bray <ncbray@google.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

> ---
>  arch/x86/kernel/umip.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
> index 5a4b21389b1d..406ac01ce16d 100644
> --- a/arch/x86/kernel/umip.c
> +++ b/arch/x86/kernel/umip.c
> @@ -156,8 +156,8 @@ static int identify_insn(struct insn *insn)
>  	if (!insn->modrm.nbytes)
>  		return -EINVAL;
>  
> -	/* All the instructions of interest start with 0x0f. */
> -	if (insn->opcode.bytes[0] != 0xf)
> +	/* The instructions of interest have 2-byte opcodes: 0F 00 or 0F 01. */
> +	if (insn->opcode.nbytes < 2 || insn->opcode.bytes[0] != 0xf)
>  		return -EINVAL;
>  
>  	if (insn->opcode.bytes[1] == 0x1) {
> -- 
> 2.50.1.703.g449372360f-goog
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT aliases)
  2025-08-08 17:23 ` [PATCH 2/3] x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT aliases) Sean Christopherson
@ 2025-08-14 14:42   ` Peter Zijlstra
  2025-09-19 19:55   ` [tip: x86/cpu] " tip-bot2 for Sean Christopherson
  1 sibling, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2025-08-14 14:42 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	linux-kernel, Dan Snyder

On Fri, Aug 08, 2025 at 10:23:57AM -0700, Sean Christopherson wrote:
> Filter out the register forms of 0F 01 when determining whether or not to
> emulate in response to a potential UMIP violation #GP, as SGDT and SIDT
> only accept memory operands.  The register variants of 0F 01 are used to
> encode instructions for things like VMX and SGX, i.e. not checking the Mod
> field would cause the kernel incorrectly emulate on #GP, e.g. due to a CPL
> violation on VMLAUNCH.
> 
> Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

> ---
>  arch/x86/kernel/umip.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
> index 406ac01ce16d..d432f3824f0c 100644
> --- a/arch/x86/kernel/umip.c
> +++ b/arch/x86/kernel/umip.c
> @@ -163,8 +163,19 @@ static int identify_insn(struct insn *insn)
>  	if (insn->opcode.bytes[1] == 0x1) {
>  		switch (X86_MODRM_REG(insn->modrm.value)) {
>  		case 0:
> +			/* The reg form of 0F 01 /0 encodes VMX instructions. */
> +			if (X86_MODRM_MOD(insn->modrm.value) == 3)
> +				return -EINVAL;
> +
>  			return UMIP_INST_SGDT;
>  		case 1:
> +			/*
> +			 * The reg form of 0F 01 /1 encodes MONITOR/MWAIT,
> +			 * STAC/CLAC, and ENCLS.
> +			 */
> +			if (X86_MODRM_MOD(insn->modrm.value) == 3)
> +				return -EINVAL;
> +
>  			return UMIP_INST_SIDT;
>  		case 4:
>  			return UMIP_INST_SMSW;
> -- 
> 2.50.1.703.g449372360f-goog
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/3] x86/umip: Fix UMIP insn decoder false positives
  2025-08-08 17:23 [PATCH 0/3] x86/umip: Fix UMIP insn decoder false positives Sean Christopherson
                   ` (2 preceding siblings ...)
  2025-08-08 17:23 ` [PATCH 3/3] *** DO NOT MERGE *** x86/umip: Lazy person's KUnit test for UMIP emulation Sean Christopherson
@ 2025-09-19 14:31 ` Sean Christopherson
  2025-09-19 14:46   ` Borislav Petkov
  3 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2025-09-19 14:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	linux-kernel, Dan Snyder

On Fri, Aug 08, 2025, Sean Christopherson wrote:
> Fix two false positives scenarios where the UMIP #GP logic will incorrectly
> trigger emulation, e.g. due to a partially decoded instruction, or on
> instructions like VMLAUNCH that usurp the register form of '0f 01'.
> 
> Tested with the hack-a-test patch at the end, but I haven't done any testing
> using a real userspace (neither positive nor negative testing).
> 
> Sean Christopherson (3):
>   x86/umip: Check that the instruction opcode is at least two bytes
>   x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT
>     aliases)

Ping on these two, looks like they slipped through the cracks.  FWIW, I wouldn't
consider these urgent enough to squeeze into 6.17, but it'd be nice to get them
into 6.18.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/3] x86/umip: Fix UMIP insn decoder false positives
  2025-09-19 14:31 ` [PATCH 0/3] x86/umip: Fix UMIP insn decoder false positives Sean Christopherson
@ 2025-09-19 14:46   ` Borislav Petkov
  0 siblings, 0 replies; 13+ messages in thread
From: Borislav Petkov @ 2025-09-19 14:46 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, linux-kernel,
	Dan Snyder

On Fri, Sep 19, 2025 at 07:31:16AM -0700, Sean Christopherson wrote:
> On Fri, Aug 08, 2025, Sean Christopherson wrote:
> > Fix two false positives scenarios where the UMIP #GP logic will incorrectly
> > trigger emulation, e.g. due to a partially decoded instruction, or on
> > instructions like VMLAUNCH that usurp the register form of '0f 01'.
> > 
> > Tested with the hack-a-test patch at the end, but I haven't done any testing
> > using a real userspace (neither positive nor negative testing).
> > 
> > Sean Christopherson (3):
> >   x86/umip: Check that the instruction opcode is at least two bytes
> >   x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT
> >     aliases)
> 
> Ping on these two, looks like they slipped through the cracks.  FWIW, I wouldn't
> consider these urgent enough to squeeze into 6.17, but it'd be nice to get them
> into 6.18.

Lemme take a look...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] x86/umip: Check that the instruction opcode is at least two bytes
  2025-08-08 17:23 ` [PATCH 1/3] x86/umip: Check that the instruction opcode is at least two bytes Sean Christopherson
  2025-08-14 14:42   ` Peter Zijlstra
@ 2025-09-19 18:16   ` Borislav Petkov
  2025-09-19 21:24     ` Sean Christopherson
  2025-09-19 19:55   ` [tip: x86/cpu] " tip-bot2 for Sean Christopherson
  2 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2025-09-19 18:16 UTC (permalink / raw)
  To: Sean Christopherson, Masami Hiramatsu
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, linux-kernel,
	Dan Snyder

On Fri, Aug 08, 2025 at 10:23:56AM -0700, Sean Christopherson wrote:
> When checking for a potential UMIP violation on #GP, verify the decoder
> found at least two opcode bytes to avoid false positives when the kernel
> encounters an unknown instruction that starts with 0f.  Because the array
> of opcode.bytes is zero-initialized by insn_init(), peeking at bytes[1]
> will misinterpret garbage as a potential SLDT or STR instruction, and can
> incorrectly trigger emulation.
> 
> E.g. if a vpalignr instruction
> 
>    62 83 c5 05 0f 08 ff     vpalignr xmm17{k5},xmm23,XMMWORD PTR [r8],0xff
> 
> hits a #GP, the kernel emulates it as STR and squashes the #GP (and
> corrupts the userspace code stream).
> 
> Arguably the check should look for exactly two bytes, but no three byte
> opcodes use '0f 00 xx' or '0f 01 xx' as an escape, i.e. it should be
> impossible to get a false positive if the first two opcode bytes match
> '0f 00' or '0f 01'.  Go with a more conservative check with respect to the
> existing code to minimize the chances of breaking userspace, e.g. due to
> decoder weirdness.

So I did some staring... I guess this fix is trying to address our insn
decoder shortcoming and calls it "weirdness", right?

$ objdump -d a.out | awk -f ./arch/x86/tools/objdump_reformat.awk | ./arch/x86/tools/insn_decoder_test 
./arch/x86/tools/insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this.
./arch/x86/tools/insn_decoder_test: warning:    0:      62 83 c5 05 0f 08 ff    vpalignr $0xff,(%r8),%xmm23,%xmm17{%k5}
./arch/x86/tools/insn_decoder_test: warning: objdump says 7 bytes, but insn_get_length() says 6
./arch/x86/tools/insn_decoder_test: warning: Decoded and checked 1 instructions with 1 failures

Looks like it.

a.out has:

0000000000000000 <.text>:
   0:   62 83 c5 05 0f 08 ff    vpalignr $0xff,(%r8),%xmm23,%xmm17{%k5}

I guess just adding the insn to the table doesn't fix it.

Masami?

---
diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
index 262f7ca1fb95..a23ff3c16908 100644
--- a/arch/x86/lib/x86-opcode-map.txt
+++ b/arch/x86/lib/x86-opcode-map.txt
@@ -23,7 +23,7 @@
 #
 # AVX Superscripts
 #  (ev): this opcode requires EVEX prefix.
-#  (es): this opcode requires EVEX prefix and is SCALABALE.
+#  (es): this opcode requires EVEX prefix and is SCALABLE.
 #  (evo): this opcode is changed by EVEX prefix (EVEX opcode)
 #  (v): this opcode requires VEX prefix.
 #  (v1): this opcode only supports 128bit VEX.
@@ -867,7 +867,7 @@ AVXcode: 3
 0c: vblendps Vx,Hx,Wx,Ib (66)
 0d: vblendpd Vx,Hx,Wx,Ib (66)
 0e: vpblendw Vx,Hx,Wx,Ib (66),(v1)
-0f: palignr Pq,Qq,Ib | vpalignr Vx,Hx,Wx,Ib (66),(v1)
+0f: palignr Pq,Qq,Ib | vpalignr Vx,Hx,Wx,Ib (66),(v1) | vpalignr Vx,kz,Hx,Wx,Ib (ev)
 14: vpextrb Rd/Mb,Vdq,Ib (66),(v1)
 15: vpextrw Rd/Mw,Vdq,Ib (66),(v1)
 16: vpextrd/q Ey,Vdq,Ib (66),(v1)


> Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions")
> Reported-by: Dan Snyder <dansnyder@google.com>
> Analyzed-by; Nick Bray <ncbray@google.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kernel/umip.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
> index 5a4b21389b1d..406ac01ce16d 100644
> --- a/arch/x86/kernel/umip.c
> +++ b/arch/x86/kernel/umip.c
> @@ -156,8 +156,8 @@ static int identify_insn(struct insn *insn)
>  	if (!insn->modrm.nbytes)
>  		return -EINVAL;
>  
> -	/* All the instructions of interest start with 0x0f. */
> -	if (insn->opcode.bytes[0] != 0xf)
> +	/* The instructions of interest have 2-byte opcodes: 0F 00 or 0F 01. */
> +	if (insn->opcode.nbytes < 2 || insn->opcode.bytes[0] != 0xf)
>  		return -EINVAL;
>  
>  	if (insn->opcode.bytes[1] == 0x1) {
> -- 
> 2.50.1.703.g449372360f-goog
> 

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [tip: x86/cpu] x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT aliases)
  2025-08-08 17:23 ` [PATCH 2/3] x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT aliases) Sean Christopherson
  2025-08-14 14:42   ` Peter Zijlstra
@ 2025-09-19 19:55   ` tip-bot2 for Sean Christopherson
  1 sibling, 0 replies; 13+ messages in thread
From: tip-bot2 for Sean Christopherson @ 2025-09-19 19:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Sean Christopherson, Borislav Petkov (AMD),
	Peter Zijlstra (Intel), stable, x86, linux-kernel

The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     27b1fd62012dfe9d3eb8ecde344d7aa673695ecf
Gitweb:        https://git.kernel.org/tip/27b1fd62012dfe9d3eb8ecde344d7aa673695ecf
Author:        Sean Christopherson <seanjc@google.com>
AuthorDate:    Fri, 08 Aug 2025 10:23:57 -07:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Fri, 19 Sep 2025 21:34:48 +02:00

x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT aliases)

Filter out the register forms of 0F 01 when determining whether or not to
emulate in response to a potential UMIP violation #GP, as SGDT and SIDT only
accept memory operands.  The register variants of 0F 01 are used to encode
instructions for things like VMX and SGX, i.e. not checking the Mod field
would cause the kernel to incorrectly emulate on #GP, e.g. due to a CPL
violation on VMLAUNCH.

Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
---
 arch/x86/kernel/umip.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 406ac01..d432f38 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -163,8 +163,19 @@ static int identify_insn(struct insn *insn)
 	if (insn->opcode.bytes[1] == 0x1) {
 		switch (X86_MODRM_REG(insn->modrm.value)) {
 		case 0:
+			/* The reg form of 0F 01 /0 encodes VMX instructions. */
+			if (X86_MODRM_MOD(insn->modrm.value) == 3)
+				return -EINVAL;
+
 			return UMIP_INST_SGDT;
 		case 1:
+			/*
+			 * The reg form of 0F 01 /1 encodes MONITOR/MWAIT,
+			 * STAC/CLAC, and ENCLS.
+			 */
+			if (X86_MODRM_MOD(insn->modrm.value) == 3)
+				return -EINVAL;
+
 			return UMIP_INST_SIDT;
 		case 4:
 			return UMIP_INST_SMSW;

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [tip: x86/cpu] x86/umip: Check that the instruction opcode is at least two bytes
  2025-08-08 17:23 ` [PATCH 1/3] x86/umip: Check that the instruction opcode is at least two bytes Sean Christopherson
  2025-08-14 14:42   ` Peter Zijlstra
  2025-09-19 18:16   ` Borislav Petkov
@ 2025-09-19 19:55   ` tip-bot2 for Sean Christopherson
  2 siblings, 0 replies; 13+ messages in thread
From: tip-bot2 for Sean Christopherson @ 2025-09-19 19:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dan Snyder, Sean Christopherson, Borislav Petkov (AMD),
	Peter Zijlstra (Intel), stable, x86, linux-kernel

The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     32278c677947ae2f042c9535674a7fff9a245dd3
Gitweb:        https://git.kernel.org/tip/32278c677947ae2f042c9535674a7fff9a245dd3
Author:        Sean Christopherson <seanjc@google.com>
AuthorDate:    Fri, 08 Aug 2025 10:23:56 -07:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Fri, 19 Sep 2025 20:21:12 +02:00

x86/umip: Check that the instruction opcode is at least two bytes

When checking for a potential UMIP violation on #GP, verify the decoder found
at least two opcode bytes to avoid false positives when the kernel encounters
an unknown instruction that starts with 0f.  Because the array of opcode.bytes
is zero-initialized by insn_init(), peeking at bytes[1] will misinterpret
garbage as a potential SLDT or STR instruction, and can incorrectly trigger
emulation.

E.g. if a VPALIGNR instruction

   62 83 c5 05 0f 08 ff     vpalignr xmm17{k5},xmm23,XMMWORD PTR [r8],0xff

hits a #GP, the kernel emulates it as STR and squashes the #GP (and corrupts
the userspace code stream).

Arguably the check should look for exactly two bytes, but no three byte
opcodes use '0f 00 xx' or '0f 01 xx' as an escape, i.e. it should be
impossible to get a false positive if the first two opcode bytes match '0f 00'
or '0f 01'.  Go with a more conservative check with respect to the existing
code to minimize the chances of breaking userspace, e.g. due to decoder
weirdness.

Analyzed by Nick Bray <ncbray@google.com>.

Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions")
Reported-by: Dan Snyder <dansnyder@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
---
 arch/x86/kernel/umip.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 5a4b213..406ac01 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -156,8 +156,8 @@ static int identify_insn(struct insn *insn)
 	if (!insn->modrm.nbytes)
 		return -EINVAL;
 
-	/* All the instructions of interest start with 0x0f. */
-	if (insn->opcode.bytes[0] != 0xf)
+	/* The instructions of interest have 2-byte opcodes: 0F 00 or 0F 01. */
+	if (insn->opcode.nbytes < 2 || insn->opcode.bytes[0] != 0xf)
 		return -EINVAL;
 
 	if (insn->opcode.bytes[1] == 0x1) {

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] x86/umip: Check that the instruction opcode is at least two bytes
  2025-09-19 18:16   ` Borislav Petkov
@ 2025-09-19 21:24     ` Sean Christopherson
  2025-09-20 10:07       ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2025-09-19 21:24 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Masami Hiramatsu, Thomas Gleixner, Ingo Molnar, Dave Hansen, x86,
	linux-kernel, Dan Snyder

On Fri, Sep 19, 2025, Borislav Petkov wrote:
> On Fri, Aug 08, 2025 at 10:23:56AM -0700, Sean Christopherson wrote:
> > When checking for a potential UMIP violation on #GP, verify the decoder
> > found at least two opcode bytes to avoid false positives when the kernel
> > encounters an unknown instruction that starts with 0f.  Because the array
> > of opcode.bytes is zero-initialized by insn_init(), peeking at bytes[1]
> > will misinterpret garbage as a potential SLDT or STR instruction, and can
> > incorrectly trigger emulation.
> > 
> > E.g. if a vpalignr instruction
> > 
> >    62 83 c5 05 0f 08 ff     vpalignr xmm17{k5},xmm23,XMMWORD PTR [r8],0xff
> > 
> > hits a #GP, the kernel emulates it as STR and squashes the #GP (and
> > corrupts the userspace code stream).
> > 
> > Arguably the check should look for exactly two bytes, but no three byte
> > opcodes use '0f 00 xx' or '0f 01 xx' as an escape, i.e. it should be
> > impossible to get a false positive if the first two opcode bytes match
> > '0f 00' or '0f 01'.  Go with a more conservative check with respect to the
> > existing code to minimize the chances of breaking userspace, e.g. due to
> > decoder weirdness.
> 
> So I did some staring... I guess this fix is trying to address our insn
> decoder shortcoming and calls it "weirdness", right?
> 
> $ objdump -d a.out | awk -f ./arch/x86/tools/objdump_reformat.awk | ./arch/x86/tools/insn_decoder_test 
> ./arch/x86/tools/insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this.
> ./arch/x86/tools/insn_decoder_test: warning:    0:      62 83 c5 05 0f 08 ff    vpalignr $0xff,(%r8),%xmm23,%xmm17{%k5}
> ./arch/x86/tools/insn_decoder_test: warning: objdump says 7 bytes, but insn_get_length() says 6
> ./arch/x86/tools/insn_decoder_test: warning: Decoded and checked 1 instructions with 1 failures
> 
> Looks like it.
> 
> a.out has:
> 
> 0000000000000000 <.text>:
>    0:   62 83 c5 05 0f 08 ff    vpalignr $0xff,(%r8),%xmm23,%xmm17{%k5}
> 
> I guess just adding the insn to the table doesn't fix it.

vpalignr is just one example of an unhandled instruction, that's not what I find
weird.

The "weirdness" I am referring to is purely speculative; what I was trying to say
is that I deliberate went with a "bad" check on nbytes, i.e. it really should be
"insn->opcode.nbytes == 2".  But I didn't want to risk breaking some bizarre
userspace that happened to be relying on a quirk of the kernel's decoder (I
haven't dug into the decoder, so I genuinely have/had no idea what all could
happen).

> Masami?
> 
> ---
> diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
> index 262f7ca1fb95..a23ff3c16908 100644
> --- a/arch/x86/lib/x86-opcode-map.txt
> +++ b/arch/x86/lib/x86-opcode-map.txt
> @@ -23,7 +23,7 @@
>  #
>  # AVX Superscripts
>  #  (ev): this opcode requires EVEX prefix.
> -#  (es): this opcode requires EVEX prefix and is SCALABALE.
> +#  (es): this opcode requires EVEX prefix and is SCALABLE.
>  #  (evo): this opcode is changed by EVEX prefix (EVEX opcode)
>  #  (v): this opcode requires VEX prefix.
>  #  (v1): this opcode only supports 128bit VEX.
> @@ -867,7 +867,7 @@ AVXcode: 3
>  0c: vblendps Vx,Hx,Wx,Ib (66)
>  0d: vblendpd Vx,Hx,Wx,Ib (66)
>  0e: vpblendw Vx,Hx,Wx,Ib (66),(v1)
> -0f: palignr Pq,Qq,Ib | vpalignr Vx,Hx,Wx,Ib (66),(v1)
> +0f: palignr Pq,Qq,Ib | vpalignr Vx,Hx,Wx,Ib (66),(v1) | vpalignr Vx,kz,Hx,Wx,Ib (ev)
>  14: vpextrb Rd/Mb,Vdq,Ib (66),(v1)
>  15: vpextrw Rd/Mw,Vdq,Ib (66),(v1)
>  16: vpextrd/q Ey,Vdq,Ib (66),(v1)
> 
> 
> > Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions")
> > Reported-by: Dan Snyder <dansnyder@google.com>
> > Analyzed-by; Nick Bray <ncbray@google.com>
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kernel/umip.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
> > index 5a4b21389b1d..406ac01ce16d 100644
> > --- a/arch/x86/kernel/umip.c
> > +++ b/arch/x86/kernel/umip.c
> > @@ -156,8 +156,8 @@ static int identify_insn(struct insn *insn)
> >  	if (!insn->modrm.nbytes)
> >  		return -EINVAL;
> >  
> > -	/* All the instructions of interest start with 0x0f. */
> > -	if (insn->opcode.bytes[0] != 0xf)
> > +	/* The instructions of interest have 2-byte opcodes: 0F 00 or 0F 01. */
> > +	if (insn->opcode.nbytes < 2 || insn->opcode.bytes[0] != 0xf)
> >  		return -EINVAL;
> >  
> >  	if (insn->opcode.bytes[1] == 0x1) {
> > -- 
> > 2.50.1.703.g449372360f-goog
> > 
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] x86/umip: Check that the instruction opcode is at least two bytes
  2025-09-19 21:24     ` Sean Christopherson
@ 2025-09-20 10:07       ` Borislav Petkov
  0 siblings, 0 replies; 13+ messages in thread
From: Borislav Petkov @ 2025-09-20 10:07 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Masami Hiramatsu, Thomas Gleixner, Ingo Molnar, Dave Hansen, x86,
	linux-kernel, Dan Snyder

On Fri, Sep 19, 2025 at 02:24:58PM -0700, Sean Christopherson wrote:
> The "weirdness" I am referring to is purely speculative; what I was trying to say
> is that I deliberate went with a "bad" check on nbytes, i.e. it really should be
> "insn->opcode.nbytes == 2".  But I didn't want to risk breaking some bizarre
> userspace that happened to be relying on a quirk of the kernel's decoder (I
> haven't dug into the decoder, so I genuinely have/had no idea what all could
> happen).

Yeah, after yesterday, my todo list has one more item - to dig into the
decoder and see what's going on there.

For example, in this particular case, the decoder should report an error when
it cannot decode the insn instead of emulating a totally different insn...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-09-20 10:07 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-08 17:23 [PATCH 0/3] x86/umip: Fix UMIP insn decoder false positives Sean Christopherson
2025-08-08 17:23 ` [PATCH 1/3] x86/umip: Check that the instruction opcode is at least two bytes Sean Christopherson
2025-08-14 14:42   ` Peter Zijlstra
2025-09-19 18:16   ` Borislav Petkov
2025-09-19 21:24     ` Sean Christopherson
2025-09-20 10:07       ` Borislav Petkov
2025-09-19 19:55   ` [tip: x86/cpu] " tip-bot2 for Sean Christopherson
2025-08-08 17:23 ` [PATCH 2/3] x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT aliases) Sean Christopherson
2025-08-14 14:42   ` Peter Zijlstra
2025-09-19 19:55   ` [tip: x86/cpu] " tip-bot2 for Sean Christopherson
2025-08-08 17:23 ` [PATCH 3/3] *** DO NOT MERGE *** x86/umip: Lazy person's KUnit test for UMIP emulation Sean Christopherson
2025-09-19 14:31 ` [PATCH 0/3] x86/umip: Fix UMIP insn decoder false positives Sean Christopherson
2025-09-19 14:46   ` Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox