The Linux Kernel Mailing List
From: Kiryl Shutsemau <kirill@shutemov.name>
To: David Laight <david.laight.linux@gmail.com>,
	 Sean Christopherson <seanjc@google.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>,
	 Thomas Gleixner <tglx@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	 x86@kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	 Kuppuswamy Sathyanarayanan
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	Kai Huang <kai.huang@intel.com>,
	 Xiaoyao Li <xiaoyao.li@intel.com>,
	Rick Edgecombe <rick.p.edgecombe@intel.com>,
	 Binbin Wu <binbin.wu@linux.intel.com>,
	Andi Kleen <ak@linux.intel.com>, Dan Williams <djbw@kernel.org>,
	 Borys Tsyrulnikov <tsyrulnikov.borys@gmail.com>,
	kvm@vger.kernel.org, linux-coco@lists.linux.dev,
	 linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH v5 2/3] x86/insn-eval: Add insn_assign_reg() helper
Date: Thu, 2 Jul 2026 16:30:04 +0100
Message-ID: <akaCzNRGVy5Xr-bG@thinkstation>
In-Reply-To: <20260701180033.6e9c07aa@pumpkin>

On Wed, Jul 01, 2026 at 06:00:33PM +0100, David Laight wrote:
> On Wed, 1 Jul 2026 07:59:05 -0700
> Sean Christopherson <seanjc@google.com> wrote:
> 
> > On Wed, Jul 01, 2026, Kiryl Shutsemau wrote:
> > > From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
> > > 
> > > KVM's instruction emulator has a small helper, assign_register(), that
> > > writes a value into a sub-register with x86 partial-register-write
> > > semantics: 1- and 2-byte writes leave the upper bits of the destination
> > > untouched, 4-byte writes zero-extend to 64 bits, 8-byte writes overwrite
> > > the full register.
> > > 
> > > The TDX guest #VE handler needs the same logic for port I/O emulation
> > > to get 32-bit zero-extension right.  Rather than copy-pasting the
> > > helper, lift it to <asm/insn-eval.h> as insn_assign_reg() so both can
> > > use it.
> > > 
> > > Add <asm/insn.h> to the header's includes so it builds standalone in
> > > callers that have not pulled it in transitively.
> > > 
> > > No functional change.
> > > 
> > > Signed-off-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
> > > Cc: stable@vger.kernel.org # prerequisite for the following 32-bit port I/O zero-extension fix
> > > ---
> > >  arch/x86/include/asm/insn-eval.h | 30 ++++++++++++++++++++++++++++++
> > >  arch/x86/kvm/emulate.c           | 26 ++++----------------------
> > >  2 files changed, 34 insertions(+), 22 deletions(-)
> > > 
> > > diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
> > > index 4733e9064ee5..0c87759816d3 100644
> > > --- a/arch/x86/include/asm/insn-eval.h
> > > +++ b/arch/x86/include/asm/insn-eval.h
> > > @@ -9,6 +9,7 @@
> > >  #include <linux/compiler.h>
> > >  #include <linux/bug.h>
> > >  #include <linux/err.h>
> > > +#include <asm/insn.h>
> > >  #include <asm/ptrace.h>
> > >  
> > >  #define INSN_CODE_SEG_ADDR_SZ(params) ((params >> 4) & 0xf)
> > > @@ -46,4 +47,33 @@ enum insn_mmio_type insn_decode_mmio(struct insn *insn, int *bytes);
> > >  
> > >  bool insn_is_nop(struct insn *insn);
> > >  
> > > +/*
> > > + * Write @val into *@reg with x86 partial-register-write semantics: a 1-
> > > + * or 2-byte write leaves the upper bits of the destination untouched; a
> > > + * 4-byte write zero-extends to 64 bits (matching IN[BWL], MOV[BWL]  
> > 
> > The placement of the "(matching IN[BWL], MOV[BWL] etc.)" blurb is confusing.  I
> > *think* you're trying to say this behavior matches that of MOVB, MOVW, and MOVL
> > instruction mnemonics, but the blurb is buried in the snippet that specifically
> > describes the 4-byte write behavior.
> > 
> > FWIW, I think giving examples does more harm than good, because the behavior isn't
> > instruction specific, it's architectural behavior that applies to all writes to
> > GPRs, as defined in "3.4.1.1 General-Purpose Registers in 64-Bit Mode".  E.g. for
> > a MOV instruction that sign-extends a 32-bit immediate to a 64-bit register, it's
> > not that the instruction is exempt from the normal GPR semantics, it's that the
> > instruction performs a 64-bit write to the destination even though the source is
> > only 32 bits.
> > 
> > And the B/W/L terminology isn't architectural, it's AT&T syntax.

Agreed.  Dropped the IN[BWL]/MOV[BWL] examples and reworded the comment
to describe architectural GPR-write behaviour with a pointer to the SDM
section instead.  I also spelled out that @bytes is the width of the
write, not a property of the instruction, to cover the sign-extending
MOV case you raised.
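
To make the "@bytes is the width of the write" point concrete, here is a
small user-space sketch.  It is not the kernel helper: the sketch_*
names are hypothetical, uint64_t stands in for unsigned long/u64, a
little-endian host is assumed, and the narrow cases use memcpy() rather
than pointer casts so that it stays valid under strict aliasing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * User-space stand-in for insn_assign_reg() (hypothetical name).
 * Assumes a little-endian host, as on x86.
 */
void sketch_assign_reg(uint64_t *reg, uint64_t val, int bytes)
{
	uint8_t v8 = (uint8_t)val;
	uint16_t v16 = (uint16_t)val;

	switch (bytes) {
	case 1:
		memcpy(reg, &v8, sizeof(v8));	/* bits 63:8 untouched */
		break;
	case 2:
		memcpy(reg, &v16, sizeof(v16));	/* bits 63:16 untouched */
		break;
	case 4:
		*reg = (uint32_t)val;		/* bits 63:32 become zero */
		break;
	case 8:
		*reg = val;
		break;
	}
}

/*
 * Hypothetical emulation of a MOV that sign-extends a 32-bit immediate
 * into a 64-bit register: the source is 32 bits wide, but the write is
 * 64 bits wide, so the width passed down is 8.
 */
void sketch_mov_sx_imm32(uint64_t *reg, int32_t imm)
{
	sketch_assign_reg(reg, (uint64_t)(int64_t)imm, 8);
}

/* Exercise each width against a register preloaded with a pattern. */
void sketch_demo(void)
{
	uint64_t reg = 0x1122334455667788ULL;

	sketch_assign_reg(&reg, 0xAB, 1);
	assert(reg == 0x11223344556677ABULL);

	sketch_assign_reg(&reg, 0xCDEF, 2);
	assert(reg == 0x112233445566CDEFULL);

	sketch_assign_reg(&reg, 0xFFFFFFFEULL, 4);
	assert(reg == 0x00000000FFFFFFFEULL);

	sketch_mov_sx_imm32(&reg, -2);
	assert(reg == 0xFFFFFFFFFFFFFFFEULL);
}
```

The last two asserts show the distinction: the same 32-bit pattern ends
up zero-extended when the write is 4 bytes wide and sign-extended when
the caller performs the extension itself and writes 8 bytes.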

> > E.g. trying
> > to encode "movl" with NASM yields "error: instruction expected, found `movl dword'".
> > Yes, the kernel uses AT&T syntax for assembly, but I think this helper should very
> > explicitly document that it's emulating architectural behavior.
> > 
> > > + * etc.); an 8-byte write overwrites the full register.
> > > + *
> > > + * @reg need not be 8-byte aligned: KVM's instruction emulator points
> > > + * into the middle of a register slot to address the high-byte
>                  ^ it isn't really the 'middle'.

Reworded to "offsets the pointer by one byte".
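
As an illustration of what "offsets the pointer by one byte" means, a
user-space sketch follows.  The sketch_* names are hypothetical and a
little-endian host is assumed, so bits 15:8 of the slot live at byte
offset 1; this is only meant to show the addressing idea, not KVM's
actual register-decoding code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: pointer to the high-byte register (AH, CH, DH,
 * BH) inside a 64-bit register slot.  On a little-endian host bits
 * 15:8 are stored at byte offset 1.
 */
uint8_t *sketch_high_byte_ptr(uint64_t *slot)
{
	return (uint8_t *)slot + 1;
}

/* 1-byte store through the offset pointer; no other byte is touched. */
void sketch_write_high_byte(uint64_t *slot, uint8_t val)
{
	*sketch_high_byte_ptr(slot) = val;
}

/* Read bits 15:8 back arithmetically, independent of the pointer. */
uint8_t sketch_read_high_byte(uint64_t slot)
{
	return (uint8_t)(slot >> 8);
}

/* Writing "AH" changes only bits 15:8 of the slot. */
void sketch_high_byte_demo(void)
{
	uint64_t rax = 0x1122334455667788ULL;

	sketch_write_high_byte(&rax, 0xAB);
	assert(rax == 0x112233445566AB88ULL);
	assert(sketch_read_high_byte(rax) == 0xAB);
}
```

Because the pointer is one byte past the start of the slot, only a
store that is exactly one byte wide is guaranteed to stay inside the
high-byte register.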

> > > + * registers (AH, CH, DH, BH).  Use narrow stores for the sub-word
> > > + * cases so that the access width matches @bytes.
> > > + */
> > > +static inline void insn_assign_reg(unsigned long *reg, u64 val, int bytes)
> > > +{
> > > +	switch (bytes) {
> > > +	case 1:
> > > +		*(u8 *)reg = (u8)val;
> > > +		break;
> > > +	case 2:
> > > +		*(u16 *)reg = (u16)val;
> > > +		break;
> > > +	case 4:
> > > +		*reg = (u32)val;  
> > 
> > IMO, it's worth keeping a short comment here, because even with the explanation
> > above, I suspect most people will think the code is buggy.  E.g.
> > 
> > 		/* As above, zero-extend 4-byte writes on 64-bit CPUs. */
> > 		*reg = (u32)val;

Added on the 4-byte case, slightly reworded.

> Or be even more specific and use '& 0xffffffff' rather than a cast.
> Particularly since the casts of the RHS in the byte/short cases aren't
> needed at all.

I'd rather keep the body exactly as KVM has it today.  This is now a
straight move + rename with no functional change, and the v4 attempt to
rewrite it with arithmetic is precisely what introduced the AH/CH/DH/BH
clobber Sashiko flagged.  Tidying the casts turns it back into a rewrite
and diverges from the form KVM has shipped for years.  Feel free to
submit a separate cleanup on top if you feel strongly.
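
For anyone following the cast-versus-mask point, the spellings under
discussion produce the same values.  A user-space sketch with
hypothetical sketch_* names, using uint64_t in place of u64:

```c
#include <assert.h>
#include <stdint.h>

/* 4-byte case as the helper spells it: truncate via a cast. */
uint64_t sketch_zext32_cast(uint64_t val)
{
	return (uint32_t)val;
}

/* Same result with an explicit mask, as suggested in review. */
uint64_t sketch_zext32_mask(uint64_t val)
{
	return val & 0xffffffff;
}

/* Byte store with the explicit cast on the right-hand side. */
uint8_t sketch_store8_cast(uint64_t val)
{
	uint8_t dst = (uint8_t)val;

	return dst;
}

/* Byte store relying on implicit conversion to the destination type. */
uint8_t sketch_store8_implicit(uint64_t val)
{
	uint8_t dst = val;

	return dst;
}

/* Both pairs agree for an arbitrary 64-bit pattern. */
void sketch_cast_demo(void)
{
	uint64_t val = 0xDEADBEEFCAFEF00DULL;

	assert(sketch_zext32_cast(val) == 0x00000000CAFEF00DULL);
	assert(sketch_zext32_cast(val) == sketch_zext32_mask(val));
	assert(sketch_store8_cast(val) == sketch_store8_implicit(val));
}
```

So the choice between them is about readability, not behaviour, which
is why it fits better as a follow-up cleanup than as part of a move.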

Updated patch below; I'll fold it into v6.

-- >8 --
Subject: [PATCH] x86/insn-eval: Move assign_register() out of KVM as insn_assign_reg()

KVM's instruction emulator has a small helper, assign_register(), that
writes a value into a register following the x86 rules for writes to
general-purpose registers: an 8- or 16-bit write leaves the rest of the
register untouched, a 32-bit write zero-extends the result to 64 bits,
and a 64-bit write replaces the whole register.

The TDX guest #VE handler needs the same logic for port I/O emulation
to get 32-bit zero-extension right.  Rather than add a third copy of
the same switch, move the helper verbatim to <asm/insn-eval.h>, rename
it to insn_assign_reg(), and route KVM's callers through it.

Add <asm/insn.h> to the header's includes so it builds standalone in
callers that have not pulled it in transitively.

No functional change.

Signed-off-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Cc: stable@vger.kernel.org # prerequisite for the following 32-bit port I/O zero-extension fix
---
 arch/x86/include/asm/insn-eval.h | 36 ++++++++++++++++++++++++++++++++
 arch/x86/kvm/emulate.c           | 26 ++++-------------------
 2 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 4733e9064ee5..ae05647a0afb 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -9,6 +9,7 @@
 #include <linux/compiler.h>
 #include <linux/bug.h>
 #include <linux/err.h>
+#include <asm/insn.h>
 #include <asm/ptrace.h>

 #define INSN_CODE_SEG_ADDR_SZ(params) ((params >> 4) & 0xf)
@@ -46,4 +47,39 @@ enum insn_mmio_type insn_decode_mmio(struct insn *insn, int *bytes);

 bool insn_is_nop(struct insn *insn);

+/*
+ * Write @val into *@reg following the x86 rules for writes to
+ * general-purpose registers (Intel SDM Vol. 1, "General-Purpose
+ * Registers in 64-Bit Mode"): an 8- or 16-bit write leaves the rest of
+ * the register untouched, a 32-bit write zero-extends the result to 64
+ * bits, and a 64-bit write replaces the whole register.
+ *
+ * @bytes is the width of the write, not a property of the instruction:
+ * an instruction that, say, sign-extends a 32-bit immediate into a
+ * 64-bit register does a 64-bit write here.
+ *
+ * @reg need not be 8-byte aligned: KVM's instruction emulator offsets
+ * the pointer by one byte to address the high-byte registers (AH, CH,
+ * DH, BH).  Use narrow stores for the sub-word cases so the access
+ * width matches @bytes and the adjacent bytes are left alone.
+ */
+static inline void insn_assign_reg(unsigned long *reg, u64 val, int bytes)
+{
+	switch (bytes) {
+	case 1:
+		*(u8 *)reg = (u8)val;
+		break;
+	case 2:
+		*(u16 *)reg = (u16)val;
+		break;
+	case 4:
+		/* A 32-bit write zeroes the upper 32 bits of the register. */
+		*reg = (u32)val;
+		break;
+	case 8:
+		*reg = val;
+		break;
+	}
+}
+
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index b566ab5c7515..c6dcb5ac48af 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -24,6 +24,7 @@
 #include "kvm_emulate.h"
 #include <linux/stringify.h>
 #include <asm/debugreg.h>
+#include <asm/insn-eval.h>
 #include <asm/nospec-branch.h>
 #include <asm/ibt.h>
 #include <asm/text-patching.h>
@@ -439,25 +440,6 @@ static void assign_masked(ulong *dest, ulong src, ulong mask)
 	*dest = (*dest & ~mask) | (src & mask);
 }

-static void assign_register(unsigned long *reg, u64 val, int bytes)
-{
-	/* The 4-byte case *is* correct: in 64-bit mode we zero-extend. */
-	switch (bytes) {
-	case 1:
-		*(u8 *)reg = (u8)val;
-		break;
-	case 2:
-		*(u16 *)reg = (u16)val;
-		break;
-	case 4:
-		*reg = (u32)val;
-		break;	/* 64b: zero-extend */
-	case 8:
-		*reg = val;
-		break;
-	}
-}
-
 static inline unsigned long ad_mask(struct x86_emulate_ctxt *ctxt)
 {
 	return (1UL << (ctxt->ad_bytes << 3)) - 1;
@@ -505,7 +487,7 @@ register_address_increment(struct x86_emulate_ctxt *ctxt, int reg, int inc)
 {
 	ulong *preg = reg_rmw(ctxt, reg);

-	assign_register(preg, *preg + inc, ctxt->ad_bytes);
+	insn_assign_reg(preg, *preg + inc, ctxt->ad_bytes);
 }

 static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
@@ -1767,7 +1749,7 @@ static int load_segment_descriptor(struct x86_emulate_ctxt *ctxt,

 static void write_register_operand(struct operand *op)
 {
-	return assign_register(op->addr.reg, op->val, op->bytes);
+	return insn_assign_reg(op->addr.reg, op->val, op->bytes);
 }

 static int writeback(struct x86_emulate_ctxt *ctxt, struct operand *op)
@@ -2008,7 +1990,7 @@ static int em_popa(struct x86_emulate_ctxt *ctxt)
 		rc = emulate_pop(ctxt, &val, ctxt->op_bytes);
 		if (rc != X86EMUL_CONTINUE)
 			break;
-		assign_register(reg_rmw(ctxt, reg), val, ctxt->op_bytes);
+		insn_assign_reg(reg_rmw(ctxt, reg), val, ctxt->op_bytes);
 		--reg;
 	}
 	return rc;

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

Thread overview: 7+ messages
2026-07-01 11:05 [PATCH v5 0/3] x86/tdx: Fix port I/O handling bugs Kiryl Shutsemau
2026-07-01 11:05 ` [PATCH v5 1/3] x86/tdx: Fix off-by-one in port I/O handling Kiryl Shutsemau
2026-07-01 11:05 ` [PATCH v5 2/3] x86/insn-eval: Add insn_assign_reg() helper Kiryl Shutsemau
2026-07-01 14:59   ` Sean Christopherson
2026-07-01 17:00     ` David Laight
2026-07-02 15:30       ` Kiryl Shutsemau [this message]
2026-07-01 11:05 ` [PATCH v5 3/3] x86/tdx: Fix zero-extension for 32-bit port I/O Kiryl Shutsemau
