From: "Chang S. Bae" <chang.seok.bae@intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Kiryl Shutsemau <kas@kernel.org>, <kvm@vger.kernel.org>,
	<x86@kernel.org>, <linux-coco@lists.linux.dev>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/7] KVM: x86: APX reg prep work
Date: Thu, 2 Apr 2026 17:05:26 -0700	[thread overview]
Message-ID: <476f4107-1f0a-4ab4-b1d9-d680fa44b70b@intel.com> (raw)
In-Reply-To: <ac72z1cfXnpUmkWv@google.com>

[-- Attachment #1: Type: text/plain, Size: 2641 bytes --]

On 4/2/2026 4:07 PM, Sean Christopherson wrote:
> On Wed, Mar 25, 2026, Chang S. Bae wrote:
>> On 3/12/2026 10:47 AM, Sean Christopherson wrote:
>>> On Thu, Mar 12, 2026, Chang S. Bae wrote:
>>>>
>>>> However, that is sort of what-if scenarios at best. The host kernel still
>>>> manages EGPR context switching through XSAVE. Saving EGPRs into regs[] would
>>>> introduce an oddity to synchronize between two buffers: regs[] and
>>>> gfpu->fpstate, which looks like unnecessary complexity.
>>
>> No, this looks ugly.
> 
> Sorry, you lost me.  What looks ugly?

Oh, that remark was aimed at my own comment above. Keeping regs[] <-> guest 
fpstate in sync would be unnecessarily complex without a clear use case 
(continued below).

>> If guest EGPR state is saved in vcpu->arch.regs[], the APX area there isn't
>> necessary:
>>
>> When the KVM API exposes state in XSAVE format, the frontend can handle this
>> separately. Alongside uABI <-> guest fpstate copy functions, new copy
>> functions may deal with the state between uABI <-> VCPU cache.
>>
>> Further, one could think of exclusion as such:
>>
>> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
>> index 76153dfb58c9..5404f9399eea 100644
>> --- a/arch/x86/kernel/fpu/xstate.c
>> +++ b/arch/x86/kernel/fpu/xstate.c
>> @@ -794,9 +794,10 @@ static u64 __init guest_default_mask(void)
>> {
>> 	/*
>> 	 * Exclude dynamic features, which require userspace opt-in even
>> -	 * for KVM guests.
>> +	 * for KVM guests, and APX as extended general-purpose register
>> +	 * states are saved in the KVM cache separately.
>> 	 */
>> -	return ~(u64)XFEATURE_MASK_USER_DYNAMIC;
>> +	return ~((u64)XFEATURE_MASK_USER_DYNAMIC | XFEATURE_MASK_APX);
>> }
>>
>> But this default bitmask feeds into the permission bits:
>>
>> 	fpu->guest_perm.__state_perm    = guest_default_cfg.features;
>> 	fpu->guest_perm.__state_size    = guest_default_cfg.size;
>>
>> This policy looks clear and sensible: permission is granted only if space is
>> reserved to save the state. If there is a strong desire to save memory, I
>> think it should go through a more thorough review to revisit this policy.
> 
> And I'm lost again.

Here I was entertaining the approach of saving/restoring EGPRs via 
regs[] on VM entry/exit. That raises a couple of follow-up questions:

   1. What about APX area in guest fpstate?
   2. How to support the state for KVM ABI?

This surely departs from the "XSAVE as the single source of truth" model. 
Then:

   1. Leave the APX area in guest fpstate unused.

   2. Copy APX state directly between regs[] and uABI to preserve the
      XSAVE-based ABI, as in the attached diff.

That's all I'm saying.

[-- Attachment #2: kvmapi-apx.diff --]
[-- Type: text/plain, Size: 4470 bytes --]

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index fffbf087937d..b3ab2ac827e6 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -59,6 +59,16 @@ void __init kvm_init_xstate_sizes(void)
 	}
 }
 
+u32 xstate_size(unsigned int xfeature)
+{
+	return xstate_sizes[xfeature].eax;
+}
+
+u32 xstate_offset(unsigned int xfeature)
+{
+	return xstate_sizes[xfeature].ebx;
+}
+
 u32 xstate_required_size(u64 xstate_bv, bool compacted)
 {
 	u32 ret = XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET;
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 039b8e6f40ba..5ace99dd152b 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -64,6 +64,8 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 
 void __init kvm_init_xstate_sizes(void);
 u32 xstate_required_size(u64 xstate_bv, bool compacted);
+u32 xstate_size(unsigned int xfeature);
+u32 xstate_offset(unsigned int xfeature);
 
 int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu);
 int cpuid_query_maxguestphyaddr(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c1e1b3030786..1f064a32b8b7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -108,6 +108,12 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_host);
 #define emul_to_vcpu(ctxt) \
 	((struct kvm_vcpu *)(ctxt)->vcpu)
 
+#ifdef CONFIG_KVM_APX
+#define VCPU_EGPRS_PTR(vcpu)   &(vcpu)->arch.regs[VCPU_REGS_R16]
+#else
+#define VCPU_EGPRS_PTR(vcpu)   NULL
+#endif
+
 /* EFER defaults:
  * - enable syscall per default because its emulated by KVM
  * - enable LME and LMA per default on 64 bit KVM
@@ -5804,10 +5810,33 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
+static void kvm_copy_vcpu_regs_to_uabi(struct kvm_vcpu *vcpu, struct kvm_xsave *uabi_xsave)
+{
+	union fpregs_state *xstate = (union fpregs_state *)uabi_xsave->region;
+	void *uabi_apx = (void *)uabi_xsave->region + xstate_offset(XFEATURE_APX);
+	void *vcpu_egprs = VCPU_EGPRS_PTR(vcpu);
+
+	if (!vcpu_egprs)
+		return;
 
-static int kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu,
-					 u8 *state, unsigned int size)
+	memcpy(uabi_apx, vcpu_egprs, xstate_size(XFEATURE_APX));
+	xstate->xsave.header.xfeatures |= XFEATURE_MASK_APX;
+}
+
+static void kvm_copy_uabi_to_vcpu_regs(struct kvm_vcpu *vcpu, struct kvm_xsave *uabi_xsave)
 {
+	union fpregs_state *xstate = (union fpregs_state *)uabi_xsave->region;
+	void *uabi_apx = (void *)uabi_xsave->region + xstate_offset(XFEATURE_APX);
+	void *vcpu_egprs = VCPU_EGPRS_PTR(vcpu);
+
+	if (vcpu_egprs && xstate->xsave.header.xfeatures & XFEATURE_MASK_APX)
+		memcpy(vcpu_egprs, uabi_apx, xstate_size(XFEATURE_APX));
+}
+
+static int kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu, struct kvm_xsave *guest_xsave,
+					 unsigned int size)
+{
+
 	/*
 	 * Only copy state for features that are enabled for the guest.  The
 	 * state itself isn't problematic, but setting bits in the header for
@@ -5826,15 +5855,23 @@ static int kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu,
 	if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
 		return vcpu->kvm->arch.has_protected_state ? -EINVAL : 0;
 
-	fpu_copy_guest_fpstate_to_uabi(&vcpu->arch.guest_fpu, state, size,
+	/*
+	 * The generic XSAVE copy function zeroes out areas not present in
+	 * guest fpstate. State that lives elsewhere, such as EGPRs, must
+	 * be copied in afterward.
+	 */
+	fpu_copy_guest_fpstate_to_uabi(&vcpu->arch.guest_fpu, guest_xsave->region, size,
 				       supported_xcr0, vcpu->arch.pkru);
+
+	kvm_copy_vcpu_regs_to_uabi(vcpu, guest_xsave);
+
 	return 0;
 }
 
 static int kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
 					struct kvm_xsave *guest_xsave)
 {
-	return kvm_vcpu_ioctl_x86_get_xsave2(vcpu, (void *)guest_xsave->region,
+	return kvm_vcpu_ioctl_x86_get_xsave2(vcpu, guest_xsave,
 					     sizeof(guest_xsave->region));
 }
 
@@ -5853,6 +5890,8 @@ static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
 	 */
 	xstate->xsave.header.xfeatures &= ~vcpu->arch.guest_fpu.fpstate->xfd;
 
+	kvm_copy_uabi_to_vcpu_regs(vcpu, guest_xsave);
+
 	return fpu_copy_uabi_to_guest_fpstate(&vcpu->arch.guest_fpu,
 					      guest_xsave->region,
 					      kvm_caps.supported_xcr0,
@@ -6464,7 +6503,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		if (!u.xsave)
 			break;
 
-		r = kvm_vcpu_ioctl_x86_get_xsave2(vcpu, u.buffer, size);
+		r = kvm_vcpu_ioctl_x86_get_xsave2(vcpu, u.xsave, size);
 		if (r < 0)
 			break;
 
-- 
2.51.0


Thread overview: 34+ messages
2026-03-11  0:33 [PATCH 0/7] KVM: x86: APX reg prep work Sean Christopherson
2026-03-11  0:33 ` [PATCH 1/7] KVM: x86: Add dedicated storage for guest RIP Sean Christopherson
2026-03-11  0:33 ` [PATCH 2/7] KVM: x86: Drop the "EX" part of "EXREG" to avoid collision with APX Sean Christopherson
2026-03-11 18:46   ` Paolo Bonzini
2026-03-11  0:33 ` [PATCH 3/7] KVM: nVMX: Do a bitwise-AND of regs_avail when switching active VMCS Sean Christopherson
2026-03-11  0:33 ` [PATCH 4/7] KVM: x86: Add wrapper APIs to reset dirty/available register masks Sean Christopherson
2026-03-11  2:03   ` Yosry Ahmed
2026-03-11 13:31     ` Sean Christopherson
2026-03-11 18:28       ` Yosry Ahmed
2026-03-11 18:50       ` Paolo Bonzini
2026-03-13  0:38         ` Sean Christopherson
2026-03-11  0:33 ` [PATCH 5/7] KVM: x86: Track available/dirty register masks as "unsigned long" values Sean Christopherson
2026-03-11  0:33 ` [PATCH 6/7] KVM: x86: Use a proper bitmap for tracking available/dirty registers Sean Christopherson
2026-03-11  0:33 ` [PATCH 7/7] *** DO NOT MERGE *** KVM: x86: Pretend that APX is supported on 64-bit kernels Sean Christopherson
2026-03-11 19:01 ` [PATCH 0/7] KVM: x86: APX reg prep work Paolo Bonzini
2026-03-12 16:34   ` Chang S. Bae
2026-03-12 17:47     ` Sean Christopherson
2026-03-12 18:11       ` Andrew Cooper
2026-03-12 18:29         ` Sean Christopherson
2026-03-12 18:33           ` Andrew Cooper
2026-03-25 18:28       ` Chang S. Bae
2026-04-02 23:07         ` Sean Christopherson
2026-04-03  0:05           ` Chang S. Bae [this message]
2026-04-02 23:19   ` Sean Christopherson
2026-04-03 16:03     ` Paolo Bonzini
2026-04-03 22:05       ` Chang S. Bae
2026-04-04  5:16         ` Paolo Bonzini
2026-04-06 15:28           ` Sean Christopherson
2026-04-06 21:41             ` Paolo Bonzini
2026-04-06 22:00               ` Sean Christopherson
2026-04-07  7:18                 ` Paolo Bonzini
2026-04-07 13:20                   ` Sean Christopherson
2026-04-03 16:07     ` Dave Hansen
2026-04-06 15:40       ` Sean Christopherson
