public inbox for kvm@vger.kernel.org
* [Discussion] x86: Guest Support for APX
@ 2025-09-19 20:14 Chang S. Bae
  2025-09-19 22:13 ` Paolo Bonzini
  0 siblings, 1 reply; 3+ messages in thread
From: Chang S. Bae @ 2025-09-19 20:14 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm@vger.kernel.org

Dear KVM maintainers,

We'd like to seek clarification on how to approach guest support for a 
new feature. Specifically, this concerns Advanced Performance Extensions 
(APX). As you might notice, host support was merged in v6.16, and we are 
now working on the KVM side.

At first glance, guest enablement seemed straightforward: advertise 
CPUID, rely on the existing XSAVE infrastructure in the host, and ensure 
that conflicting MPX configurations are rejected.
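
As a rough illustration of the last point, the check below rejects a guest XCR0 value that enables APX together with either MPX component. This is only a sketch: the helper name is hypothetical and is not taken from KVM, and the bit positions (BNDREGS = 3, BNDCSR = 4, APX = 19) are stated as assumptions about the XCR0 layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* XCR0 state-component bits (assumed positions, see the lead-in). */
#define XFEATURE_MASK_BNDREGS	(1ULL << 3)	/* MPX bound registers */
#define XFEATURE_MASK_BNDCSR	(1ULL << 4)	/* MPX config/status */
#define XFEATURE_MASK_APX	(1ULL << 19)	/* APX extended GPRs R16-R31 */

/*
 * Hypothetical helper, not a KVM function: returns true if the given
 * XCR0 value enables APX together with any MPX state component.
 */
static bool guest_xcr0_has_apx_mpx_conflict(uint64_t xcr0)
{
	const uint64_t mpx = XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR;

	return (xcr0 & XFEATURE_MASK_APX) && (xcr0 & mpx);
}
```

A caller validating a guest XSETBV would treat a true return as an invalid combination; where exactly such a check belongs in KVM is left open here.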

Then we noticed your policy statements [1,2] from the discussion
of supervisor CET guest support, which I think make clear the
expectation that a VM should be architecturally compatible before a
feature is exposed to guests.

Since APX introduces new general-purpose registers (GPRs), legacy 
instructions are extended to access them, which may lead to associated 
VM exits. For example, MOV may now reference these registers in MMIO 
operations for emulated devices. The spec [3] lists other instructions 
that may similarly exit.

Now, interpreting your policy in this context, it seems that enabling
APX for guests requires supporting the full set of possible APX-induced exits.

We may proceed with posting an RFC version that emulates all of them and 
gather feedback. But as we internally discussed, we think it would be 
better to clarify the scope up front, if possible, to avoid unnecessary 
churn.

We also noticed another interesting precedent:
MOVDIR64B/MOVDIRI. These instructions can optimize MMIO operations by
bypassing caches, yet KVM emulation does not support them [4]. It is 
unclear if this was a deliberate decision or simply something not 
implemented yet -- picking up the set [5]. If it was intentional, that 
suggests we may need to define a more selective approach to APX 
emulation as well.

In summary, we'd like to clarify:

   * Should we target complete emulation coverage for all APX-induced
     exits (from the start)?

   * Or is a narrower scope (e.g., only MOV) a practical option worth
     considering, given the limited likelihood of other exits?

   * Alternatively, can we even consider a pragmatic path like MOVDIR* --
     supporting instructions only when practically useful?

Thanks for your time and consideration. We'd appreciate your guidance on
this.

Chang

[1] Link: 
https://lore.kernel.org/all/2597a87b-1248-b8ce-ce60-94074bc67ea4@intel.com/

On 8/28/2023 2:00 PM, Dave Hansen wrote:
 > On 8/10/23 08:15, Paolo Bonzini wrote:
 >> On 8/10/23 16:29, Dave Hansen wrote:
 >>> What actual OSes need this support?
 >>
 >> I think Xen could use it when running nested.  But KVM cannot expose
 >> support for CET in CPUID, and at the same time fake support for
 >> MSR_IA32_PL{0,1,2}_SSP (e.g. inject a #GP if it's ever written to a
 >> nonzero value).
 >>
 >> I suppose we could invent our own paravirtualized CPUID bit for
 >> "supervisor IBT works but supervisor SHSTK doesn't".  Linux could check
 >> that but I don't think it's a good idea.
 >>
 >> So... do, or do not.  There is no try. :)
 >
 > Ahh, that makes sense. This is needed for implementing the
 > *architecture*, not because some OS actually wants to _do_ it.

[2] Link: https://lore.kernel.org/all/ZNUETFZK7K5zyr3X@google.com/

On 8/10/2023 8:37 AM, Sean Christopherson wrote:
 >
 > As Paolo alluded to, this is about KVM faithfully emulating the architecture.
 > There is no combination of CPUID bits that allows KVM to advertise SHSTK for
 > userspace without advertising SHSTK for supervisor.
 >
 > Whether or not there are any users in the short term is unfortunately irrelevant
 > from KVM's perspective.

[3] Architecture Specification for Intel APX: Table 3.10: Intel APX
Interactions with Instruction Execution Info or Exit Qualification
Link: https://cdrdv2.intel.com/v1/dl/getContent/784266

[4] The MOVDIR64B opcode is "66 0F 38 F8 ..." but opcode_table[] in
     emulate.c currently appears to be missing it:

         /* 0x60 - 0x67 */
         I(ImplicitOps | Stack | No64, em_pusha),
         I(ImplicitOps | Stack | No64, em_popa),
         N, MD(ModRM, &mode_dual_63),
         N, N, N, N,

[5] 
https://lore.kernel.org/lkml/1541483728-7826-1-git-send-email-jingqi.liu@intel.com/


* Re: [Discussion] x86: Guest Support for APX
  2025-09-19 20:14 [Discussion] x86: Guest Support for APX Chang S. Bae
@ 2025-09-19 22:13 ` Paolo Bonzini
  2025-09-21 22:52   ` Chang S. Bae
  0 siblings, 1 reply; 3+ messages in thread
From: Paolo Bonzini @ 2025-09-19 22:13 UTC (permalink / raw)
  To: Chang S. Bae; +Cc: Sean Christopherson, kvm

On Fri, Sep 19, 2025, 22:16 Chang S. Bae <chang.seok.bae@intel.com> wrote:
> Dear KVM maintainers,
>
> Since APX introduces new general-purpose registers (GPRs), legacy
> instructions are extended to access them, which may lead to associated
> VM exits. For example, MOV may now reference these registers in MMIO
> operations for emulated devices. The spec [3] lists other instructions
> that may similarly exit.

You're right that this gets very complicated quickly, while most cases of
MMIO emulation are for legacy devices and R16-R31 are unlikely to
appear in MMIO instructions for these legacy devices.

However, at least MOVs should be extended to support APX registers as
source or destination operands, and there should also be support for
base and index in the addresses. This means you have to parse REX2,
but EVEX shouldn't be needed as these instructions are in "legacy map
0" (aka one-byte).

At this point, singling out MOVs is not useful and you might as well
implement REX2 for all instructions.  EVEX adds a lot of extra cases
including three operand integer instructions and no flag update, but
REX2 is relatively simple.
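
To make the "relatively simple" part concrete, here is a minimal sketch of REX2 parsing. It assumes the layout from the APX specification: a 0xD5 byte (REX2 only in 64-bit mode; the byte is AAD elsewhere) followed by a payload byte holding M0 R4 X4 B4 W R3 X3 B3 from bit 7 down to bit 0. The struct and function names are hypothetical and are not taken from KVM's emulator.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REX2_PREFIX 0xd5	/* REX2 in 64-bit mode only */

/* Decoded REX2 payload: M0 R4 X4 B4 W R3 X3 B3 (bit 7 .. bit 0). */
struct rex2 {
	bool map1;	/* M0: false = legacy map 0 (one-byte), true = map 1 (0F) */
	bool w;		/* W: 64-bit operand size, like REX.W */
	uint8_t r;	/* R4:R3, extends ModRM.reg */
	uint8_t x;	/* X4:X3, extends SIB.index */
	uint8_t b;	/* B4:B3, extends ModRM.rm or SIB.base */
};

/* Hypothetical helper: parse a REX2 prefix at the start of insn[]. */
static bool rex2_parse(const uint8_t *insn, size_t len, struct rex2 *out)
{
	uint8_t p;

	if (len < 2 || insn[0] != REX2_PREFIX)
		return false;

	p = insn[1];
	out->map1 = (p & 0x80) != 0;
	out->w    = (p & 0x08) != 0;
	out->r    = (uint8_t)((((p >> 6) & 1) << 1) | ((p >> 2) & 1));
	out->x    = (uint8_t)((((p >> 5) & 1) << 1) | ((p >> 1) & 1));
	out->b    = (uint8_t)((((p >> 4) & 1) << 1) | (p & 1));
	return true;
}

/* Combine a 2-bit extension with a 3-bit ModRM/SIB field into 0..31. */
static unsigned int rex2_reg(uint8_t ext, uint8_t field3)
{
	return ((unsigned int)ext << 3) | (field3 & 7u);
}
```

For the byte sequence d5 48 89 03, the payload 0x48 sets R4 and W, so ModRM.reg = 0 selects R16 as the source while ModRM.rm = 3 still selects RBX as the base. The same rex2_reg() combination covers base and index registers via the b and x fields.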

> In summary, we'd like to clarify:
>
>    * Should we target complete emulation coverage for all APX-induced
>      exits (from the start)?
>
>    * Or is a narrower scope (e.g., only MOV) a practical option worth
>      considering, given the limited likelihood of other exits?

See above. I hope it answers both questions.

>    * Alternatively, can we even consider a pragmatic path like MOVDIR* --
>      supporting instructions only when practically useful?

I think being pragmatic is fine, but in some cases being too restrictive
makes it harder to track what is implemented and what isn't. Again, see the
above comment about implementing REX2 fully while limiting EVEX
support to the minimum (or hopefully leaving it out altogether).

> [4] The MOVDIR64B opcode is "66 0F 38 F8 ..." but opcode_table[] in
>      emulate.c currently appears to be missing it:
>
>          /* 0x60 - 0x67 */
>          I(ImplicitOps | Stack | No64, em_pusha),
>          I(ImplicitOps | Stack | No64, em_popa),
>          N, MD(ModRM, &mode_dual_63),
>          N, N, N, N,

0x66 is a prefix so you have to look at F8 in the table for the 0F 38
three-byte opcodes (opcode_map_0f_38) and add a new
three_byte_0f_38_f8 table.
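
The lookup order described above can be sketched as follows. This is a simplified, hypothetical classifier and not KVM's decoder: it only recognizes the 0x66 prefix and the 0F, 0F 38 and 0F 3A escapes, and the enum, struct and function names are made up for illustration. It shows why "66 0F 38 F8" ends up as entry F8 of the 0F 38 map rather than entry 0x66 of the one-byte table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum opcode_map { MAP_ONE_BYTE, MAP_0F, MAP_0F_38, MAP_0F_3A };

struct opcode_loc {
	enum opcode_map map;	/* which opcode table to consult */
	uint8_t opcode;		/* index into that table */
	bool opsize_prefix;	/* a 0x66 prefix was seen */
};

/*
 * Hypothetical helper: skip 0x66 prefixes and escape bytes, then report
 * the map and opcode byte. Other legacy prefixes, REX, REX2, VEX and
 * EVEX are deliberately ignored in this sketch.
 */
static bool locate_opcode(const uint8_t *insn, size_t len,
			  struct opcode_loc *loc)
{
	size_t i = 0;

	loc->opsize_prefix = false;
	while (i < len && insn[i] == 0x66) {
		loc->opsize_prefix = true;
		i++;
	}

	loc->map = MAP_ONE_BYTE;
	if (i < len && insn[i] == 0x0f) {
		loc->map = MAP_0F;
		i++;
		if (i < len && (insn[i] == 0x38 || insn[i] == 0x3a)) {
			loc->map = insn[i] == 0x38 ? MAP_0F_38 : MAP_0F_3A;
			i++;
		}
	}

	if (i >= len)
		return false;	/* ran out of bytes before the opcode */

	loc->opcode = insn[i];
	return true;
}
```

With the bytes 66 0f 38 f8 02, the sketch reports the 0F 38 map, opcode 0xf8, and the 0x66 prefix flag set, which is the combination a new F8 entry would need to key on.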

MOVDIR* and many other instructions are not implemented because they
are pretty much never used with emulated (legacy) MMIO such as VGA
framebuffers. By the way, MOVDIR* is not a REX2-accepted instruction,
so you would have to implement EVEX in order to support it for APX
registers.

Paolo



* Re: [Discussion] x86: Guest Support for APX
  2025-09-19 22:13 ` Paolo Bonzini
@ 2025-09-21 22:52   ` Chang S. Bae
  0 siblings, 0 replies; 3+ messages in thread
From: Chang S. Bae @ 2025-09-21 22:52 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, kvm

On 9/19/2025 3:13 PM, Paolo Bonzini wrote:
> 
> You're right that this gets very complicated quickly, while most cases of
> MMIO emulation are for legacy devices and R16-R31 are unlikely to
> appear in MMIO instructions for these legacy devices.
> 
> However, at least MOVs should be extended to support APX registers as
> source or destination operands, and there should also be support for
> base and index in the addresses. This means you have to parse REX2,
> but EVEX shouldn't be needed as these instructions are in "legacy map
> 0" (aka one-byte).
> 
> At this point, singling out MOVs is not useful and you might as well
> implement REX2 for all instructions.  EVEX adds a lot of extra cases
> including three operand integer instructions and no flag update, but
> REX2 is relatively simple.
> ...
> I think being pragmatic is fine, but in some cases being too restrictive
> makes it harder to track what is implemented and what isn't. Again, see the
> above comment about implementing REX2 fully while limiting EVEX
> support to the minimum (or hopefully leaving it out altogether).

Thanks for the guidance. This makes sense to me.

I think the high-level direction is clear now. I'll prepare and post an 
RFC series, once it's ready, to walk through the details.

Thanks,
Chang

