* Re: [PATCH kvm-unit-tests] xsave: add testcase for emulation of AVX instructions
2025-11-14 0:32 [PATCH kvm-unit-tests] xsave: add testcase for emulation of AVX instructions Paolo Bonzini
@ 2025-11-21 18:05 ` Sean Christopherson
From: Sean Christopherson @ 2025-11-21 18:05 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm, kbusch
On Thu, Nov 13, 2025, Paolo Bonzini wrote:
> Companion patch to the emulator changes in KVM. Funnily enough,
> no valid combination involving AVX was tried.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> x86/xsave.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/x86/xsave.c b/x86/xsave.c
> index cc8e3a0a..e6d15938 100644
> --- a/x86/xsave.c
> +++ b/x86/xsave.c
> @@ -15,6 +15,34 @@
> #define XSTATE_SSE 0x2
> #define XSTATE_YMM 0x4
>
> +char __attribute__((aligned(32))) v32_1[32] = {
> + 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
> + 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143,
> +};
> +char __attribute__((aligned(32))) v32_2[32] = { 0 };
> +
> +static __attribute__((target("avx"))) void
> +test_avx_fep(void)
Bizarre wrap.
> +{
> + asm volatile("vzeroall\n"
Why zero all registers? I can't think of any reason why VZEROALL is needed, and
clobbering everything for no apparent reason is weird/confusing.
> + KVM_FEP "vmovdqa v32_1, %%ymm0\n"
These need to use RIP-relative addressing, otherwise EFI builds fail.
> + KVM_FEP "vmovdqa %%ymm0, v32_2\n" : : :
> + "memory",
> + "%ymm0", "%ymm1", "%ymm2", "%ymm3", "%ymm4", "%ymm5", "%ymm6", "%ymm7",
> + "%ymm8", "%ymm9", "%ymm10", "%ymm11", "%ymm12", "%ymm13", "%ymm14", "%ymm15");
> +}
> +
> +static __attribute__((target("avx"))) void
> +test_avx(void)
> +{
> + asm volatile("vzeroall\n"
> + "vmovdqa v32_1, %%ymm0\n"
> + "vmovdqa %%ymm0, v32_2\n" : : :
We should also test reg=>reg, registers other than ymm0, and that emulated accesses
are correctly propagated to hardware. E.g. with macro shenanigans, it's not too
hard:
asm volatile(FEP "vmovdqa v32_1(%%rip), %%" #reg1 "\n" \
FEP "vmovdqa %%" #reg1 ", %%" #reg2 "\n" \
FEP "vmovdqa %%" #reg2 ", v32_2(%%rip)\n" \
"vmovdqa %%" #reg2 ", v32_3(%%rip)\n" \
::: "memory", #reg1, #reg2); \
> + "memory",
> + "%ymm0", "%ymm1", "%ymm2", "%ymm3", "%ymm4", "%ymm5", "%ymm6", "%ymm7",
> + "%ymm8", "%ymm9", "%ymm10", "%ymm11", "%ymm12", "%ymm13", "%ymm14", "%ymm15");
> +}
> +
> static void test_xsave(void)
> {
> unsigned long cr4;
> @@ -45,7 +73,22 @@ static void test_xsave(void)
> report(xsetbv_safe(XCR_XFEATURE_ENABLED_MASK, test_bits) == 0,
> "\t\txsetbv(XCR_XFEATURE_ENABLED_MASK, XSTATE_FP | XSTATE_SSE)");
> report(xgetbv_safe(XCR_XFEATURE_ENABLED_MASK, &xcr0) == 0,
> - " xgetbv(XCR_XFEATURE_ENABLED_MASK)");
> + "\t\txgetbv(XCR_XFEATURE_ENABLED_MASK)");
Ugh, this test is ancient and crusty. We really should have dedicated helpers
for accessing XCR0; the test also uses spaces instead of tabs, has unnecessary
code duplication, and a handful of other minor issues.
I'll send a series with a pile of cleanups, and better validation of VMOVDQA
emulation.