qemu-devel.nongnu.org archive mirror
* boot failure on top of current git
@ 2025-07-16 14:44 Paolo Abeni
  2025-07-16 15:22 ` Paolo Bonzini
  0 siblings, 1 reply; 8+ messages in thread
From: Paolo Abeni @ 2025-07-16 14:44 UTC (permalink / raw)
  To: Paolo Bonzini, Zhao Liu; +Cc: qemu-devel

Hi,

I'm observing boot failure for a rhel-9.7 VM. I'm using qemu git tree at
commit c079d3a31e.

My local conf is:

./configure --enable-kvm --enable-lto --target-list=x86_64-softmmu \
 --enable-numa --enable-curses --enable-vhost-net

and the qemu command line:

./build/qemu-system-x86_64 -smp 4 -enable-kvm -m 4G \
 -hda "rhel9.7-20250506.3-image.qcow2" \
 -netdev tap,id=nd0,vhostforce=on,vhost=on,ifname=tun0,script=no,downscript=no \
 -device virtio-net-pci,netdev=nd0 \
 -chardev stdio,id=char0,mux=on,logfile=serial.log,signal=off \
 -serial chardev:char0 -mon chardev=char0 \
 -cpu host \
 -D qemu.log \
 -name "raw qemu"

it core dumps with the following error:

qemu-system-x86_64: ../target/i386/kvm/kvm-cpu.c:149:
kvm_cpu_xsave_init: Assertion `esa->size == eax' failed.

Dumbly and blindly reverting 29f1ba338baf60a9e455b6fdc37489ca1efe25aa
and 5f158abef44c7e0945fc5f76715ef135a9bf9bd2 solves the problem for me.

Is that a known problem?

Paolo




* Re: boot failure on top of current git
  2025-07-16 14:44 boot failure on top of current git Paolo Abeni
@ 2025-07-16 15:22 ` Paolo Bonzini
  2025-07-16 15:26   ` Paolo Abeni
  2025-07-16 16:13   ` Zhao Liu
  0 siblings, 2 replies; 8+ messages in thread
From: Paolo Bonzini @ 2025-07-16 15:22 UTC (permalink / raw)
  To: Paolo Abeni, Zhao Liu; +Cc: qemu-devel

On 7/16/25 16:44, Paolo Abeni wrote:
> Hi,
> 
> I'm observing boot failure for a rhel-9.7 VM. I'm using qemu git tree at
> commit c079d3a31e.

No and I cannot reproduce it.

What host is it (processor) and kernel version?

Paolo

> it core dumps with the following error:
> 
> qemu-system-x86_64: ../target/i386/kvm/kvm-cpu.c:149:
> kvm_cpu_xsave_init: Assertion `esa->size == eax' failed.
> 
> Dumbly and blindly reverting 29f1ba338baf60a9e455b6fdc37489ca1efe25aa
> and 5f158abef44c7e0945fc5f76715ef135a9bf9bd2 solves the problem for me.
> 
> Is that a known problem?
> 
> Paolo
> 
> 
> 




* Re: boot failure on top of current git
  2025-07-16 15:22 ` Paolo Bonzini
@ 2025-07-16 15:26   ` Paolo Abeni
  2025-07-16 15:31     ` Paolo Abeni
  2025-07-16 15:39     ` Paolo Bonzini
  2025-07-16 16:13   ` Zhao Liu
  1 sibling, 2 replies; 8+ messages in thread
From: Paolo Abeni @ 2025-07-16 15:26 UTC (permalink / raw)
  To: Paolo Bonzini, Zhao Liu; +Cc: qemu-devel

On 7/16/25 5:22 PM, Paolo Bonzini wrote:
> On 7/16/25 16:44, Paolo Abeni wrote:
>> I'm observing boot failure for a rhel-9.7 VM. I'm using qemu git tree at
>> commit c079d3a31e.
> 
> No and I cannot reproduce it.
> 
> What host is it (processor) and kernel version?

The host CPU is an AMD EPYC 7302 16-Core Processor; the hypervisor is
running a ~current net-next kernel (v6.16.0-rc5 plus new net-next features
for 6.17).

/P




* Re: boot failure on top of current git
  2025-07-16 15:26   ` Paolo Abeni
@ 2025-07-16 15:31     ` Paolo Abeni
  2025-07-16 15:39     ` Paolo Bonzini
  1 sibling, 0 replies; 8+ messages in thread
From: Paolo Abeni @ 2025-07-16 15:31 UTC (permalink / raw)
  To: Paolo Bonzini, Zhao Liu; +Cc: qemu-devel

On 7/16/25 5:26 PM, Paolo Abeni wrote:
> On 7/16/25 5:22 PM, Paolo Bonzini wrote:
>> On 7/16/25 16:44, Paolo Abeni wrote:
>>> I'm observing boot failure for a rhel-9.7 VM. I'm using qemu git tree at
>>> commit c079d3a31e.
>>
>> No and I cannot reproduce it.
>>
>> What host is it (processor) and kernel version?
> 
> The host CPU is an AMD EPYC 7302 16-Core Processor; the hypervisor is
> running a ~current net-next kernel (v6.16.0-rc5 plus new net-next features
> for 6.17).

I'm sorry, I should have waited a bit and added that I can observe the
same failure even when the hypervisor is running kernel
5.14.0-576.el9.x86_64.

/P




* Re: boot failure on top of current git
  2025-07-16 15:26   ` Paolo Abeni
  2025-07-16 15:31     ` Paolo Abeni
@ 2025-07-16 15:39     ` Paolo Bonzini
  2025-07-16 16:04       ` Paolo Abeni
  1 sibling, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2025-07-16 15:39 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Zhao Liu, qemu-devel

On Wed, Jul 16, 2025 at 5:26 PM Paolo Abeni <pabeni@redhat.com> wrote:
> On 7/16/25 5:22 PM, Paolo Bonzini wrote:
> > On 7/16/25 16:44, Paolo Abeni wrote:
> >> I'm observing boot failure for a rhel-9.7 VM. I'm using qemu git tree at
> >> commit c079d3a31e.
> >
> > No and I cannot reproduce it.
> >
> > What host is it (processor) and kernel version?
>
> The host CPU is an AMD EPYC 7302 16-Core Processor; the hypervisor is
> running a ~current net-next kernel (v6.16.0-rc5 plus new net-next features
> for 6.17).

Hmm, I have an AMD EPYC 7313 with a 6.15.4 kernel, but I will check the
one you gave in the other message. Can you check whether

  ./qemu-system-x86_64 -cpu host -accel kvm -smp 4

is enough to reproduce, or whether a real guest is needed?

Paolo




* Re: boot failure on top of current git
  2025-07-16 15:39     ` Paolo Bonzini
@ 2025-07-16 16:04       ` Paolo Abeni
  0 siblings, 0 replies; 8+ messages in thread
From: Paolo Abeni @ 2025-07-16 16:04 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Zhao Liu, qemu-devel

On 7/16/25 5:39 PM, Paolo Bonzini wrote:
> On Wed, Jul 16, 2025 at 5:26 PM Paolo Abeni <pabeni@redhat.com> wrote:
>> On 7/16/25 5:22 PM, Paolo Bonzini wrote:
>>> On 7/16/25 16:44, Paolo Abeni wrote:
>>>> I'm observing boot failure for a rhel-9.7 VM. I'm using qemu git tree at
>>>> commit c079d3a31e.
>>>
>>> No and I cannot reproduce it.
>>>
>>> What host is it (processor) and kernel version?
>>
>> The host CPU is an AMD EPYC 7302 16-Core Processor; the hypervisor is
>> running a ~current net-next kernel (v6.16.0-rc5 plus new net-next features
>> for 6.17).
> 
> Hmm, I have an AMD EPYC 7313 with a 6.15.4 kernel, but I will check the
> one you gave in the other message. Can you check whether
> 
>   ./qemu-system-x86_64 -cpu host -accel kvm -smp 4
> 
> is enough to reproduce, or whether a real guest is needed?

Yes, I get the core dump with the above:

# ./build/qemu-system-x86_64 -cpu host -accel kvm -smp 4
qemu-system-x86_64: ../target/i386/kvm/kvm-cpu.c:149:
kvm_cpu_xsave_init: Assertion `esa->size == eax' failed.
Aborted (core dumped)

/P




* Re: boot failure on top of current git
  2025-07-16 16:13   ` Zhao Liu
@ 2025-07-16 16:09     ` Paolo Abeni
  0 siblings, 0 replies; 8+ messages in thread
From: Paolo Abeni @ 2025-07-16 16:09 UTC (permalink / raw)
  To: Zhao Liu, Paolo Bonzini; +Cc: qemu-devel

On 7/16/25 6:13 PM, Zhao Liu wrote:
> On Wed, Jul 16, 2025 at 05:22:46PM +0200, Paolo Bonzini wrote:
>> Date: Wed, 16 Jul 2025 17:22:46 +0200
>> From: Paolo Bonzini <pbonzini@redhat.com>
>> Subject: Re: boot failure on top of current git
>>
>> On 7/16/25 16:44, Paolo Abeni wrote:
>>> Hi,
>>>
>>> I'm observing boot failure for a rhel-9.7 VM. I'm using qemu git tree at
>>> commit c079d3a31e.
>>
>> No and I cannot reproduce it.
>>
>> What host is it (processor) and kernel version?
>>
>> Paolo
> 
> It sounds like x86_ext_save_areas[] wasn't initialized correctly.
> 
> I just checked the related logic. Previously, x86_cpu_post_initfn()
> initialized x86_ext_save_areas[] first and then called
> accel_cpu_instance_init(), so KVM's xsave assertion didn't complain.
> 
> But now that accel_cpu_instance_init() has moved to x86_cpu_initfn(), KVM
> checks x86_ext_save_areas[] before it has been initialized.
> 
> So I think we should initialize x86_ext_save_areas[] in x86_cpu_initfn()
> as well. Maybe we need something like this:
> 
> ---
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index da7d8dca633e..c8fccabeee71 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -9619,6 +9619,16 @@ static void x86_cpu_register_feature_bit_props(X86CPUClass *xcc,
>  }
> 
>  static void x86_cpu_post_initfn(Object *obj)
> +{
> +#ifndef CONFIG_USER_ONLY
> +    if (current_machine && current_machine->cgs) {
> +        x86_confidential_guest_cpu_instance_init(
> +            X86_CONFIDENTIAL_GUEST(current_machine->cgs), (CPU(obj)));
> +    }
> +#endif
> +}
> +
> +static void x86_cpu_init_xsave(void)
>  {
>      static bool first = true;
>      uint64_t supported_xcr0;
> @@ -9639,13 +9649,6 @@ static void x86_cpu_post_initfn(Object *obj)
>              }
>          }
>      }
> -
> -#ifndef CONFIG_USER_ONLY
> -    if (current_machine && current_machine->cgs) {
> -        x86_confidential_guest_cpu_instance_init(
> -            X86_CONFIDENTIAL_GUEST(current_machine->cgs), (CPU(obj)));
> -    }
> -#endif
>  }
> 
>  static void x86_cpu_init_default_topo(X86CPU *cpu)
> @@ -9715,6 +9718,7 @@ static void x86_cpu_initfn(Object *obj)
>          x86_cpu_load_model(cpu, xcc->model);
>      }
> 
> +    x86_cpu_init_xsave();
>      accel_cpu_instance_init(CPU(obj));
>  }

FWIW, I can successfully boot my VM on top of c079d3a31e plus the above
patch.

If the above turns into a formal patch feel free to add:

Tested-by: Paolo Abeni <pabeni@redhat.com>

Thanks,

Paolo




* Re: boot failure on top of current git
  2025-07-16 15:22 ` Paolo Bonzini
  2025-07-16 15:26   ` Paolo Abeni
@ 2025-07-16 16:13   ` Zhao Liu
  2025-07-16 16:09     ` Paolo Abeni
  1 sibling, 1 reply; 8+ messages in thread
From: Zhao Liu @ 2025-07-16 16:13 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Paolo Abeni, qemu-devel

On Wed, Jul 16, 2025 at 05:22:46PM +0200, Paolo Bonzini wrote:
> Date: Wed, 16 Jul 2025 17:22:46 +0200
> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: Re: boot failure on top of current git
> 
> On 7/16/25 16:44, Paolo Abeni wrote:
> > Hi,
> > 
> > I'm observing boot failure for a rhel-9.7 VM. I'm using qemu git tree at
> > commit c079d3a31e.
> 
> No and I cannot reproduce it.
> 
> What host is it (processor) and kernel version?
> 
> Paolo

It sounds like x86_ext_save_areas[] wasn't initialized correctly.

I just checked the related logic. Previously, x86_cpu_post_initfn()
initialized x86_ext_save_areas[] first and then called
accel_cpu_instance_init(), so KVM's xsave assertion didn't complain.

But now that accel_cpu_instance_init() has moved to x86_cpu_initfn(), KVM
checks x86_ext_save_areas[] before it has been initialized.

So I think we should initialize x86_ext_save_areas[] in x86_cpu_initfn()
as well. Maybe we need something like this:

---
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index da7d8dca633e..c8fccabeee71 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -9619,6 +9619,16 @@ static void x86_cpu_register_feature_bit_props(X86CPUClass *xcc,
 }

 static void x86_cpu_post_initfn(Object *obj)
+{
+#ifndef CONFIG_USER_ONLY
+    if (current_machine && current_machine->cgs) {
+        x86_confidential_guest_cpu_instance_init(
+            X86_CONFIDENTIAL_GUEST(current_machine->cgs), (CPU(obj)));
+    }
+#endif
+}
+
+static void x86_cpu_init_xsave(void)
 {
     static bool first = true;
     uint64_t supported_xcr0;
@@ -9639,13 +9649,6 @@ static void x86_cpu_post_initfn(Object *obj)
             }
         }
     }
-
-#ifndef CONFIG_USER_ONLY
-    if (current_machine && current_machine->cgs) {
-        x86_confidential_guest_cpu_instance_init(
-            X86_CONFIDENTIAL_GUEST(current_machine->cgs), (CPU(obj)));
-    }
-#endif
 }

 static void x86_cpu_init_default_topo(X86CPU *cpu)
@@ -9715,6 +9718,7 @@ static void x86_cpu_initfn(Object *obj)
         x86_cpu_load_model(cpu, xcc->model);
     }

+    x86_cpu_init_xsave();
     accel_cpu_instance_init(CPU(obj));
 }
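
For anyone who wants to see the ordering problem in isolation, here is a
rough standalone sketch. It is not QEMU code: the table, the host values,
and every function name below are made up for illustration, and it only
assumes that the one-time init clears entries the host does not support
while the accelerator check compares the remaining sizes against what the
host reports.

```c
#include <stdbool.h>
#include <stdint.h>

#define AREA_COUNT 4

/* Hypothetical stand-in for one x86_ext_save_areas[] entry. */
typedef struct {
    uint32_t size;
} SaveArea;

/* One-time init step (hypothetical): clear entries the host lacks. */
static void init_save_areas(SaveArea *areas, uint64_t supported_mask)
{
    for (int i = 2; i < AREA_COUNT; i++) {
        if (!((supported_mask >> i) & 1)) {
            areas[i].size = 0;
        }
    }
}

/*
 * Accelerator-side check (hypothetical): every non-empty entry must match
 * the size the host reports. Returns false where the real code would assert.
 */
static bool accel_check(const SaveArea *areas, const uint32_t *host_size)
{
    for (int i = 2; i < AREA_COUNT; i++) {
        if (!areas[i].size) {
            continue;
        }
        if (areas[i].size != host_size[i]) {
            return false;
        }
    }
    return true;
}

/* Run both steps in the requested order and report the check's result. */
bool cpu_init_order(bool areas_first)
{
    /* Static defaults: component 3 has a size the made-up host lacks. */
    SaveArea areas[AREA_COUNT] = { {0}, {0}, {256}, {64} };
    const uint32_t host_size[AREA_COUNT] = { 0, 0, 256, 0 };
    const uint64_t supported_mask = 0x7; /* components 0..2 only */
    bool ok;

    if (areas_first) {
        init_save_areas(areas, supported_mask);
        ok = accel_check(areas, host_size);
    } else {
        ok = accel_check(areas, host_size);
        init_save_areas(areas, supported_mask);
    }
    return ok;
}
```

In this toy model cpu_init_order(true) passes and cpu_init_order(false)
fails, which is the same shape as running the accelerator hook before the
table has been set up.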





Thread overview: 8+ messages
2025-07-16 14:44 boot failure on top of current git Paolo Abeni
2025-07-16 15:22 ` Paolo Bonzini
2025-07-16 15:26   ` Paolo Abeni
2025-07-16 15:31     ` Paolo Abeni
2025-07-16 15:39     ` Paolo Bonzini
2025-07-16 16:04       ` Paolo Abeni
2025-07-16 16:13   ` Zhao Liu
2025-07-16 16:09     ` Paolo Abeni
