public inbox for qemu-devel@nongnu.org
* [PATCH v2 0/2] hw/nvram/fw_cfg: update bootorder to fix SeaBIOS bootup failure
@ 2026-03-10 17:06 Dongli Zhang
  2026-03-10 17:06 ` [PATCH v2 1/2] system/runstate: add runstate transition notifier Dongli Zhang
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Dongli Zhang @ 2026-03-10 17:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, philmd, kraxel, joe.jin

During system reset, and only during system reset, QEMU updates the
"bootorder" and "bios-geometry" entries in fw_cfg based on the contents of
"fw_boot_order" and "fw_lchs". After the guest VM boots, the firmware
(e.g., SeaBIOS) can read the boot order from fw_cfg and boot from the disk
at the top of the list.

The reset handler fw_cfg_machine_reset() is invoked either implicitly
during instance creation or explicitly by the user via HMP/QMP.

However, users may attach or detach disks while the VM is in the prelaunch
state. Because there is no implicit reset when transitioning from prelaunch
to running, the "bootorder" and "bios-geometry" data in fw_cfg can become
stale. As a result, the firmware may be unable to locate the correct disk
to boot from.

Here is an example that demonstrates the bug.

1. Create a QEMU instance with a virtio-scsi HBA and keep it in the
prelaunch state. Use SeaBIOS rather than UEFI.

-device virtio-scsi-pci,id=scsi0,num_queues=4 \
-S \

2. First, attach the boot disk, then attach the secondary disk.

(qemu) drive_add 0 file=boot.qcow2,if=none,id=drive0
(qemu) device_add scsi-hd,drive=drive0,bus=scsi0.0,channel=0,scsi-id=0,lun=1,bootindex=1
(qemu) drive_add 0 file=secondary.qcow2,if=none,id=drive1
(qemu) device_add scsi-hd,drive=drive1,bus=scsi0.0,channel=0,scsi-id=0,lun=2,bootindex=-1

3. Start the VM from the prelaunch state. Because the "bootorder" and
"bios-geometry" data in fw_cfg is stale, SeaBIOS attempts to boot from the
secondary disk only once and then stops. As a result, the VM fails to boot.
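
The sequence above can be modeled in a few lines of standalone C. This is
only a sketch, not QEMU code: the toy_* names and the device strings are
made up for illustration, and sorting by bootindex is left out. It shows
that a device added after the last refresh stays invisible to the firmware
until the blob is rebuilt.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_BOOT_DEVS 8
#define BLOB_SIZE 256

/* Toy stand-in for QEMU's internal boot list (hypothetical, not QEMU code). */
static const char *toy_boot_list[MAX_BOOT_DEVS];
static int toy_boot_count;

/* Toy stand-in for the "bootorder" blob that the firmware reads. */
static char toy_bootorder_blob[BLOB_SIZE];

/* Cold-plug a device: only the internal list changes, not the blob. */
static void toy_add_boot_device(const char *path)
{
    assert(toy_boot_count < MAX_BOOT_DEVS);
    toy_boot_list[toy_boot_count++] = path;
}

/* Rebuild the blob from the list; models what the reset handler does. */
static void toy_refresh_bootorder(void)
{
    size_t used = 0;

    toy_bootorder_blob[0] = '\0';
    for (int i = 0; i < toy_boot_count; i++) {
        int n = snprintf(toy_bootorder_blob + used, BLOB_SIZE - used,
                         "%s\n", toy_boot_list[i]);
        assert(n > 0 && (size_t)n < BLOB_SIZE - used);
        used += (size_t)n;
    }
}

/* What the firmware would see: the blob as last published. */
static const char *toy_firmware_view(void)
{
    return toy_bootorder_blob;
}
```

Calling toy_add_boot_device() without a following toy_refresh_bootorder()
leaves toy_firmware_view() unchanged, which corresponds to the stale state
described in step 3.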

One possible workaround is to require QEMU users to explicitly issue a
system_reset before starting a guest VM from the prelaunch state, if any
disks have been attached or detached.

Another option is to address the issue in SeaBIOS. Nowadays, SeaBIOS
attempts to boot from only a single disk. We could enhance SeaBIOS to try
multiple disks in order until boot succeeds.

Another option is to update "bootorder" and "bios-geometry" everywhere
disks are attached or detached. This may require identifying the relevant
functions across multiple device types, such as SCSI, NVMe, virtio-blk, and
IDE.

This commit fixes the issue in QEMU by ensuring that "bootorder" and
"bios-geometry" are always updated when QEMU transitions from the prelaunch
state to running.

According to runstate_transitions_def[], RUN_STATE_PRELAUNCH is allowed to
transition to four states. Only the transition to RUN_STATE_RUNNING
requires updating "bootorder" and "bios-geometry".

v1: https://lore.kernel.org/qemu-devel/20260225220556.11049-1-dongli.zhang@oracle.com

v1 -> v2:
  - Add a new runstate transition notifier to track the transition from
    prelaunch to running.


Dongli Zhang (2)
  system/runstate: add runstate transition notifier
  hw/nvram/fw_cfg: update bootorder and bios-geometry before launching

 hw/nvram/fw_cfg.c         | 31 +++++++++++++++++++++++++++++--
 include/hw/nvram/fw_cfg.h |  1 +
 include/system/runstate.h |  8 ++++++++
 system/runstate.c         | 25 +++++++++++++++++++++++++
 4 files changed, 63 insertions(+), 2 deletions(-)

Thank you very much!

Dongli Zhang




* [PATCH v2 1/2] system/runstate: add runstate transition notifier
  2026-03-10 17:06 [PATCH v2 0/2] hw/nvram/fw_cfg: update bootorder to fix SeaBIOS bootup failure Dongli Zhang
@ 2026-03-10 17:06 ` Dongli Zhang
  2026-03-10 17:06 ` [PATCH v2 2/2] hw/nvram/fw_cfg: update bootorder and bios-geometry before launching Dongli Zhang
  2026-03-11 11:34 ` [PATCH v2 0/2] hw/nvram/fw_cfg: update bootorder to fix SeaBIOS bootup failure Gerd Hoffmann
  2 siblings, 0 replies; 6+ messages in thread
From: Dongli Zhang @ 2026-03-10 17:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, philmd, kraxel, joe.jin

Introduce a notifier list that fires whenever runstate_set() changes the
runstate. Any component can register to observe the old and new runstate
values before the switch completes.

For instance, an upcoming change will use this hook to refresh the fw_cfg
'bootorder' and 'bios-geometry' entries when the instance leaves
RUN_STATE_PRELAUNCH for RUN_STATE_RUNNING.
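
For reference, the notify-then-switch ordering can be sketched as a
standalone toy model. None of this is QEMU code: the Toy* types and toy_*
functions are hypothetical stand-ins, the state enum is a small subset, and
the transition validity check that runstate_set() performs is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Toy subset of run states; the real RunState enum has more values. */
typedef enum { TOY_PRELAUNCH, TOY_PAUSED, TOY_RUNNING } ToyRunState;

typedef struct ToyTransition {
    ToyRunState old_state;
    ToyRunState new_state;
} ToyTransition;

typedef struct ToyNotifier ToyNotifier;
struct ToyNotifier {
    void (*notify)(ToyNotifier *n, void *data);
    ToyNotifier *next;
};

static ToyNotifier *toy_notifiers;
static ToyRunState toy_current_state = TOY_PRELAUNCH;
static int toy_refresh_count;

/* Register a listener by linking it at the head of the list. */
static void toy_add_notifier(ToyNotifier *n)
{
    n->next = toy_notifiers;
    toy_notifiers = n;
}

/* Notify every listener first, then commit the new state. */
static void toy_runstate_set(ToyRunState new_state)
{
    ToyTransition t = {
        .old_state = toy_current_state,
        .new_state = new_state,
    };

    for (ToyNotifier *n = toy_notifiers; n; n = n->next) {
        n->notify(n, &t);
    }
    toy_current_state = new_state;
}

/* Listener that reacts only to the prelaunch -> running transition. */
static void toy_on_transition(ToyNotifier *n, void *data)
{
    const ToyTransition *t = data;

    (void)n;
    if (t->old_state == TOY_PRELAUNCH && t->new_state == TOY_RUNNING) {
        toy_refresh_count++;
    }
}
```

Because the listener receives both the old and the new state, it can ignore
a later paused -> running resume and count only the first launch.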

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 include/system/runstate.h |  8 ++++++++
 system/runstate.c         | 25 +++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/include/system/runstate.h b/include/system/runstate.h
index 929379adae..de812ec320 100644
--- a/include/system/runstate.h
+++ b/include/system/runstate.h
@@ -11,6 +11,14 @@ bool runstate_is_running(void);
 bool runstate_needs_reset(void);
 void runstate_replay_enable(void);
 
+typedef struct VMRunStateTransition {
+    RunState old_state;
+    RunState new_state;
+} VMRunStateTransition;
+
+void qemu_add_runstate_transition_notifier(Notifier *notifier);
+void qemu_remove_runstate_transition_notifier(Notifier *notifier);
+
 typedef void VMChangeStateHandler(void *opaque, bool running, RunState state);
 typedef int VMChangeStateHandlerWithRet(void *opaque, bool running, RunState state);
 
diff --git a/system/runstate.c b/system/runstate.c
index eca722b43c..01b76463d7 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -65,6 +65,8 @@
 
 static NotifierList exit_notifiers =
     NOTIFIER_LIST_INITIALIZER(exit_notifiers);
+static NotifierList runstate_transition_notifiers =
+    NOTIFIER_LIST_INITIALIZER(runstate_transition_notifiers);
 
 static RunState current_run_state = RUN_STATE_PRELAUNCH;
 
@@ -226,6 +228,17 @@ static void runstate_init(void)
     qemu_mutex_init(&vmstop_lock);
 }
 
+static void runstate_transition_notify(RunState old_state,
+                                       RunState new_state)
+{
+    VMRunStateTransition transition = {
+        .old_state = old_state,
+        .new_state = new_state,
+    };
+
+    notifier_list_notify(&runstate_transition_notifiers, &transition);
+}
+
 /* This function will abort() on invalid state transitions */
 void runstate_set(RunState new_state)
 {
@@ -245,6 +258,8 @@ void runstate_set(RunState new_state)
         abort();
     }
 
+    runstate_transition_notify(current_run_state, new_state);
+
     current_run_state = new_state;
 }
 
@@ -403,6 +418,16 @@ int vm_state_notify(bool running, RunState state)
     return ret;
 }
 
+void qemu_add_runstate_transition_notifier(Notifier *notifier)
+{
+    notifier_list_add(&runstate_transition_notifiers, notifier);
+}
+
+void qemu_remove_runstate_transition_notifier(Notifier *notifier)
+{
+    notifier_remove(notifier);
+}
+
 static ShutdownCause reset_requested;
 static ShutdownCause shutdown_requested;
 static int shutdown_exit_code = EXIT_SUCCESS;
-- 
2.39.3




* [PATCH v2 2/2] hw/nvram/fw_cfg: update bootorder and bios-geometry before launching
  2026-03-10 17:06 [PATCH v2 0/2] hw/nvram/fw_cfg: update bootorder to fix SeaBIOS bootup failure Dongli Zhang
  2026-03-10 17:06 ` [PATCH v2 1/2] system/runstate: add runstate transition notifier Dongli Zhang
@ 2026-03-10 17:06 ` Dongli Zhang
  2026-03-11 11:34 ` [PATCH v2 0/2] hw/nvram/fw_cfg: update bootorder to fix SeaBIOS bootup failure Gerd Hoffmann
  2 siblings, 0 replies; 6+ messages in thread
From: Dongli Zhang @ 2026-03-10 17:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, philmd, kraxel, joe.jin

QEMU maintains disk boot order and geometry information in the lists
"fw_boot_order" and "fw_lchs". For example, boot order is derived from each
device's bootindex value.

During system reset, and only during system reset, QEMU updates the
"bootorder" and "bios-geometry" entries in fw_cfg based on the contents of
"fw_boot_order" and "fw_lchs". After the guest VM boots, the firmware
(e.g., SeaBIOS) can read the boot order from fw_cfg and boot from the disk
at the top of the list.

The reset handler fw_cfg_machine_reset() is invoked either implicitly
during instance creation or explicitly by the user via HMP/QMP.

However, users may attach or detach disks while the VM is in the prelaunch
state. Because there is no implicit reset when transitioning from prelaunch
to running, the "bootorder" and "bios-geometry" data in fw_cfg can become
stale. As a result, the firmware may be unable to locate the correct disk
to boot from.

Here is an example that demonstrates the bug.

1. Create a QEMU instance with a virtio-scsi HBA and keep it in the
prelaunch state. Use SeaBIOS rather than UEFI.

-device virtio-scsi-pci,id=scsi0,num_queues=4 \
-S \

2. First, attach the boot disk, then attach the secondary disk.

(qemu) drive_add 0 file=boot.qcow2,if=none,id=drive0
(qemu) device_add scsi-hd,drive=drive0,bus=scsi0.0,channel=0,scsi-id=0,lun=1,bootindex=1
(qemu) drive_add 0 file=secondary.qcow2,if=none,id=drive1
(qemu) device_add scsi-hd,drive=drive1,bus=scsi0.0,channel=0,scsi-id=0,lun=2,bootindex=-1

3. Start the VM from the prelaunch state. Because the "bootorder" and
"bios-geometry" data in fw_cfg is stale, SeaBIOS attempts to boot from the
secondary disk only once and then stops. As a result, the VM fails to boot.

One possible workaround is to require QEMU users to explicitly issue a
system_reset before starting a guest VM from the prelaunch state, if any
disks have been attached or detached.

Another option is to address the issue in SeaBIOS. Nowadays, SeaBIOS
attempts to boot from only a single disk. We could enhance SeaBIOS to try
multiple disks in order until boot succeeds.

Another option is to update "bootorder" and "bios-geometry" everywhere
disks are attached or detached. This may require identifying the relevant
functions across multiple device types, such as SCSI, NVMe, virtio-blk, and
IDE.

This commit fixes the issue in QEMU by ensuring that "bootorder" and
"bios-geometry" are always updated when QEMU transitions from the prelaunch
state to running.

According to runstate_transitions_def[], RUN_STATE_PRELAUNCH is allowed to
transition to four states. Only the transition to RUN_STATE_RUNNING
requires updating "bootorder" and "bios-geometry".

Co-developed-by: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
v1 -> v2:
  - Add a new runstate transition notifier to track the transition from
    prelaunch to running

 hw/nvram/fw_cfg.c         | 31 +++++++++++++++++++++++++++++--
 include/hw/nvram/fw_cfg.h |  1 +
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 1d7d835421..5154d77028 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -27,6 +27,7 @@
 #include "system/system.h"
 #include "system/dma.h"
 #include "system/reset.h"
+#include "system/runstate.h"
 #include "system/address-spaces.h"
 #include "hw/core/boards.h"
 #include "hw/nvram/fw_cfg.h"
@@ -965,9 +966,8 @@ bool fw_cfg_add_file_from_generator(FWCfgState *s,
     return true;
 }
 
-static void fw_cfg_machine_reset(void *opaque)
+static void __fw_cfg_machine_reset(FWCfgState *s)
 {
-    FWCfgState *s = opaque;
     void *ptr;
     size_t len;
     char *buf;
@@ -981,10 +981,37 @@ static void fw_cfg_machine_reset(void *opaque)
     g_free(ptr);
 }
 
+static void fw_cfg_machine_reset(void *opaque)
+{
+    FWCfgState *s = opaque;
+
+    __fw_cfg_machine_reset(s);
+}
+
+static void fw_cfg_runstate_transition(Notifier *notifier, void *data)
+{
+    FWCfgState *s = container_of(notifier, FWCfgState,
+                                 runstate_transition);
+    const VMRunStateTransition *transition = data;
+
+    /*
+     * According to runstate_transitions_def[], RUN_STATE_PRELAUNCH is
+     * allowed to transition to four states. Only the transition to
+     * RUN_STATE_RUNNING requires updating "bootorder" and "bios-geometry".
+     */
+    if (transition->old_state == RUN_STATE_PRELAUNCH &&
+        transition->new_state == RUN_STATE_RUNNING) {
+        __fw_cfg_machine_reset(s);
+    }
+}
+
 static void fw_cfg_machine_ready(struct Notifier *n, void *data)
 {
     FWCfgState *s = container_of(n, FWCfgState, machine_ready);
+
     qemu_register_reset(fw_cfg_machine_reset, s);
+    s->runstate_transition.notify = fw_cfg_runstate_transition;
+    qemu_add_runstate_transition_notifier(&s->runstate_transition);
 }
 
 static const Property fw_cfg_properties[] = {
diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index 56f17a0bdc..81ac940119 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -66,6 +66,7 @@ struct FWCfgState {
     uint16_t cur_entry;
     uint32_t cur_offset;
     Notifier machine_ready;
+    Notifier runstate_transition;
 
     bool dma_enabled;
     dma_addr_t dma_addr;
-- 
2.39.3




* Re: [PATCH v2 0/2] hw/nvram/fw_cfg: update bootorder to fix SeaBIOS bootup failure
  2026-03-10 17:06 [PATCH v2 0/2] hw/nvram/fw_cfg: update bootorder to fix SeaBIOS bootup failure Dongli Zhang
  2026-03-10 17:06 ` [PATCH v2 1/2] system/runstate: add runstate transition notifier Dongli Zhang
  2026-03-10 17:06 ` [PATCH v2 2/2] hw/nvram/fw_cfg: update bootorder and bios-geometry before launching Dongli Zhang
@ 2026-03-11 11:34 ` Gerd Hoffmann
  2026-03-12 21:22   ` Dongli Zhang
  2 siblings, 1 reply; 6+ messages in thread
From: Gerd Hoffmann @ 2026-03-11 11:34 UTC (permalink / raw)
  To: Dongli Zhang; +Cc: qemu-devel, pbonzini, philmd, joe.jin

  Hi,

> 1. Create a QEMU instance with a virtio-scsi HBA and keep it in the
> prelaunch state. Use SeaBIOS rather than UEFI.
> 
> -device virtio-scsi-pci,id=scsi0,num_queues=4 \
> -S \
> 
> 2. First, attach the boot disk, then attach the secondary disk.
> 
> (qemu) drive_add 0 file=boot.qcow2,if=none,id=drive0
> (qemu) device_add scsi-hd,drive=drive0,bus=scsi0.0,channel=0,scsi-id=0,lun=1,bootindex=1
> (qemu) drive_add 0 file=secondary.qcow2,if=none,id=drive1
> (qemu) device_add scsi-hd,drive=drive1,bus=scsi0.0,channel=0,scsi-id=0,lun=2,bootindex=-1
> 
> 3. Start the VM from the prelaunch state. Because the "bootorder" and
> "bios-geometry" data in fw_cfg is stale, SeaBIOS attempts to boot from the
> secondary disk only once and then stops. As a result, the VM fails to boot.
> 
> One possible workaround is to require QEMU users to explicitly issue a
> system_reset before starting a guest VM from the prelaunch state, if any
> disks have been attached or detached.

Requiring a system_reset looks sensible to me.  Neither seabios nor ovmf
have hotplug support.  So if you expect the firmware to pick up any new
devices after hot-plugging them it is IMHO a very good idea to force a
full re-initialization via system_reset for robustness reasons.

take care,
  Gerd




* Re: [PATCH v2 0/2] hw/nvram/fw_cfg: update bootorder to fix SeaBIOS bootup failure
  2026-03-11 11:34 ` [PATCH v2 0/2] hw/nvram/fw_cfg: update bootorder to fix SeaBIOS bootup failure Gerd Hoffmann
@ 2026-03-12 21:22   ` Dongli Zhang
  2026-03-13  9:52     ` Gerd Hoffmann
  0 siblings, 1 reply; 6+ messages in thread
From: Dongli Zhang @ 2026-03-12 21:22 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: qemu-devel, pbonzini, philmd, joe.jin

Hi Gerd,

On 3/11/26 4:34 AM, Gerd Hoffmann wrote:
>   Hi,
> 
>> 1. Create a QEMU instance with a virtio-scsi HBA and keep it in the
>> prelaunch state. Use SeaBIOS rather than UEFI.
>>
>> -device virtio-scsi-pci,id=scsi0,num_queues=4 \
>> -S \
>>
>> 2. First, attach the boot disk, then attach the secondary disk.
>>
>> (qemu) drive_add 0 file=boot.qcow2,if=none,id=drive0
>> (qemu) device_add scsi-hd,drive=drive0,bus=scsi0.0,channel=0,scsi-id=0,lun=1,bootindex=1
>> (qemu) drive_add 0 file=secondary.qcow2,if=none,id=drive1
>> (qemu) device_add scsi-hd,drive=drive1,bus=scsi0.0,channel=0,scsi-id=0,lun=2,bootindex=-1
>>
>> 3. Start the VM from the prelaunch state. Because the "bootorder" and
>> "bios-geometry" data in fw_cfg is stale, SeaBIOS attempts to boot from the
>> secondary disk only once and then stops. As a result, the VM fails to boot.
>>
>> One possible workaround is to require QEMU users to explicitly issue a
>> system_reset before starting a guest VM from the prelaunch state, if any
>> disks have been attached or detached.
> 
> Requiring a system_reset looks sensible to me.  Neither seabios nor ovmf
> have hotplug support.  So if you expect the firmware to pick up any new
> devices after hot-plugging them it is IMHO a very good idea to force a
> full re-initialization via system_reset for robustness reasons.
> 

Regarding "Neither seabios nor ovmf have hotplug support", we are not using
hotplug with SeaBIOS or OVMF in this scenario.

This is a cold-plug operation performed during QEMU prelaunch state, before any
system software starts running.

(1) Create QEMU with "-S" at prelaunch.
(2) Attach disks.
(3) "cont" for QEMU.

In other words, we cold-plug the new disks before SeaBIOS starts.

However, because the "bootorder" data from QEMU is stale, SeaBIOS cannot
construct the correct boot order once it begins executing.

So this does not involve hotplug; it is purely cold-plug during prelaunch, prior
to SeaBIOS startup.

Therefore, would you mind confirming whether issuing a system_reset manually
is still the best option so far?

Thank you very much!

Dongli Zhang



* Re: [PATCH v2 0/2] hw/nvram/fw_cfg: update bootorder to fix SeaBIOS bootup failure
  2026-03-12 21:22   ` Dongli Zhang
@ 2026-03-13  9:52     ` Gerd Hoffmann
  0 siblings, 0 replies; 6+ messages in thread
From: Gerd Hoffmann @ 2026-03-13  9:52 UTC (permalink / raw)
  To: Dongli Zhang; +Cc: qemu-devel, pbonzini, philmd, joe.jin

  Hi,

> Regarding "Neither seabios nor ovmf have hotplug support", we are not using
> hotplug with SeaBIOS or OVMF in this scenario.
> 
> This is a cold-plug operation performed during QEMU prelaunch state, before any
> system software starts running.
> 
> (1) Create QEMU with "-S" at prelaunch.
> (2) Attach disks.
> (3) "cont" for QEMU.
> 
> In other words, we cold-plug the new disks before SeaBIOS starts.
> 
> However, because the "bootorder" data from QEMU is stale, SeaBIOS cannot
> construct the correct boot order once it begins executing.

system_reset will fix that.

> Therefore, would you mind confirming whether issuing a system_reset manually
> is still the best option so far?

Yes, I think so.  Specifically it has the advantage of working just fine
with already released qemu versions, and sending one extra command
between (2) + (3) should not be much of a problem for the management.

take care,
  Gerd



