* xl create/save throwing errors
From: Petr Beneš @ 2025-02-19 16:04 UTC
  To: Xen-devel; +Cc: Anthony PERARD, Andrew Cooper

Hello,

I have a script that's supposed to start a couple of (Windows 10) VMs
in parallel, wait until they boot and connect to the network, and then
create a live snapshot.

VMs are created by simple "xl create vm.cfg" and the live snapshot is
created by "xl save win10-18362-NNN path/to/state".
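
For reference, the workflow can be sketched roughly as below. This is a minimal sketch, not the actual script: the helper names (`create_cmd`, `save_cmd`, `run_cmd`, `run_parallel`) are hypothetical, only the two `xl` invocations quoted above are used, and the "wait until booted and on the network" step is left out because it is not described here.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def create_cmd(cfg_path):
    """Command line for creating one VM, as quoted in the message."""
    return ["xl", "create", cfg_path]


def save_cmd(domain, state_path):
    """Command line for saving one VM, as quoted in the message."""
    return ["xl", "save", domain, state_path]


def run_cmd(cmd):
    """Run one command and return its exit status (hypothetical helper)."""
    return subprocess.run(cmd).returncode


def run_parallel(cmds, run=run_cmd):
    """Run all commands concurrently; return exit codes in input order."""
    if not cmds:
        return []
    # One worker per command, so every command starts at roughly the same time.
    with ThreadPoolExecutor(max_workers=len(cmds)) as pool:
        return list(pool.map(run, cmds))
```

A parallel save of several VMs would then be something like `run_parallel([save_cmd(f"win10-18362-{n}", f"win10-18362-{n}/state") for n in (101, 102, 104)])`.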

I have noticed that "xl create" occasionally throws this line:
```
libxl: error: libxl_aoutils.c:646:libxl__kill_xs_path: qemu command-line probe already exited
```

At first I thought this was related to multiple "xl create" commands
running in parallel, but to my surprise, the line sometimes appears
even for standalone "xl create" commands.

However, when "xl save" is being executed in parallel, I'm very often
met with output similar to this:
```
Saving to win10-18362-102/state new xl format (info 0x3/0x0/1780)
xc: info: Saving domain 193, type x86 HVM
Saving to win10-18362-101/state new xl format (info 0x3/0x0/1780)
xc: info: Saving domain 192, type x86 HVM
Saving to win10-18362-104/state new xl format (info 0x3/0x0/1780)
xc: info: Saving domain 194, type x86 HVM
xc: error: save callback suspend() failed: 0: Internal error
xc: error: Save failed (0 = Success): Internal error
libxl: error: libxl_stream_write.c:347:libxl__xc_domain_save_done: Domain 192:saving domain: domain responded to suspend request: Success
Failed to save domain, resuming domain
xc: error: save callback suspend() failed: 0: Internal error
xc: error: Save failed (0 = Success): Internal error
xc: error: Dom 192 not suspended: (shutdown 4, reason 3): Internal error
libxl: error: libxl_dom_suspend.c:661:domain_resume_done: Domain 192:xc_domain_resume failed: Invalid argument
libxl: error: libxl_stream_write.c:347:libxl__xc_domain_save_done: Domain 194:saving domain: domain responded to suspend request: Success
Failed to save domain, resuming domain
xc: error: Dom 194 not suspended: (shutdown 4, reason 3): Internal error
libxl: error: libxl_dom_suspend.c:661:domain_resume_done: Domain 194:xc_domain_resume failed: Invalid argument
xc: Frames: 1044480/1044480  100%: Frames: 52224/1044480    5%
```

The output above is from snapshotting 4 live VMs in parallel: 3 of
the commands failed and left their VMs in a running state.

Note that each "xl create"/"xl save" is executed for a separate VM.

For several months, I have executed standalone "xl save" commands with
VMs of the same settings without any problems.

Note that my VMs use qcow2 images as their disks - not ZFS or LVM:
```
disk = [ 'tap:qcow2:/win10-18362-101/clone/image.qcow2,xvda,w' ]
```

where win10-18362-101/clone/image.qcow2 is created as:
```
qemu-img create -f qcow2 -F qcow2 -b \
    "/win10-18362-101/base/image.qcow2" \
    "/win10-18362-101/clone/image.qcow2"
```
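
Since every VM follows the same layout, the overlay command and the matching disk line can be derived from the VM name. The sketch below only assumes the path pattern shown above (`/<vm>/base/image.qcow2` and `/<vm>/clone/image.qcow2`); the function names `overlay_cmd` and `disk_spec` are hypothetical.

```python
def overlay_cmd(vm):
    """qemu-img command that creates the qcow2 overlay for one VM."""
    return [
        "qemu-img", "create", "-f", "qcow2", "-F", "qcow2",
        "-b", f"/{vm}/base/image.qcow2",   # backing (base) image
        f"/{vm}/clone/image.qcow2",        # overlay used by the VM
    ]


def disk_spec(vm):
    """Disk entry for the VM's xl config, pointing at the overlay."""
    return f"tap:qcow2:/{vm}/clone/image.qcow2,xvda,w"
```

Each VM gets its own overlay file, so the parallel commands do not share a writable image.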

Is running "xl save" in parallel not supported? Or is it an issue with
qcow2 handling?
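
One way to narrow this down would be to run the same set of "xl save" commands one at a time and compare the failures with the parallel run. The sketch below is only an illustration of that comparison: `run_cmd`, `run_serial` and `failed` are hypothetical helpers, and it assumes that "xl save" exits with a non-zero status when the save fails.

```python
import subprocess


def run_cmd(cmd):
    """Run one command and return its exit status (hypothetical helper)."""
    return subprocess.run(cmd).returncode


def run_serial(cmds, run=run_cmd):
    """Run commands strictly one after another; return their exit codes."""
    return [run(cmd) for cmd in cmds]


def failed(cmds, codes):
    """Return the commands whose exit code was non-zero."""
    return [cmd for cmd, code in zip(cmds, codes) if code != 0]
```

If the serial run reports no failures for the same VMs that fail in parallel, concurrency is the likely trigger rather than the qcow2 setup.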

Best,
Petr

