* [PATCH v2 0/2] xl suspend/resume commands
@ 2024-11-26 17:19 Jason Andryuk
2024-11-26 17:19 ` [PATCH v2 1/2] xl: Keep monitoring suspended domain Jason Andryuk
2024-11-26 17:19 ` [PATCH v2 2/2] tools/xl: add suspend and resume subcommands Jason Andryuk
0 siblings, 2 replies; 8+ messages in thread
From: Jason Andryuk @ 2024-11-26 17:19 UTC (permalink / raw)
To: xen-devel; +Cc: Jason Andryuk, Anthony PERARD
This is a v2 of Cyril's implementation of xl suspend/resume. This is a
cooperative operation like libvirt's suspend.
I renamed suspend-to-ram to just suspend. xl also needed a fixup to
avoid exiting when the domain suspends.
Jason Andryuk (1):
xl: Keep monitoring suspended domain
zithro / Cyril Rébert (1):
tools/xl: add suspend and resume subcommands
docs/man/xl.1.pod.in | 12 +++++++++
tools/xl/xl.h | 3 +++
tools/xl/xl_cmdtable.c | 10 ++++++++
tools/xl/xl_vmcontrol.c | 54 ++++++++++++++++++++++++++++++++++++++++-
4 files changed, 78 insertions(+), 1 deletion(-)
--
2.34.1
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2 1/2] xl: Keep monitoring suspended domain
2024-11-26 17:19 [PATCH v2 0/2] xl suspend/resume commands Jason Andryuk
@ 2024-11-26 17:19 ` Jason Andryuk
2024-11-28 16:45 ` Anthony PERARD
2024-11-26 17:19 ` [PATCH v2 2/2] tools/xl: add suspend and resume subcommands Jason Andryuk
1 sibling, 1 reply; 8+ messages in thread
From: Jason Andryuk @ 2024-11-26 17:19 UTC (permalink / raw)
To: xen-devel; +Cc: Jason Andryuk, Anthony PERARD
When a VM transitioned to LIBXL_SHUTDOWN_REASON_SUSPEND, the xl daemon
was exiting as 0 = DOMAIN_RESTART_NONE "No domain restart".
Later, when the VM actually shutdown, the missing xl daemon meant the
domain wasn't cleaned up properly.
Add a new DOMAIN_RESTART_SUSPENDED to handle the case. The xl daemon
keeps running to react to future shutdown events.
The domain death event needs to be re-enabled to catch subsequent
events. The libxl_evgen_domain_death is moved from death_list to
death_reported, and then it isn't found on subsequent iterations through
death_list. We enable the new event before disabling the old event, to
keep the xenstore watch active. If it is unregistered and
re-registered, it'll fire immediately for our suspended domain which
will end up continuously re-triggering.
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
---
tools/xl/xl.h | 1 +
tools/xl/xl_vmcontrol.c | 18 +++++++++++++++++-
2 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/tools/xl/xl.h b/tools/xl/xl.h
index 9c86bb1d98..967d034cfe 100644
--- a/tools/xl/xl.h
+++ b/tools/xl/xl.h
@@ -301,6 +301,7 @@ typedef enum {
DOMAIN_RESTART_NORMAL, /* Domain should be restarted */
DOMAIN_RESTART_RENAME, /* Domain should be renamed and restarted */
DOMAIN_RESTART_SOFT_RESET, /* Soft reset should be performed */
+ DOMAIN_RESTART_SUSPENDED, /* Domain suspended - keep looping */
} domain_restart_type;
extern void printf_info_sexp(int domid, libxl_domain_config *d_config, FILE *fh);
diff --git a/tools/xl/xl_vmcontrol.c b/tools/xl/xl_vmcontrol.c
index fa1a4420e3..c45d497c28 100644
--- a/tools/xl/xl_vmcontrol.c
+++ b/tools/xl/xl_vmcontrol.c
@@ -417,7 +417,7 @@ static domain_restart_type handle_domain_death(uint32_t *r_domid,
break;
case LIBXL_SHUTDOWN_REASON_SUSPEND:
LOG("Domain has suspended.");
- return 0;
+ return DOMAIN_RESTART_SUSPENDED;
case LIBXL_SHUTDOWN_REASON_CRASH:
action = d_config->on_crash;
break;
@@ -1030,6 +1030,7 @@ start:
}
}
while (1) {
+ libxl_evgen_domain_death *deathw2 = NULL;
libxl_event *event;
ret = domain_wait_event(domid, &event);
if (ret) goto out;
@@ -1100,9 +1101,24 @@ start:
ret = 0;
goto out;
+ case DOMAIN_RESTART_SUSPENDED:
+ LOG("Continue waiting for domain %u", domid);
+ /*
+ * Enable a new event before disabling the old. This ensures
+ * the xenstore watch remains active. Otherwise it'll fire
+ * immediately on re-registration and find our suspended domain.
+ */
+ ret = libxl_evenable_domain_death(ctx, domid, 0, &deathw2);
+ if (ret) goto out;
+ libxl_evdisable_domain_death(ctx, deathw);
+ deathw = deathw2;
+ deathw2 = NULL;
+ break;
+
default:
abort();
}
+ break;
case LIBXL_EVENT_TYPE_DOMAIN_DEATH:
LOG("Domain %u has been destroyed.", domid);
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 2/2] tools/xl: add suspend and resume subcommands
2024-11-26 17:19 [PATCH v2 0/2] xl suspend/resume commands Jason Andryuk
2024-11-26 17:19 ` [PATCH v2 1/2] xl: Keep monitoring suspended domain Jason Andryuk
@ 2024-11-26 17:19 ` Jason Andryuk
2024-11-28 17:19 ` Anthony PERARD
1 sibling, 1 reply; 8+ messages in thread
From: Jason Andryuk @ 2024-11-26 17:19 UTC (permalink / raw)
To: xen-devel
Cc: zithro / Cyril Rébert, Anthony PERARD, Andrew Cooper,
Marek Marczykowski-Górecki, Jason Andryuk
From: zithro / Cyril Rébert <slack@rabbit.lu>
The xl command doesn't provide suspend/resume, so add them :
xl suspend <Domain>
xl resume <Domain>
This patch follows a discussion on XenDevel: when you want the
virtualized equivalent of "sleep"-ing a host, it's better to
suspend/resume than to pause/unpause a domain.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Suggested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Cyril Rébert (zithro) <slack@rabbit.lu>
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
---
Rename command to just "suspend"
Move inside the HAVE_NO_SUSPEND_RESUME
---
docs/man/xl.1.pod.in | 12 ++++++++++++
tools/xl/xl.h | 2 ++
tools/xl/xl_cmdtable.c | 10 ++++++++++
tools/xl/xl_vmcontrol.c | 36 ++++++++++++++++++++++++++++++++++++
4 files changed, 60 insertions(+)
diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in
index bed8393473..fe38724b2b 100644
--- a/docs/man/xl.1.pod.in
+++ b/docs/man/xl.1.pod.in
@@ -682,6 +682,10 @@ Pass the VNC password to vncviewer via stdin.
=back
+=item B<resume> I<domain-id>
+
+Resume a domain, after having been suspended.
+
=item B<save> [I<OPTIONS>] I<domain-id> I<checkpointfile> [I<configfile>]
Saves a running domain to a state file so that it can be restored
@@ -760,6 +764,14 @@ in response to this event.
=back
+=item B<suspend> I<domain-id>
+
+Suspend a domain. This is a cooperative operation where the domain must
+respond to the xenstore trigger. When in a suspended state the domain
+still consumes allocated resources (such as memory), but is not eligible
+for scheduling by the Xen hypervisor. It is in a shutdown state, but
+not dying.
+
=item B<sysrq> I<domain-id> I<letter>
Send a <Magic System Request> to the domain, each type of request is
diff --git a/tools/xl/xl.h b/tools/xl/xl.h
index 967d034cfe..45745f0dbb 100644
--- a/tools/xl/xl.h
+++ b/tools/xl/xl.h
@@ -129,6 +129,8 @@ int main_restore(int argc, char **argv);
int main_migrate_receive(int argc, char **argv);
int main_save(int argc, char **argv);
int main_migrate(int argc, char **argv);
+int main_suspend(int argc, char **argv);
+int main_resume(int argc, char **argv);
#endif
int main_dump_core(int argc, char **argv);
int main_pause(int argc, char **argv);
diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index 53fc22d344..06a0039718 100644
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -193,6 +193,16 @@ const struct cmd_spec cmd_table[] = {
"Restore a domain from a saved state",
"- for internal use only",
},
+ { "suspend",
+ &main_suspend, 0, 1,
+ "Suspend a domain to RAM",
+ "<Domain>",
+ },
+ { "resume",
+ &main_resume, 0, 1,
+ "Resume a domain from RAM",
+ "<Domain>",
+ },
#endif
{ "dump-core",
&main_dump_core, 0, 1,
diff --git a/tools/xl/xl_vmcontrol.c b/tools/xl/xl_vmcontrol.c
index c45d497c28..3160966972 100644
--- a/tools/xl/xl_vmcontrol.c
+++ b/tools/xl/xl_vmcontrol.c
@@ -42,6 +42,16 @@ static void unpause_domain(uint32_t domid)
libxl_domain_unpause(ctx, domid, NULL);
}
+static void suspend_domain(uint32_t domid)
+{
+ libxl_domain_suspend_only(ctx, domid, NULL);
+}
+
+static void resume_domain(uint32_t domid)
+{
+ libxl_domain_resume(ctx, domid, 1, NULL);
+}
+
static void destroy_domain(uint32_t domid, int force)
{
int rc;
@@ -82,6 +92,32 @@ int main_unpause(int argc, char **argv)
return EXIT_SUCCESS;
}
+int main_suspend(int argc, char **argv)
+{
+ int opt;
+
+ SWITCH_FOREACH_OPT(opt, "", NULL, "suspend", 1) {
+ /* No options */
+ }
+
+ suspend_domain(find_domain(argv[optind]));
+
+ return EXIT_SUCCESS;
+}
+
+int main_resume(int argc, char **argv)
+{
+ int opt;
+
+ SWITCH_FOREACH_OPT(opt, "", NULL, "resume", 1) {
+ /* No options */
+ }
+
+ resume_domain(find_domain(argv[optind]));
+
+ return EXIT_SUCCESS;
+}
+
int main_destroy(int argc, char **argv)
{
int opt;
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] xl: Keep monitoring suspended domain
2024-11-26 17:19 ` [PATCH v2 1/2] xl: Keep monitoring suspended domain Jason Andryuk
@ 2024-11-28 16:45 ` Anthony PERARD
2024-12-02 8:53 ` Jan Beulich
0 siblings, 1 reply; 8+ messages in thread
From: Anthony PERARD @ 2024-11-28 16:45 UTC (permalink / raw)
To: Jason Andryuk; +Cc: xen-devel
On Tue, Nov 26, 2024 at 12:19:40PM -0500, Jason Andryuk wrote:
> When a VM transitioned to LIBXL_SHUTDOWN_REASON_SUSPEND, the xl daemon
> was exiting as 0 = DOMAIN_RESTART_NONE "No domain restart".
> Later, when the VM actually shutdown, the missing xl daemon meant the
> domain wasn't cleaned up properly.
>
> Add a new DOMAIN_RESTART_SUSPENDED to handle the case. The xl daemon
> keeps running to react to future shutdown events.
>
> The domain death event needs to be re-enabled to catch subsequent
> events. The libxl_evgen_domain_death is moved from death_list to
> death_reported, and then it isn't found on subsequent iterations through
> death_list. We enable the new event before disabling the old event, to
> keep the xenstore watch active. If it is unregistered and
> re-registered, it'll fire immediately for our suspended domain which
> will end up continuously re-triggering.
>
> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
Thanks,
--
Anthony Perard | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/2] tools/xl: add suspend and resume subcommands
2024-11-26 17:19 ` [PATCH v2 2/2] tools/xl: add suspend and resume subcommands Jason Andryuk
@ 2024-11-28 17:19 ` Anthony PERARD
2024-12-02 23:30 ` Jason Andryuk
0 siblings, 1 reply; 8+ messages in thread
From: Anthony PERARD @ 2024-11-28 17:19 UTC (permalink / raw)
To: Jason Andryuk
Cc: xen-devel, zithro / Cyril Rébert, Andrew Cooper,
Marek Marczykowski-Górecki
On Tue, Nov 26, 2024 at 12:19:41PM -0500, Jason Andryuk wrote:
> diff --git a/tools/xl/xl_vmcontrol.c b/tools/xl/xl_vmcontrol.c
> index c45d497c28..3160966972 100644
> --- a/tools/xl/xl_vmcontrol.c
> +++ b/tools/xl/xl_vmcontrol.c
> @@ -42,6 +42,16 @@ static void unpause_domain(uint32_t domid)
> libxl_domain_unpause(ctx, domid, NULL);
> }
>
> +static void suspend_domain(uint32_t domid)
> +{
> + libxl_domain_suspend_only(ctx, domid, NULL);
> +}
> +
> +static void resume_domain(uint32_t domid)
> +{
> + libxl_domain_resume(ctx, domid, 1, NULL);
> +}
> +
> static void destroy_domain(uint32_t domid, int force)
> {
> int rc;
> @@ -82,6 +92,32 @@ int main_unpause(int argc, char **argv)
> return EXIT_SUCCESS;
> }
>
> +int main_suspend(int argc, char **argv)
> +{
> + int opt;
> +
> + SWITCH_FOREACH_OPT(opt, "", NULL, "suspend", 1) {
> + /* No options */
> + }
> +
> + suspend_domain(find_domain(argv[optind]));
> +
> + return EXIT_SUCCESS;
> +}
> +
> +int main_resume(int argc, char **argv)
> +{
> + int opt;
> +
> + SWITCH_FOREACH_OPT(opt, "", NULL, "resume", 1) {
> + /* No options */
> + }
> +
> + resume_domain(find_domain(argv[optind]));
> +
> + return EXIT_SUCCESS;
> +}
These four new functions in xl_vmcontrol.c needs to be hidden behind
LIBXL_HAVE_NO_SUSPEND_RESUME, like the whole xl_migrate.c file is.
Both prototypes for main_*() are already hidden as well as the new
command in xl_cmdtables.
Or alternatively, we could probably have the command been present on
Arm, but I don't know if libxl_domain_suspend_only() is going to work.
It looks like it only depends on the hypervisor. I can't find any logic
that would treat Arm differently, besides the presence of
LIBXL_HAVE_NO_SUSPEND_RESUME.
But best bet would be to hide those four functions when
LIBXL_HAVE_NO_SUSPEND_RESUME is defined.
Thanks,
--
Anthony Perard | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] xl: Keep monitoring suspended domain
2024-11-28 16:45 ` Anthony PERARD
@ 2024-12-02 8:53 ` Jan Beulich
2024-12-02 23:30 ` Jason Andryuk
0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2024-12-02 8:53 UTC (permalink / raw)
To: Anthony PERARD, Jason Andryuk; +Cc: xen-devel
On 28.11.2024 17:45, Anthony PERARD wrote:
> On Tue, Nov 26, 2024 at 12:19:40PM -0500, Jason Andryuk wrote:
>> When a VM transitioned to LIBXL_SHUTDOWN_REASON_SUSPEND, the xl daemon
>> was exiting as 0 = DOMAIN_RESTART_NONE "No domain restart".
>> Later, when the VM actually shutdown, the missing xl daemon meant the
>> domain wasn't cleaned up properly.
>>
>> Add a new DOMAIN_RESTART_SUSPENDED to handle the case. The xl daemon
>> keeps running to react to future shutdown events.
>>
>> The domain death event needs to be re-enabled to catch subsequent
>> events. The libxl_evgen_domain_death is moved from death_list to
>> death_reported, and then it isn't found on subsequent iterations through
>> death_list. We enable the new event before disabling the old event, to
>> keep the xenstore watch active. If it is unregistered and
>> re-registered, it'll fire immediately for our suspended domain which
>> will end up continuously re-triggering.
>>
>> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
>
> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
While committing I was wondering: Does this want/need backporting (and hence
was it perhaps lacking a Fixes: tag)?
Jan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] xl: Keep monitoring suspended domain
2024-12-02 8:53 ` Jan Beulich
@ 2024-12-02 23:30 ` Jason Andryuk
0 siblings, 0 replies; 8+ messages in thread
From: Jason Andryuk @ 2024-12-02 23:30 UTC (permalink / raw)
To: Jan Beulich, Anthony PERARD; +Cc: xen-devel
On 2024-12-02 03:53, Jan Beulich wrote:
> On 28.11.2024 17:45, Anthony PERARD wrote:
>> On Tue, Nov 26, 2024 at 12:19:40PM -0500, Jason Andryuk wrote:
>>> When a VM transitioned to LIBXL_SHUTDOWN_REASON_SUSPEND, the xl daemon
>>> was exiting as 0 = DOMAIN_RESTART_NONE "No domain restart".
>>> Later, when the VM actually shutdown, the missing xl daemon meant the
>>> domain wasn't cleaned up properly.
>>>
>>> Add a new DOMAIN_RESTART_SUSPENDED to handle the case. The xl daemon
>>> keeps running to react to future shutdown events.
>>>
>>> The domain death event needs to be re-enabled to catch subsequent
>>> events. The libxl_evgen_domain_death is moved from death_list to
>>> death_reported, and then it isn't found on subsequent iterations through
>>> death_list. We enable the new event before disabling the old event, to
>>> keep the xenstore watch active. If it is unregistered and
>>> re-registered, it'll fire immediately for our suspended domain which
>>> will end up continuously re-triggering.
>>>
>>> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
>>
>> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
>
> While committing I was wondering: Does this want/need backporting (and hence
> was it perhaps lacking a Fixes: tag)?
Thanks, Jan.
I don't think it's really worth backporting. Mainly, it hasn't been an
issue in the last 14 years. A Linux domU doesn't suspend itself - it
only does so in response to a xenstore watch. A domU *could* suspend
itself without the xenstore watch, but that doesn't seem to happen in
practice. Since xl has not been able to generate those xenstore events
prior to the `xl suspend` introduction, this code path hasn't run or
been an issue.
The tag would be:
Fixes: 1a0e17891f ("xl: support on_{poweroff,reboot,crash} domain
configuration options.")
Regards,
Jason
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/2] tools/xl: add suspend and resume subcommands
2024-11-28 17:19 ` Anthony PERARD
@ 2024-12-02 23:30 ` Jason Andryuk
0 siblings, 0 replies; 8+ messages in thread
From: Jason Andryuk @ 2024-12-02 23:30 UTC (permalink / raw)
To: Anthony PERARD
Cc: xen-devel, zithro / Cyril Rébert, Andrew Cooper,
Marek Marczykowski-Górecki
On 2024-11-28 12:19, Anthony PERARD wrote:
> On Tue, Nov 26, 2024 at 12:19:41PM -0500, Jason Andryuk wrote:
>> diff --git a/tools/xl/xl_vmcontrol.c b/tools/xl/xl_vmcontrol.c
>> index c45d497c28..3160966972 100644
>> --- a/tools/xl/xl_vmcontrol.c
>> +++ b/tools/xl/xl_vmcontrol.c
>> @@ -42,6 +42,16 @@ static void unpause_domain(uint32_t domid)
>> libxl_domain_unpause(ctx, domid, NULL);
>> }
>>
>> +static void suspend_domain(uint32_t domid)
>> +{
>> + libxl_domain_suspend_only(ctx, domid, NULL);
>> +}
>> +
>> +static void resume_domain(uint32_t domid)
>> +{
>> + libxl_domain_resume(ctx, domid, 1, NULL);
>> +}
>> +
>> static void destroy_domain(uint32_t domid, int force)
>> {
>> int rc;
>> @@ -82,6 +92,32 @@ int main_unpause(int argc, char **argv)
>> return EXIT_SUCCESS;
>> }
>>
>> +int main_suspend(int argc, char **argv)
>> +{
>> + int opt;
>> +
>> + SWITCH_FOREACH_OPT(opt, "", NULL, "suspend", 1) {
>> + /* No options */
>> + }
>> +
>> + suspend_domain(find_domain(argv[optind]));
>> +
>> + return EXIT_SUCCESS;
>> +}
>> +
>> +int main_resume(int argc, char **argv)
>> +{
>> + int opt;
>> +
>> + SWITCH_FOREACH_OPT(opt, "", NULL, "resume", 1) {
>> + /* No options */
>> + }
>> +
>> + resume_domain(find_domain(argv[optind]));
>> +
>> + return EXIT_SUCCESS;
>> +}
>
> These four new functions in xl_vmcontrol.c needs to be hidden behind
> LIBXL_HAVE_NO_SUSPEND_RESUME, like the whole xl_migrate.c file is.
> Both prototypes for main_*() are already hidden as well as the new
> command in xl_cmdtables.
>
> Or alternatively, we could probably have the command been present on
> Arm, but I don't know if libxl_domain_suspend_only() is going to work.
> It looks like it only depends on the hypervisor. I can't find any logic
> that would treat Arm differently, besides the presence of
> LIBXL_HAVE_NO_SUSPEND_RESUME.
>
> But best bet would be to hide those four functions when
> LIBXL_HAVE_NO_SUSPEND_RESUME is defined.
Thanks. Yes, I'll hide them.
Regards,
Jason
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-12-02 23:30 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-26 17:19 [PATCH v2 0/2] xl suspend/resume commands Jason Andryuk
2024-11-26 17:19 ` [PATCH v2 1/2] xl: Keep monitoring suspended domain Jason Andryuk
2024-11-28 16:45 ` Anthony PERARD
2024-12-02 8:53 ` Jan Beulich
2024-12-02 23:30 ` Jason Andryuk
2024-11-26 17:19 ` [PATCH v2 2/2] tools/xl: add suspend and resume subcommands Jason Andryuk
2024-11-28 17:19 ` Anthony PERARD
2024-12-02 23:30 ` Jason Andryuk
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.