* [PATCH v2 0/2] xl suspend/resume commands @ 2024-11-26 17:19 Jason Andryuk 2024-11-26 17:19 ` [PATCH v2 1/2] xl: Keep monitoring suspended domain Jason Andryuk 2024-11-26 17:19 ` [PATCH v2 2/2] tools/xl: add suspend and resume subcommands Jason Andryuk 0 siblings, 2 replies; 8+ messages in thread From: Jason Andryuk @ 2024-11-26 17:19 UTC (permalink / raw) To: xen-devel; +Cc: Jason Andryuk, Anthony PERARD This is a v2 of Cyril's implementation of xl suspend/resume. This is a cooperative operation like libvirt's suspend. I renamed suspend-to-ram to just suspend. xl also needed a fixup to avoid exiting when the domain suspends. Jason Andryuk (1): xl: Keep monitoring suspended domain zithro / Cyril Rébert (1): tools/xl: add suspend and resume subcommands docs/man/xl.1.pod.in | 12 +++++++++ tools/xl/xl.h | 3 +++ tools/xl/xl_cmdtable.c | 10 ++++++++ tools/xl/xl_vmcontrol.c | 54 ++++++++++++++++++++++++++++++++++++++++- 4 files changed, 78 insertions(+), 1 deletion(-) -- 2.34.1 ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2 1/2] xl: Keep monitoring suspended domain 2024-11-26 17:19 [PATCH v2 0/2] xl suspend/resume commands Jason Andryuk @ 2024-11-26 17:19 ` Jason Andryuk 2024-11-28 16:45 ` Anthony PERARD 2024-11-26 17:19 ` [PATCH v2 2/2] tools/xl: add suspend and resume subcommands Jason Andryuk 1 sibling, 1 reply; 8+ messages in thread From: Jason Andryuk @ 2024-11-26 17:19 UTC (permalink / raw) To: xen-devel; +Cc: Jason Andryuk, Anthony PERARD When a VM transitioned to LIBXL_SHUTDOWN_REASON_SUSPEND, the xl daemon was exiting as 0 = DOMAIN_RESTART_NONE "No domain restart". Later, when the VM actually shutdown, the missing xl daemon meant the domain wasn't cleaned up properly. Add a new DOMAIN_RESTART_SUSPENDED to handle the case. The xl daemon keeps running to react to future shutdown events. The domain death event needs to be re-enabled to catch subsequent events. The libxl_evgen_domain_death is moved from death_list to death_reported, and then it isn't found on subsequent iterations through death_list. We enable the new event before disabling the old event, to keep the xenstore watch active. If it is unregistered and re-registered, it'll fire immediately for our suspended domain which will end up continuously re-triggering. Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> --- tools/xl/xl.h | 1 + tools/xl/xl_vmcontrol.c | 18 +++++++++++++++++- 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/tools/xl/xl.h b/tools/xl/xl.h index 9c86bb1d98..967d034cfe 100644 --- a/tools/xl/xl.h +++ b/tools/xl/xl.h @@ -301,6 +301,7 @@ typedef enum { DOMAIN_RESTART_NORMAL, /* Domain should be restarted */ DOMAIN_RESTART_RENAME, /* Domain should be renamed and restarted */ DOMAIN_RESTART_SOFT_RESET, /* Soft reset should be performed */ + DOMAIN_RESTART_SUSPENDED, /* Domain suspended - keep looping */ } domain_restart_type; extern void printf_info_sexp(int domid, libxl_domain_config *d_config, FILE *fh); diff --git a/tools/xl/xl_vmcontrol.c b/tools/xl/xl_vmcontrol.c index fa1a4420e3..c45d497c28 100644 --- a/tools/xl/xl_vmcontrol.c +++ b/tools/xl/xl_vmcontrol.c @@ -417,7 +417,7 @@ static domain_restart_type handle_domain_death(uint32_t *r_domid, break; case LIBXL_SHUTDOWN_REASON_SUSPEND: LOG("Domain has suspended."); - return 0; + return DOMAIN_RESTART_SUSPENDED; case LIBXL_SHUTDOWN_REASON_CRASH: action = d_config->on_crash; break; @@ -1030,6 +1030,7 @@ start: } } while (1) { + libxl_evgen_domain_death *deathw2 = NULL; libxl_event *event; ret = domain_wait_event(domid, &event); if (ret) goto out; @@ -1100,9 +1101,24 @@ start: ret = 0; goto out; + case DOMAIN_RESTART_SUSPENDED: + LOG("Continue waiting for domain %u", domid); + /* + * Enable a new event before disabling the old. This ensures + * the xenstore watch remains active. Otherwise it'll fire + * immediately on re-registration and find our suspended domain. + */ + ret = libxl_evenable_domain_death(ctx, domid, 0, &deathw2); + if (ret) goto out; + libxl_evdisable_domain_death(ctx, deathw); + deathw = deathw2; + deathw2 = NULL; + break; + default: abort(); } + break; case LIBXL_EVENT_TYPE_DOMAIN_DEATH: LOG("Domain %u has been destroyed.", domid); -- 2.34.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] xl: Keep monitoring suspended domain 2024-11-26 17:19 ` [PATCH v2 1/2] xl: Keep monitoring suspended domain Jason Andryuk @ 2024-11-28 16:45 ` Anthony PERARD 2024-12-02 8:53 ` Jan Beulich 0 siblings, 1 reply; 8+ messages in thread From: Anthony PERARD @ 2024-11-28 16:45 UTC (permalink / raw) To: Jason Andryuk; +Cc: xen-devel On Tue, Nov 26, 2024 at 12:19:40PM -0500, Jason Andryuk wrote: > When a VM transitioned to LIBXL_SHUTDOWN_REASON_SUSPEND, the xl daemon > was exiting as 0 = DOMAIN_RESTART_NONE "No domain restart". > Later, when the VM actually shutdown, the missing xl daemon meant the > domain wasn't cleaned up properly. > > Add a new DOMAIN_RESTART_SUSPENDED to handle the case. The xl daemon > keeps running to react to future shutdown events. > > The domain death event needs to be re-enabled to catch subsequent > events. The libxl_evgen_domain_death is moved from death_list to > death_reported, and then it isn't found on subsequent iterations through > death_list. We enable the new event before disabling the old event, to > keep the xenstore watch active. If it is unregistered and > re-registered, it'll fire immediately for our suspended domain which > will end up continuously re-triggering. > > Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Thanks, -- Anthony Perard | Vates XCP-ng Developer XCP-ng & Xen Orchestra - Vates solutions web: https://vates.tech ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] xl: Keep monitoring suspended domain 2024-11-28 16:45 ` Anthony PERARD @ 2024-12-02 8:53 ` Jan Beulich 2024-12-02 23:30 ` Jason Andryuk 0 siblings, 1 reply; 8+ messages in thread From: Jan Beulich @ 2024-12-02 8:53 UTC (permalink / raw) To: Anthony PERARD, Jason Andryuk; +Cc: xen-devel On 28.11.2024 17:45, Anthony PERARD wrote: > On Tue, Nov 26, 2024 at 12:19:40PM -0500, Jason Andryuk wrote: >> When a VM transitioned to LIBXL_SHUTDOWN_REASON_SUSPEND, the xl daemon >> was exiting as 0 = DOMAIN_RESTART_NONE "No domain restart". >> Later, when the VM actually shutdown, the missing xl daemon meant the >> domain wasn't cleaned up properly. >> >> Add a new DOMAIN_RESTART_SUSPENDED to handle the case. The xl daemon >> keeps running to react to future shutdown events. >> >> The domain death event needs to be re-enabled to catch subsequent >> events. The libxl_evgen_domain_death is moved from death_list to >> death_reported, and then it isn't found on subsequent iterations through >> death_list. We enable the new event before disabling the old event, to >> keep the xenstore watch active. If it is unregistered and >> re-registered, it'll fire immediately for our suspended domain which >> will end up continuously re-triggering. >> >> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> > > Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> While committing I was wondering: Does this want/need backporting (and hence was it perhaps lacking a Fixes: tag)? Jan ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] xl: Keep monitoring suspended domain 2024-12-02 8:53 ` Jan Beulich @ 2024-12-02 23:30 ` Jason Andryuk 0 siblings, 0 replies; 8+ messages in thread From: Jason Andryuk @ 2024-12-02 23:30 UTC (permalink / raw) To: Jan Beulich, Anthony PERARD; +Cc: xen-devel On 2024-12-02 03:53, Jan Beulich wrote: > On 28.11.2024 17:45, Anthony PERARD wrote: >> On Tue, Nov 26, 2024 at 12:19:40PM -0500, Jason Andryuk wrote: >>> When a VM transitioned to LIBXL_SHUTDOWN_REASON_SUSPEND, the xl daemon >>> was exiting as 0 = DOMAIN_RESTART_NONE "No domain restart". >>> Later, when the VM actually shutdown, the missing xl daemon meant the >>> domain wasn't cleaned up properly. >>> >>> Add a new DOMAIN_RESTART_SUSPENDED to handle the case. The xl daemon >>> keeps running to react to future shutdown events. >>> >>> The domain death event needs to be re-enabled to catch subsequent >>> events. The libxl_evgen_domain_death is moved from death_list to >>> death_reported, and then it isn't found on subsequent iterations through >>> death_list. We enable the new event before disabling the old event, to >>> keep the xenstore watch active. If it is unregistered and >>> re-registered, it'll fire immediately for our suspended domain which >>> will end up continuously re-triggering. >>> >>> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> >> >> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> > > While committing I was wondering: Does this want/need backporting (and hence > was it perhaps lacking a Fixes: tag)? Thanks, Jan. I don't think it's really worth backporting. Mainly, it hasn't been an issue in the last 14 years. A Linux domU doesn't suspend itself - it only does so in response to a xenstore watch. A domU *could* suspend itself without the xenstore watch, but that doesn't seem to happen in practice. Since xl has not been able to generate those xenstore events prior to the `xl suspend` introduction, this code path hasn't run or been an issue. The tag would be: Fixes: 1a0e17891f ("xl: support on_{poweroff,reboot,crash} domain configuration options.") Regards, Jason ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2 2/2] tools/xl: add suspend and resume subcommands 2024-11-26 17:19 [PATCH v2 0/2] xl suspend/resume commands Jason Andryuk 2024-11-26 17:19 ` [PATCH v2 1/2] xl: Keep monitoring suspended domain Jason Andryuk @ 2024-11-26 17:19 ` Jason Andryuk 2024-11-28 17:19 ` Anthony PERARD 1 sibling, 1 reply; 8+ messages in thread From: Jason Andryuk @ 2024-11-26 17:19 UTC (permalink / raw) To: xen-devel Cc: zithro / Cyril Rébert, Anthony PERARD, Andrew Cooper, Marek Marczykowski-Górecki, Jason Andryuk From: zithro / Cyril Rébert <slack@rabbit.lu> The xl command doesn't provide suspend/resume, so add them : xl suspend <Domain> xl resume <Domain> This patch follows a discussion on XenDevel: when you want the virtualized equivalent of "sleep"-ing a host, it's better to suspend/resume than to pause/unpause a domain. Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Suggested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Cyril Rébert (zithro) <slack@rabbit.lu> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> --- Rename command to just "suspend" Move inside the HAVE_NO_SUSPEND_RESUME --- docs/man/xl.1.pod.in | 12 ++++++++++++ tools/xl/xl.h | 2 ++ tools/xl/xl_cmdtable.c | 10 ++++++++++ tools/xl/xl_vmcontrol.c | 36 ++++++++++++++++++++++++++++++++++++ 4 files changed, 60 insertions(+) diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in index bed8393473..fe38724b2b 100644 --- a/docs/man/xl.1.pod.in +++ b/docs/man/xl.1.pod.in @@ -682,6 +682,10 @@ Pass the VNC password to vncviewer via stdin. =back +=item B<resume> I<domain-id> + +Resume a domain, after having been suspended. + =item B<save> [I<OPTIONS>] I<domain-id> I<checkpointfile> [I<configfile>] Saves a running domain to a state file so that it can be restored @@ -760,6 +764,14 @@ in response to this event. =back +=item B<suspend> I<domain-id> + +Suspend a domain. This is a cooperative operation where the domain must +respond to the xenstore trigger. When in a suspended state the domain +still consumes allocated resources (such as memory), but is not eligible +for scheduling by the Xen hypervisor. It is in a shutdown state, but +not dying. + =item B<sysrq> I<domain-id> I<letter> Send a <Magic System Request> to the domain, each type of request is diff --git a/tools/xl/xl.h b/tools/xl/xl.h index 967d034cfe..45745f0dbb 100644 --- a/tools/xl/xl.h +++ b/tools/xl/xl.h @@ -129,6 +129,8 @@ int main_restore(int argc, char **argv); int main_migrate_receive(int argc, char **argv); int main_save(int argc, char **argv); int main_migrate(int argc, char **argv); +int main_suspend(int argc, char **argv); +int main_resume(int argc, char **argv); #endif int main_dump_core(int argc, char **argv); int main_pause(int argc, char **argv); diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c index 53fc22d344..06a0039718 100644 --- a/tools/xl/xl_cmdtable.c +++ b/tools/xl/xl_cmdtable.c @@ -193,6 +193,16 @@ const struct cmd_spec cmd_table[] = { "Restore a domain from a saved state", "- for internal use only", }, + { "suspend", + &main_suspend, 0, 1, + "Suspend a domain to RAM", + "<Domain>", + }, + { "resume", + &main_resume, 0, 1, + "Resume a domain from RAM", + "<Domain>", + }, #endif { "dump-core", &main_dump_core, 0, 1, diff --git a/tools/xl/xl_vmcontrol.c b/tools/xl/xl_vmcontrol.c index c45d497c28..3160966972 100644 --- a/tools/xl/xl_vmcontrol.c +++ b/tools/xl/xl_vmcontrol.c @@ -42,6 +42,16 @@ static void unpause_domain(uint32_t domid) libxl_domain_unpause(ctx, domid, NULL); } +static void suspend_domain(uint32_t domid) +{ + libxl_domain_suspend_only(ctx, domid, NULL); +} + +static void resume_domain(uint32_t domid) +{ + libxl_domain_resume(ctx, domid, 1, NULL); +} + static void destroy_domain(uint32_t domid, int force) { int rc; @@ -82,6 +92,32 @@ int main_unpause(int argc, char **argv) return EXIT_SUCCESS; } +int main_suspend(int argc, char **argv) +{ + int opt; + + SWITCH_FOREACH_OPT(opt, "", NULL, "suspend", 1) { + /* No options */ + } + + suspend_domain(find_domain(argv[optind])); + + return EXIT_SUCCESS; +} + +int main_resume(int argc, char **argv) +{ + int opt; + + SWITCH_FOREACH_OPT(opt, "", NULL, "resume", 1) { + /* No options */ + } + + resume_domain(find_domain(argv[optind])); + + return EXIT_SUCCESS; +} + int main_destroy(int argc, char **argv) { int opt; -- 2.34.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/2] tools/xl: add suspend and resume subcommands 2024-11-26 17:19 ` [PATCH v2 2/2] tools/xl: add suspend and resume subcommands Jason Andryuk @ 2024-11-28 17:19 ` Anthony PERARD 2024-12-02 23:30 ` Jason Andryuk 0 siblings, 1 reply; 8+ messages in thread From: Anthony PERARD @ 2024-11-28 17:19 UTC (permalink / raw) To: Jason Andryuk Cc: xen-devel, zithro / Cyril Rébert, Andrew Cooper, Marek Marczykowski-Górecki On Tue, Nov 26, 2024 at 12:19:41PM -0500, Jason Andryuk wrote: > diff --git a/tools/xl/xl_vmcontrol.c b/tools/xl/xl_vmcontrol.c > index c45d497c28..3160966972 100644 > --- a/tools/xl/xl_vmcontrol.c > +++ b/tools/xl/xl_vmcontrol.c > @@ -42,6 +42,16 @@ static void unpause_domain(uint32_t domid) > libxl_domain_unpause(ctx, domid, NULL); > } > > +static void suspend_domain(uint32_t domid) > +{ > + libxl_domain_suspend_only(ctx, domid, NULL); > +} > + > +static void resume_domain(uint32_t domid) > +{ > + libxl_domain_resume(ctx, domid, 1, NULL); > +} > + > static void destroy_domain(uint32_t domid, int force) > { > int rc; > @@ -82,6 +92,32 @@ int main_unpause(int argc, char **argv) > return EXIT_SUCCESS; > } > > +int main_suspend(int argc, char **argv) > +{ > + int opt; > + > + SWITCH_FOREACH_OPT(opt, "", NULL, "suspend", 1) { > + /* No options */ > + } > + > + suspend_domain(find_domain(argv[optind])); > + > + return EXIT_SUCCESS; > +} > + > +int main_resume(int argc, char **argv) > +{ > + int opt; > + > + SWITCH_FOREACH_OPT(opt, "", NULL, "resume", 1) { > + /* No options */ > + } > + > + resume_domain(find_domain(argv[optind])); > + > + return EXIT_SUCCESS; > +} These four new functions in xl_vmcontrol.c needs to be hidden behind LIBXL_HAVE_NO_SUSPEND_RESUME, like the whole xl_migrate.c file is. Both prototypes for main_*() are already hidden as well as the new command in xl_cmdtables. Or alternatively, we could probably have the command been present on Arm, but I don't know if libxl_domain_suspend_only() is going to work. It looks like it only depends on the hypervisor. I can't find any logic that would treat Arm differently, besides the presence of LIBXL_HAVE_NO_SUSPEND_RESUME. But best bet would be to hide those four functions when LIBXL_HAVE_NO_SUSPEND_RESUME is defined. Thanks, -- Anthony Perard | Vates XCP-ng Developer XCP-ng & Xen Orchestra - Vates solutions web: https://vates.tech ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/2] tools/xl: add suspend and resume subcommands 2024-11-28 17:19 ` Anthony PERARD @ 2024-12-02 23:30 ` Jason Andryuk 0 siblings, 0 replies; 8+ messages in thread From: Jason Andryuk @ 2024-12-02 23:30 UTC (permalink / raw) To: Anthony PERARD Cc: xen-devel, zithro / Cyril Rébert, Andrew Cooper, Marek Marczykowski-Górecki On 2024-11-28 12:19, Anthony PERARD wrote: > On Tue, Nov 26, 2024 at 12:19:41PM -0500, Jason Andryuk wrote: >> diff --git a/tools/xl/xl_vmcontrol.c b/tools/xl/xl_vmcontrol.c >> index c45d497c28..3160966972 100644 >> --- a/tools/xl/xl_vmcontrol.c >> +++ b/tools/xl/xl_vmcontrol.c >> @@ -42,6 +42,16 @@ static void unpause_domain(uint32_t domid) >> libxl_domain_unpause(ctx, domid, NULL); >> } >> >> +static void suspend_domain(uint32_t domid) >> +{ >> + libxl_domain_suspend_only(ctx, domid, NULL); >> +} >> + >> +static void resume_domain(uint32_t domid) >> +{ >> + libxl_domain_resume(ctx, domid, 1, NULL); >> +} >> + >> static void destroy_domain(uint32_t domid, int force) >> { >> int rc; >> @@ -82,6 +92,32 @@ int main_unpause(int argc, char **argv) >> return EXIT_SUCCESS; >> } >> >> +int main_suspend(int argc, char **argv) >> +{ >> + int opt; >> + >> + SWITCH_FOREACH_OPT(opt, "", NULL, "suspend", 1) { >> + /* No options */ >> + } >> + >> + suspend_domain(find_domain(argv[optind])); >> + >> + return EXIT_SUCCESS; >> +} >> + >> +int main_resume(int argc, char **argv) >> +{ >> + int opt; >> + >> + SWITCH_FOREACH_OPT(opt, "", NULL, "resume", 1) { >> + /* No options */ >> + } >> + >> + resume_domain(find_domain(argv[optind])); >> + >> + return EXIT_SUCCESS; >> +} > > These four new functions in xl_vmcontrol.c needs to be hidden behind > LIBXL_HAVE_NO_SUSPEND_RESUME, like the whole xl_migrate.c file is. > Both prototypes for main_*() are already hidden as well as the new > command in xl_cmdtables. > > Or alternatively, we could probably have the command been present on > Arm, but I don't know if libxl_domain_suspend_only() is going to work. > It looks like it only depends on the hypervisor. I can't find any logic > that would treat Arm differently, besides the presence of > LIBXL_HAVE_NO_SUSPEND_RESUME. > > But best bet would be to hide those four functions when > LIBXL_HAVE_NO_SUSPEND_RESUME is defined. Thanks. Yes, I'll hide them. Regards, Jason ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-12-02 23:30 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-11-26 17:19 [PATCH v2 0/2] xl suspend/resume commands Jason Andryuk 2024-11-26 17:19 ` [PATCH v2 1/2] xl: Keep monitoring suspended domain Jason Andryuk 2024-11-28 16:45 ` Anthony PERARD 2024-12-02 8:53 ` Jan Beulich 2024-12-02 23:30 ` Jason Andryuk 2024-11-26 17:19 ` [PATCH v2 2/2] tools/xl: add suspend and resume subcommands Jason Andryuk 2024-11-28 17:19 ` Anthony PERARD 2024-12-02 23:30 ` Jason Andryuk
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.