* Ping: [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
2025-12-12 13:18 ` [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests Mykola Kvach
@ 2026-01-14 6:00 ` Mykola Kvach
2026-01-15 9:03 ` Orzel, Michal
` (2 subsequent siblings)
3 siblings, 0 replies; 15+ messages in thread
From: Mykola Kvach @ 2026-01-14 6:00 UTC (permalink / raw)
To: Xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Jan Beulich, Roger Pau Monné
Hi all,
Friendly ping on this series.
I believe all previously received comments have been addressed in this
revision, and I’m not aware of any remaining open issues.
Could you please take another look when you have a moment? If it looks good,
an Acked-by or Reviewed-by would be very appreciated; otherwise, I’m happy
to iterate on any further feedback.
Best regards,
Mykola
On Fri, Dec 12, 2025 at 3:22 PM Mykola Kvach <xakep.amatop@gmail.com> wrote:
>
> From: Mykola Kvach <mykola_kvach@epam.com>
>
> Add support for the PSCI SYSTEM_SUSPEND function in the vPSCI interface,
> allowing guests to request suspend via the PSCI v1.0+ SYSTEM_SUSPEND call
> (both 32-bit and 64-bit variants).
>
> Implementation details:
> - Add SYSTEM_SUSPEND function IDs to PSCI definitions
> - Trap and handle SYSTEM_SUSPEND in vPSCI
> - Allow only non-hardware domains to invoke SYSTEM_SUSPEND; return
> PSCI_NOT_SUPPORTED for the hardware domain to avoid halting the system
> in hwdom_shutdown() via domain_shutdown
> - Require all secondary VCPUs of the calling domain to be offline before
> suspend, as mandated by the PSCI specification
>
> The arch_domain_resume() function is an architecture-specific hook that is
> invoked during domain resume to perform any necessary setup or restoration
> steps required by the platform. arch_domain_resume() stays int to propagate
> errno-style detail into common logging; preserving the integer keeps the
> reason visible and leaves room for future arch-specific failures or richer
> handling.
>
> The new vpsci_vcpu_up_prepare() helper is called on the resume path to set up
> the vCPU context (such as entry point, some system regs and context ID) before
> resuming a suspended guest. This keeps ARM/vPSCI-specific logic out of common
> code and avoids intrusive changes to the generic resume flow.
>
> Usage:
>
> For Linux-based guests, suspend can be initiated with:
> echo mem > /sys/power/state
> or via:
> systemctl suspend
>
> Resuming the guest is performed from control domain using:
> xl resume <domain>
>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in V16:
> - Refactor error handling in domain_resume: move logging to generic code,
> use explicit return code checking.
> - Make context clearing conditional on success in arch_domain_resume.
> - The 'int' return type is retained for arch_domain_resume for consistency
> with other arch hooks and to allow for specific negative error codes.
> ---
> xen/arch/arm/domain.c | 39 +++++++++
> xen/arch/arm/include/asm/domain.h | 2 +
> xen/arch/arm/include/asm/perfc_defn.h | 1 +
> xen/arch/arm/include/asm/psci.h | 2 +
> xen/arch/arm/include/asm/suspend.h | 27 ++++++
> xen/arch/arm/include/asm/vpsci.h | 5 +-
> xen/arch/arm/vpsci.c | 116 +++++++++++++++++++++-----
> xen/common/domain.c | 10 +++
> xen/include/xen/suspend.h | 25 ++++++
> 9 files changed, 205 insertions(+), 22 deletions(-)
> create mode 100644 xen/arch/arm/include/asm/suspend.h
> create mode 100644 xen/include/xen/suspend.h
>
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 47973f99d9..f903e7d4f0 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -12,6 +12,8 @@
> #include <xen/softirq.h>
> #include <xen/wait.h>
>
> +#include <public/sched.h>
> +
> #include <asm/arm64/sve.h>
> #include <asm/cpuerrata.h>
> #include <asm/cpufeature.h>
> @@ -24,10 +26,12 @@
> #include <asm/platform.h>
> #include <asm/procinfo.h>
> #include <asm/regs.h>
> +#include <asm/suspend.h>
> #include <asm/firmware/sci.h>
> #include <asm/tee/tee.h>
> #include <asm/vfp.h>
> #include <asm/vgic.h>
> +#include <asm/vpsci.h>
> #include <asm/vtimer.h>
>
> #include "vpci.h"
> @@ -851,6 +855,41 @@ void arch_domain_creation_finished(struct domain *d)
> p2m_domain_creation_finished(d);
> }
>
> +int arch_domain_resume(struct domain *d)
> +{
> + int rc;
> + struct resume_info *ctx = &d->arch.resume_ctx;
> +
> + if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend )
> + {
> + dprintk(XENLOG_WARNING,
> + "%pd: Invalid domain state for resume: is_shutting_down=%u, shutdown_code=%u\n",
> + d, d->is_shutting_down, d->shutdown_code);
> + return -EINVAL;
> + }
> +
> + /*
> + * It is still possible to call domain_shutdown() with a suspend reason
> + * via some hypercalls, such as SCHEDOP_shutdown or SCHEDOP_remote_shutdown.
> + * In these cases, the resume context will be empty.
> + * This is not expected to cause any issues, so we just notify about the
> + * situation and return without error, allowing the existing logic to
> + * proceed as expected.
> + */
> + if ( !ctx->wake_cpu )
> + {
> + dprintk(XENLOG_INFO, "%pd: Wake CPU pointer context was not provided\n",
> + d);
> + return 0;
> + }
> +
> + rc = vpsci_vcpu_up_prepare(ctx->wake_cpu , ctx->ep, ctx->cid);
> + if ( !rc )
> + memset(ctx, 0, sizeof(*ctx));
> +
> + return rc;
> +}
> +
> static int is_guest_pv32_psr(uint32_t psr)
> {
> switch (psr & PSR_MODE_MASK)
> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> index 758ad807e4..66b1246892 100644
> --- a/xen/arch/arm/include/asm/domain.h
> +++ b/xen/arch/arm/include/asm/domain.h
> @@ -5,6 +5,7 @@
> #include <xen/timer.h>
> #include <asm/page.h>
> #include <asm/p2m.h>
> +#include <asm/suspend.h>
> #include <asm/vfp.h>
> #include <asm/mmio.h>
> #include <asm/gic.h>
> @@ -126,6 +127,7 @@ struct arch_domain
> void *sci_data;
> #endif
>
> + struct resume_info resume_ctx;
> } __cacheline_aligned;
>
> struct arch_vcpu
> diff --git a/xen/arch/arm/include/asm/perfc_defn.h b/xen/arch/arm/include/asm/perfc_defn.h
> index effd25b69e..8dfcac7e3b 100644
> --- a/xen/arch/arm/include/asm/perfc_defn.h
> +++ b/xen/arch/arm/include/asm/perfc_defn.h
> @@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset, "vpsci: system_reset")
> PERFCOUNTER(vpsci_cpu_suspend, "vpsci: cpu_suspend")
> PERFCOUNTER(vpsci_cpu_affinity_info, "vpsci: cpu_affinity_info")
> PERFCOUNTER(vpsci_features, "vpsci: features")
> +PERFCOUNTER(vpsci_system_suspend, "vpsci: system_suspend")
>
> PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
>
> diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
> index 4780972621..48a93e6b79 100644
> --- a/xen/arch/arm/include/asm/psci.h
> +++ b/xen/arch/arm/include/asm/psci.h
> @@ -47,10 +47,12 @@ void call_psci_system_reset(void);
> #define PSCI_0_2_FN32_SYSTEM_OFF PSCI_0_2_FN32(8)
> #define PSCI_0_2_FN32_SYSTEM_RESET PSCI_0_2_FN32(9)
> #define PSCI_1_0_FN32_PSCI_FEATURES PSCI_0_2_FN32(10)
> +#define PSCI_1_0_FN32_SYSTEM_SUSPEND PSCI_0_2_FN32(14)
>
> #define PSCI_0_2_FN64_CPU_SUSPEND PSCI_0_2_FN64(1)
> #define PSCI_0_2_FN64_CPU_ON PSCI_0_2_FN64(3)
> #define PSCI_0_2_FN64_AFFINITY_INFO PSCI_0_2_FN64(4)
> +#define PSCI_1_0_FN64_SYSTEM_SUSPEND PSCI_0_2_FN64(14)
>
> /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
> #define PSCI_0_2_AFFINITY_LEVEL_ON 0
> diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
> new file mode 100644
> index 0000000000..313d03ea59
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/suspend.h
> @@ -0,0 +1,27 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef ARM_SUSPEND_H
> +#define ARM_SUSPEND_H
> +
> +struct domain;
> +struct vcpu;
> +
> +struct resume_info {
> + register_t ep;
> + register_t cid;
> + struct vcpu *wake_cpu;
> +};
> +
> +int arch_domain_resume(struct domain *d);
> +
> +#endif /* ARM_SUSPEND_H */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/include/asm/vpsci.h b/xen/arch/arm/include/asm/vpsci.h
> index 0cca5e6830..d790ab3715 100644
> --- a/xen/arch/arm/include/asm/vpsci.h
> +++ b/xen/arch/arm/include/asm/vpsci.h
> @@ -23,12 +23,15 @@
> #include <asm/psci.h>
>
> /* Number of function implemented by virtual PSCI (only 0.2 or later) */
> -#define VPSCI_NR_FUNCS 12
> +#define VPSCI_NR_FUNCS 14
>
> /* Functions handle PSCI calls from the guests */
> bool do_vpsci_0_1_call(struct cpu_user_regs *regs, uint32_t fid);
> bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid);
>
> +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
> + register_t context_id);
> +
> #endif /* __ASM_VPSCI_H__ */
>
> /*
> diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
> index 7ba9ccd94b..c4d616ec68 100644
> --- a/xen/arch/arm/vpsci.c
> +++ b/xen/arch/arm/vpsci.c
> @@ -10,32 +10,16 @@
>
> #include <public/sched.h>
>
> -static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> - register_t context_id)
> +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
> + register_t context_id)
> {
> - struct vcpu *v;
> - struct domain *d = current->domain;
> - struct vcpu_guest_context *ctxt;
> int rc;
> + struct domain *d = v->domain;
> bool is_thumb = entry_point & 1;
> - register_t vcpuid;
> -
> - vcpuid = vaffinity_to_vcpuid(target_cpu);
> -
> - if ( (v = domain_vcpu(d, vcpuid)) == NULL )
> - return PSCI_INVALID_PARAMETERS;
> -
> - /* THUMB set is not allowed with 64-bit domain */
> - if ( is_64bit_domain(d) && is_thumb )
> - return PSCI_INVALID_ADDRESS;
> -
> - if ( !test_bit(_VPF_down, &v->pause_flags) )
> - return PSCI_ALREADY_ON;
> + struct vcpu_guest_context *ctxt;
>
> if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
> - return PSCI_DENIED;
> -
> - vgic_clear_pending_irqs(v);
> + return -ENOMEM;
>
> memset(ctxt, 0, sizeof(*ctxt));
> ctxt->user_regs.pc64 = (u64) entry_point;
> @@ -76,8 +60,37 @@ static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> free_vcpu_guest_context(ctxt);
>
> if ( rc < 0 )
> + return rc;
> +
> + return 0;
> +}
> +
> +static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> + register_t context_id)
> +{
> + struct vcpu *v;
> + struct domain *d = current->domain;
> + int rc;
> + bool is_thumb = entry_point & 1;
> + register_t vcpuid;
> +
> + vcpuid = vaffinity_to_vcpuid(target_cpu);
> +
> + if ( (v = domain_vcpu(d, vcpuid)) == NULL )
> + return PSCI_INVALID_PARAMETERS;
> +
> + /* THUMB set is not allowed with 64-bit domain */
> + if ( is_64bit_domain(d) && is_thumb )
> + return PSCI_INVALID_ADDRESS;
> +
> + if ( !test_bit(_VPF_down, &v->pause_flags) )
> + return PSCI_ALREADY_ON;
> +
> + rc = vpsci_vcpu_up_prepare(v, entry_point, context_id);
> + if ( rc )
> return PSCI_DENIED;
>
> + vgic_clear_pending_irqs(v);
> vcpu_wake(v);
>
> return PSCI_SUCCESS;
> @@ -197,6 +210,48 @@ static void do_psci_0_2_system_reset(void)
> domain_shutdown(d,SHUTDOWN_reboot);
> }
>
> +static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
> +{
> + int32_t rc;
> + struct vcpu *v;
> + struct domain *d = current->domain;
> + bool is_thumb = epoint & 1;
> +
> + /* THUMB set is not allowed with 64-bit domain */
> + if ( is_64bit_domain(d) && is_thumb )
> + return PSCI_INVALID_ADDRESS;
> +
> + /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
> + if ( is_hardware_domain(d) )
> + return PSCI_NOT_SUPPORTED;
> +
> + /* Ensure that all CPUs other than the calling one are offline */
> + domain_lock(d);
> + for_each_vcpu ( d, v )
> + {
> + if ( v != current && is_vcpu_online(v) )
> + {
> + domain_unlock(d);
> + return PSCI_DENIED;
> + }
> + }
> + domain_unlock(d);
> +
> + rc = domain_shutdown(d, SHUTDOWN_suspend);
> + if ( rc )
> + return PSCI_DENIED;
> +
> + d->arch.resume_ctx.ep = epoint;
> + d->arch.resume_ctx.cid = cid;
> + d->arch.resume_ctx.wake_cpu = current;
> +
> + gprintk(XENLOG_DEBUG,
> + "SYSTEM_SUSPEND requested, epoint=%#"PRIregister", cid=%#"PRIregister"\n",
> + epoint, cid);
> +
> + return rc;
> +}
> +
> static int32_t do_psci_1_0_features(uint32_t psci_func_id)
> {
> /* /!\ Ordered by function ID and not name */
> @@ -214,6 +269,8 @@ static int32_t do_psci_1_0_features(uint32_t psci_func_id)
> case PSCI_0_2_FN32_SYSTEM_OFF:
> case PSCI_0_2_FN32_SYSTEM_RESET:
> case PSCI_1_0_FN32_PSCI_FEATURES:
> + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> case ARM_SMCCC_VERSION_FID:
> return 0;
> default:
> @@ -344,6 +401,23 @@ bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid)
> return true;
> }
>
> + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> + {
> + register_t epoint = PSCI_ARG(regs, 1);
> + register_t cid = PSCI_ARG(regs, 2);
> +
> + if ( fid == PSCI_1_0_FN32_SYSTEM_SUSPEND )
> + {
> + epoint &= GENMASK(31, 0);
> + cid &= GENMASK(31, 0);
> + }
> +
> + perfc_incr(vpsci_system_suspend);
> + PSCI_SET_RESULT(regs, do_psci_1_0_system_suspend(epoint, cid));
> + return true;
> + }
> +
> default:
> return false;
> }
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 93c71bc766..09ad0a26ee 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -26,6 +26,7 @@
> #include <xen/hypercall.h>
> #include <xen/delay.h>
> #include <xen/shutdown.h>
> +#include <xen/suspend.h>
> #include <xen/percpu.h>
> #include <xen/multicall.h>
> #include <xen/rcupdate.h>
> @@ -1374,6 +1375,7 @@ int domain_shutdown(struct domain *d, u8 reason)
> void domain_resume(struct domain *d)
> {
> struct vcpu *v;
> + int rc;
>
> /*
> * Some code paths assume that shutdown status does not get reset under
> @@ -1383,6 +1385,13 @@ void domain_resume(struct domain *d)
>
> spin_lock(&d->shutdown_lock);
>
> + rc = arch_domain_resume(d);
> + if ( rc )
> + {
> + printk("%pd: Failed to resume domain (ret %d)\n", d, rc);
> + goto fail;
> + }
> +
> d->is_shutting_down = d->is_shut_down = 0;
> d->shutdown_code = SHUTDOWN_CODE_INVALID;
>
> @@ -1393,6 +1402,7 @@ void domain_resume(struct domain *d)
> v->paused_for_shutdown = 0;
> }
>
> + fail:
> spin_unlock(&d->shutdown_lock);
>
> domain_unpause(d);
> diff --git a/xen/include/xen/suspend.h b/xen/include/xen/suspend.h
> new file mode 100644
> index 0000000000..528879c2a9
> --- /dev/null
> +++ b/xen/include/xen/suspend.h
> @@ -0,0 +1,25 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef XEN_SUSPEND_H
> +#define XEN_SUSPEND_H
> +
> +#if __has_include(<asm/suspend.h>)
> +#include <asm/suspend.h>
> +#else
> +static inline int arch_domain_resume(struct domain *d)
> +{
> + return 0;
> +}
> +#endif
> +
> +#endif /* XEN_SUSPEND_H */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
2025-12-12 13:18 ` [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests Mykola Kvach
2026-01-14 6:00 ` Ping: " Mykola Kvach
@ 2026-01-15 9:03 ` Orzel, Michal
2026-01-29 11:30 ` Mykola Kvach
2026-03-18 4:18 ` Volodymyr Babchuk
2026-03-19 9:36 ` Orzel, Michal
3 siblings, 1 reply; 15+ messages in thread
From: Orzel, Michal @ 2026-01-15 9:03 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Jan Beulich,
Roger Pau Monné
On 12/12/2025 14:18, Mykola Kvach wrote:
> From: Mykola Kvach <mykola_kvach@epam.com>
>
> Add support for the PSCI SYSTEM_SUSPEND function in the vPSCI interface,
> allowing guests to request suspend via the PSCI v1.0+ SYSTEM_SUSPEND call
> (both 32-bit and 64-bit variants).
>
> Implementation details:
> - Add SYSTEM_SUSPEND function IDs to PSCI definitions
> - Trap and handle SYSTEM_SUSPEND in vPSCI
> - Allow only non-hardware domains to invoke SYSTEM_SUSPEND; return
> PSCI_NOT_SUPPORTED for the hardware domain to avoid halting the system
> in hwdom_shutdown() via domain_shutdown
> - Require all secondary VCPUs of the calling domain to be offline before
> suspend, as mandated by the PSCI specification
>
> The arch_domain_resume() function is an architecture-specific hook that is
> invoked during domain resume to perform any necessary setup or restoration
> steps required by the platform. arch_domain_resume() stays int to propagate
> errno-style detail into common logging; preserving the integer keeps the
> reason visible and leaves room for future arch-specific failures or richer
> handling.
>
> The new vpsci_vcpu_up_prepare() helper is called on the resume path to set up
> the vCPU context (such as entry point, some system regs and context ID) before
> resuming a suspended guest. This keeps ARM/vPSCI-specific logic out of common
> code and avoids intrusive changes to the generic resume flow.
>
> Usage:
>
> For Linux-based guests, suspend can be initiated with:
> echo mem > /sys/power/state
> or via:
> systemctl suspend
>
> Resuming the guest is performed from control domain using:
> xl resume <domain>
>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in V16:
> - Refactor error handling in domain_resume: move logging to generic code,
> use explicit return code checking.
> - Make context clearing conditional on success in arch_domain_resume.
> - The 'int' return type is retained for arch_domain_resume for consistency
> with other arch hooks and to allow for specific negative error codes.
> ---
> xen/arch/arm/domain.c | 39 +++++++++
> xen/arch/arm/include/asm/domain.h | 2 +
> xen/arch/arm/include/asm/perfc_defn.h | 1 +
> xen/arch/arm/include/asm/psci.h | 2 +
> xen/arch/arm/include/asm/suspend.h | 27 ++++++
> xen/arch/arm/include/asm/vpsci.h | 5 +-
> xen/arch/arm/vpsci.c | 116 +++++++++++++++++++++-----
> xen/common/domain.c | 10 +++
> xen/include/xen/suspend.h | 25 ++++++
> 9 files changed, 205 insertions(+), 22 deletions(-)
> create mode 100644 xen/arch/arm/include/asm/suspend.h
> create mode 100644 xen/include/xen/suspend.h
>
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 47973f99d9..f903e7d4f0 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -12,6 +12,8 @@
> #include <xen/softirq.h>
> #include <xen/wait.h>
>
> +#include <public/sched.h>
> +
> #include <asm/arm64/sve.h>
> #include <asm/cpuerrata.h>
> #include <asm/cpufeature.h>
> @@ -24,10 +26,12 @@
> #include <asm/platform.h>
> #include <asm/procinfo.h>
> #include <asm/regs.h>
> +#include <asm/suspend.h>
> #include <asm/firmware/sci.h>
> #include <asm/tee/tee.h>
> #include <asm/vfp.h>
> #include <asm/vgic.h>
> +#include <asm/vpsci.h>
> #include <asm/vtimer.h>
>
> #include "vpci.h"
> @@ -851,6 +855,41 @@ void arch_domain_creation_finished(struct domain *d)
> p2m_domain_creation_finished(d);
> }
>
> +int arch_domain_resume(struct domain *d)
> +{
> + int rc;
> + struct resume_info *ctx = &d->arch.resume_ctx;
> +
> + if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend )
How does this check and returning -EINVAL correspond to...
> + {
> + dprintk(XENLOG_WARNING,
> + "%pd: Invalid domain state for resume: is_shutting_down=%u, shutdown_code=%u\n",
> + d, d->is_shutting_down, d->shutdown_code);
> + return -EINVAL;
> + }
> +
> + /*
> + * It is still possible to call domain_shutdown() with a suspend reason
> + * via some hypercalls, such as SCHEDOP_shutdown or SCHEDOP_remote_shutdown.
> + * In these cases, the resume context will be empty.
this comment? This patch assumes that we can now resume successfully (i.e. this
function returns 0 and common domain_resume can continue) only if the shutdown
was with SCHEDOP_shutdown. Anything else will infinitely keep the vCPUS paused.
Other than that, the patch looks good.
~Michal
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
2026-01-15 9:03 ` Orzel, Michal
@ 2026-01-29 11:30 ` Mykola Kvach
2026-03-19 9:28 ` Orzel, Michal
0 siblings, 1 reply; 15+ messages in thread
From: Mykola Kvach @ 2026-01-29 11:30 UTC (permalink / raw)
To: Orzel, Michal
Cc: xen-devel, Mykola Kvach, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné
Hi Michal,
Thanks for the review -- I think there are two separate concerns here
(domain state vs. ARM-specific resume context), and it’s easy to conflate
them.
On Thu, Jan 15, 2026 at 11:03 AM Orzel, Michal <michal.orzel@amd.com> wrote:
>
>
>
> On 12/12/2025 14:18, Mykola Kvach wrote:
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > Add support for the PSCI SYSTEM_SUSPEND function in the vPSCI interface,
> > allowing guests to request suspend via the PSCI v1.0+ SYSTEM_SUSPEND call
> > (both 32-bit and 64-bit variants).
> >
> > Implementation details:
> > - Add SYSTEM_SUSPEND function IDs to PSCI definitions
> > - Trap and handle SYSTEM_SUSPEND in vPSCI
> > - Allow only non-hardware domains to invoke SYSTEM_SUSPEND; return
> > PSCI_NOT_SUPPORTED for the hardware domain to avoid halting the system
> > in hwdom_shutdown() via domain_shutdown
> > - Require all secondary VCPUs of the calling domain to be offline before
> > suspend, as mandated by the PSCI specification
> >
> > The arch_domain_resume() function is an architecture-specific hook that is
> > invoked during domain resume to perform any necessary setup or restoration
> > steps required by the platform. arch_domain_resume() stays int to propagate
> > errno-style detail into common logging; preserving the integer keeps the
> > reason visible and leaves room for future arch-specific failures or richer
> > handling.
> >
> > The new vpsci_vcpu_up_prepare() helper is called on the resume path to set up
> > the vCPU context (such as entry point, some system regs and context ID) before
> > resuming a suspended guest. This keeps ARM/vPSCI-specific logic out of common
> > code and avoids intrusive changes to the generic resume flow.
> >
> > Usage:
> >
> > For Linux-based guests, suspend can be initiated with:
> > echo mem > /sys/power/state
> > or via:
> > systemctl suspend
> >
> > Resuming the guest is performed from control domain using:
> > xl resume <domain>
> >
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Changes in V16:
> > - Refactor error handling in domain_resume: move logging to generic code,
> > use explicit return code checking.
> > - Make context clearing conditional on success in arch_domain_resume.
> > - The 'int' return type is retained for arch_domain_resume for consistency
> > with other arch hooks and to allow for specific negative error codes.
> > ---
> > xen/arch/arm/domain.c | 39 +++++++++
> > xen/arch/arm/include/asm/domain.h | 2 +
> > xen/arch/arm/include/asm/perfc_defn.h | 1 +
> > xen/arch/arm/include/asm/psci.h | 2 +
> > xen/arch/arm/include/asm/suspend.h | 27 ++++++
> > xen/arch/arm/include/asm/vpsci.h | 5 +-
> > xen/arch/arm/vpsci.c | 116 +++++++++++++++++++++-----
> > xen/common/domain.c | 10 +++
> > xen/include/xen/suspend.h | 25 ++++++
> > 9 files changed, 205 insertions(+), 22 deletions(-)
> > create mode 100644 xen/arch/arm/include/asm/suspend.h
> > create mode 100644 xen/include/xen/suspend.h
> >
> > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> > index 47973f99d9..f903e7d4f0 100644
> > --- a/xen/arch/arm/domain.c
> > +++ b/xen/arch/arm/domain.c
> > @@ -12,6 +12,8 @@
> > #include <xen/softirq.h>
> > #include <xen/wait.h>
> >
> > +#include <public/sched.h>
> > +
> > #include <asm/arm64/sve.h>
> > #include <asm/cpuerrata.h>
> > #include <asm/cpufeature.h>
> > @@ -24,10 +26,12 @@
> > #include <asm/platform.h>
> > #include <asm/procinfo.h>
> > #include <asm/regs.h>
> > +#include <asm/suspend.h>
> > #include <asm/firmware/sci.h>
> > #include <asm/tee/tee.h>
> > #include <asm/vfp.h>
> > #include <asm/vgic.h>
> > +#include <asm/vpsci.h>
> > #include <asm/vtimer.h>
> >
> > #include "vpci.h"
> > @@ -851,6 +855,41 @@ void arch_domain_creation_finished(struct domain *d)
> > p2m_domain_creation_finished(d);
> > }
> >
> > +int arch_domain_resume(struct domain *d)
> > +{
> > + int rc;
> > + struct resume_info *ctx = &d->arch.resume_ctx;
> > +
> > + if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend )
> How does this check and returning -EINVAL correspond to...
The
if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend )
guard is meant to validate the domain state for the DOMCTL resumedomain
entry point (i.e. reject an xl resume on a domain that isn’t in the
"suspend shutdown" state). Suspend requests issued via SCHEDOP_shutdown /
SCHEDOP_remote_shutdown with reason SUSPEND still put the domain into
is_shutting_down=1 and shutdown_code=SHUTDOWN_suspend, so they do pass
this state check.
What the comment below was trying to say is different: those hypercall
paths don’t go through vPSCI SYSTEM_SUSPEND, so they don’t populate the
Arm-specific resume context (notably wake_cpu). In that case ctx->wake_cpu
remains NULL and the Arm arch_domain_resume() returns early.
>
> > + {
> > + dprintk(XENLOG_WARNING,
> > + "%pd: Invalid domain state for resume: is_shutting_down=%u, shutdown_code=%u\n",
> > + d, d->is_shutting_down, d->shutdown_code);
> > + return -EINVAL;
> > + }
> > +
> > + /*
> > + * It is still possible to call domain_shutdown() with a suspend reason
> > + * via some hypercalls, such as SCHEDOP_shutdown or SCHEDOP_remote_shutdown.
> > + * In these cases, the resume context will be empty.
> this comment? This patch assumes that we can now resume successfully (i.e. this
> function returns 0 and common domain_resume can continue) only if the shutdown
> was with SCHEDOP_shutdown. Anything else will infinitely keep the vCPUS paused.
Separately (and this is why I’m hesitant to make domain_resume()
"suspend-only"): domain_resume() is also used by the soft-reset flow.
Today soft-reset is effectively x86-only (gated by HAS_SOFT_RESET), but
the core plumbing is in common code and is intentionally generic -- the
soft-reset calling chain ends up using domain_resume() as a generic
helper. If domain_resume() itself starts rejecting anything other than
SHUTDOWN_suspend, it would also be a future trap if/when someone
enables HAS_SOFT_RESET on Arm.
>
> Other than that, the patch looks good.
>
> ~Michal
>
Best regards,
Mykola
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
2026-01-29 11:30 ` Mykola Kvach
@ 2026-03-19 9:28 ` Orzel, Michal
0 siblings, 0 replies; 15+ messages in thread
From: Orzel, Michal @ 2026-03-19 9:28 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel, Mykola Kvach, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné
On 29/01/2026 12:30, Mykola Kvach wrote:
> Hi Michal,
>
> Thanks for the review -- I think there are two separate concerns here
> (domain state vs. ARM-specific resume context), and it’s easy to conflate
> them.
>
> On Thu, Jan 15, 2026 at 11:03 AM Orzel, Michal <michal.orzel@amd.com> wrote:
>>
>>
>>
>> On 12/12/2025 14:18, Mykola Kvach wrote:
>>> From: Mykola Kvach <mykola_kvach@epam.com>
>>>
>>> Add support for the PSCI SYSTEM_SUSPEND function in the vPSCI interface,
>>> allowing guests to request suspend via the PSCI v1.0+ SYSTEM_SUSPEND call
>>> (both 32-bit and 64-bit variants).
>>>
>>> Implementation details:
>>> - Add SYSTEM_SUSPEND function IDs to PSCI definitions
>>> - Trap and handle SYSTEM_SUSPEND in vPSCI
>>> - Allow only non-hardware domains to invoke SYSTEM_SUSPEND; return
>>> PSCI_NOT_SUPPORTED for the hardware domain to avoid halting the system
>>> in hwdom_shutdown() via domain_shutdown
>>> - Require all secondary VCPUs of the calling domain to be offline before
>>> suspend, as mandated by the PSCI specification
>>>
>>> The arch_domain_resume() function is an architecture-specific hook that is
>>> invoked during domain resume to perform any necessary setup or restoration
>>> steps required by the platform. arch_domain_resume() stays int to propagate
>>> errno-style detail into common logging; preserving the integer keeps the
>>> reason visible and leaves room for future arch-specific failures or richer
>>> handling.
>>>
>>> The new vpsci_vcpu_up_prepare() helper is called on the resume path to set up
>>> the vCPU context (such as entry point, some system regs and context ID) before
>>> resuming a suspended guest. This keeps ARM/vPSCI-specific logic out of common
>>> code and avoids intrusive changes to the generic resume flow.
>>>
>>> Usage:
>>>
>>> For Linux-based guests, suspend can be initiated with:
>>> echo mem > /sys/power/state
>>> or via:
>>> systemctl suspend
>>>
>>> Resuming the guest is performed from control domain using:
>>> xl resume <domain>
>>>
>>> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
>>> ---
>>> Changes in V16:
>>> - Refactor error handling in domain_resume: move logging to generic code,
>>> use explicit return code checking.
>>> - Make context clearing conditional on success in arch_domain_resume.
>>> - The 'int' return type is retained for arch_domain_resume for consistency
>>> with other arch hooks and to allow for specific negative error codes.
>>> ---
>>> xen/arch/arm/domain.c | 39 +++++++++
>>> xen/arch/arm/include/asm/domain.h | 2 +
>>> xen/arch/arm/include/asm/perfc_defn.h | 1 +
>>> xen/arch/arm/include/asm/psci.h | 2 +
>>> xen/arch/arm/include/asm/suspend.h | 27 ++++++
>>> xen/arch/arm/include/asm/vpsci.h | 5 +-
>>> xen/arch/arm/vpsci.c | 116 +++++++++++++++++++++-----
>>> xen/common/domain.c | 10 +++
>>> xen/include/xen/suspend.h | 25 ++++++
>>> 9 files changed, 205 insertions(+), 22 deletions(-)
>>> create mode 100644 xen/arch/arm/include/asm/suspend.h
>>> create mode 100644 xen/include/xen/suspend.h
>>>
>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>>> index 47973f99d9..f903e7d4f0 100644
>>> --- a/xen/arch/arm/domain.c
>>> +++ b/xen/arch/arm/domain.c
>>> @@ -12,6 +12,8 @@
>>> #include <xen/softirq.h>
>>> #include <xen/wait.h>
>>>
>>> +#include <public/sched.h>
>>> +
>>> #include <asm/arm64/sve.h>
>>> #include <asm/cpuerrata.h>
>>> #include <asm/cpufeature.h>
>>> @@ -24,10 +26,12 @@
>>> #include <asm/platform.h>
>>> #include <asm/procinfo.h>
>>> #include <asm/regs.h>
>>> +#include <asm/suspend.h>
>>> #include <asm/firmware/sci.h>
>>> #include <asm/tee/tee.h>
>>> #include <asm/vfp.h>
>>> #include <asm/vgic.h>
>>> +#include <asm/vpsci.h>
>>> #include <asm/vtimer.h>
>>>
>>> #include "vpci.h"
>>> @@ -851,6 +855,41 @@ void arch_domain_creation_finished(struct domain *d)
>>> p2m_domain_creation_finished(d);
>>> }
>>>
>>> +int arch_domain_resume(struct domain *d)
>>> +{
>>> + int rc;
>>> + struct resume_info *ctx = &d->arch.resume_ctx;
>>> +
>>> + if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend )
>> How does this check and returning -EINVAL correspond to...
>
> The
>
> if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend )
>
> guard is meant to validate the domain state for the DOMCTL resumedomain
> entry point (i.e. reject an xl resume on a domain that isn’t in the
> "suspend shutdown" state). Suspend requests issued via SCHEDOP_shutdown /
> SCHEDOP_remote_shutdown with reason SUSPEND still put the domain into
> is_shutting_down=1 and shutdown_code=SHUTDOWN_suspend, so they do pass
> this state check.
>
> What the comment below was trying to say is different: those hypercall
> paths don’t go through vPSCI SYSTEM_SUSPEND, so they don’t populate the
> Arm-specific resume context (notably wake_cpu). In that case ctx->wake_cpu
> remains NULL and the Arm arch_domain_resume() returns early.
>
>>
>>> + {
>>> + dprintk(XENLOG_WARNING,
>>> + "%pd: Invalid domain state for resume: is_shutting_down=%u, shutdown_code=%u\n",
>>> + d, d->is_shutting_down, d->shutdown_code);
>>> + return -EINVAL;
>>> + }
>>> +
>>> + /*
>>> + * It is still possible to call domain_shutdown() with a suspend reason
>>> + * via some hypercalls, such as SCHEDOP_shutdown or SCHEDOP_remote_shutdown.
>>> + * In these cases, the resume context will be empty.
>> this comment? This patch assumes that we can now resume successfully (i.e. this
>> function returns 0 and common domain_resume can continue) only if the shutdown
>> was with SCHEDOP_shutdown. Anything else will infinitely keep the vCPUS paused.
>
> Separately (and this is why I’m hesitant to make domain_resume()
> "suspend-only"): domain_resume() is also used by the soft-reset flow.
> Today soft-reset is effectively x86-only (gated by HAS_SOFT_RESET), but
> the core plumbing is in common code and is intentionally generic -- the
> soft-reset calling chain ends up using domain_resume() as a generic
> helper. If domain_resume() itself starts rejecting anything other than
> SHUTDOWN_suspend, it would also be a future trap if/when someone
> enables HAS_SOFT_RESET on Arm.
Ok, thanks for explanation.
~Michal
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
2025-12-12 13:18 ` [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests Mykola Kvach
2026-01-14 6:00 ` Ping: " Mykola Kvach
2026-01-15 9:03 ` Orzel, Michal
@ 2026-03-18 4:18 ` Volodymyr Babchuk
2026-03-18 6:35 ` Mykola Kvach
2026-03-19 9:36 ` Orzel, Michal
3 siblings, 1 reply; 15+ messages in thread
From: Volodymyr Babchuk @ 2026-03-18 4:18 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné
Hi Mykola.
Mykola Kvach <xakep.amatop@gmail.com> writes:
> From: Mykola Kvach <mykola_kvach@epam.com>
>
> Add support for the PSCI SYSTEM_SUSPEND function in the vPSCI interface,
> allowing guests to request suspend via the PSCI v1.0+ SYSTEM_SUSPEND call
> (both 32-bit and 64-bit variants).
>
> Implementation details:
> - Add SYSTEM_SUSPEND function IDs to PSCI definitions
> - Trap and handle SYSTEM_SUSPEND in vPSCI
> - Allow only non-hardware domains to invoke SYSTEM_SUSPEND; return
> PSCI_NOT_SUPPORTED for the hardware domain to avoid halting the system
> in hwdom_shutdown() via domain_shutdown
> - Require all secondary VCPUs of the calling domain to be offline before
> suspend, as mandated by the PSCI specification
>
> The arch_domain_resume() function is an architecture-specific hook that is
> invoked during domain resume to perform any necessary setup or restoration
> steps required by the platform. arch_domain_resume() stays int to propagate
> errno-style detail into common logging; preserving the integer keeps the
> reason visible and leaves room for future arch-specific failures or richer
> handling.
>
> The new vpsci_vcpu_up_prepare() helper is called on the resume path to set up
> the vCPU context (such as entry point, some system regs and context ID) before
> resuming a suspended guest. This keeps ARM/vPSCI-specific logic out of common
> code and avoids intrusive changes to the generic resume flow.
>
> Usage:
>
> For Linux-based guests, suspend can be initiated with:
> echo mem > /sys/power/state
> or via:
> systemctl suspend
>
> Resuming the guest is performed from control domain using:
> xl resume <domain>
>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in V16:
> - Refactor error handling in domain_resume: move logging to generic code,
> use explicit return code checking.
> - Make context clearing conditional on success in arch_domain_resume.
> - The 'int' return type is retained for arch_domain_resume for consistency
> with other arch hooks and to allow for specific negative error codes.
> ---
> xen/arch/arm/domain.c | 39 +++++++++
> xen/arch/arm/include/asm/domain.h | 2 +
> xen/arch/arm/include/asm/perfc_defn.h | 1 +
> xen/arch/arm/include/asm/psci.h | 2 +
> xen/arch/arm/include/asm/suspend.h | 27 ++++++
> xen/arch/arm/include/asm/vpsci.h | 5 +-
> xen/arch/arm/vpsci.c | 116 +++++++++++++++++++++-----
> xen/common/domain.c | 10 +++
> xen/include/xen/suspend.h | 25 ++++++
> 9 files changed, 205 insertions(+), 22 deletions(-)
> create mode 100644 xen/arch/arm/include/asm/suspend.h
> create mode 100644 xen/include/xen/suspend.h
>
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 47973f99d9..f903e7d4f0 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -12,6 +12,8 @@
> #include <xen/softirq.h>
> #include <xen/wait.h>
>
> +#include <public/sched.h>
> +
> #include <asm/arm64/sve.h>
> #include <asm/cpuerrata.h>
> #include <asm/cpufeature.h>
> @@ -24,10 +26,12 @@
> #include <asm/platform.h>
> #include <asm/procinfo.h>
> #include <asm/regs.h>
> +#include <asm/suspend.h>
> #include <asm/firmware/sci.h>
> #include <asm/tee/tee.h>
> #include <asm/vfp.h>
> #include <asm/vgic.h>
> +#include <asm/vpsci.h>
> #include <asm/vtimer.h>
>
> #include "vpci.h"
> @@ -851,6 +855,41 @@ void arch_domain_creation_finished(struct domain *d)
> p2m_domain_creation_finished(d);
> }
>
> +int arch_domain_resume(struct domain *d)
> +{
> + int rc;
> + struct resume_info *ctx = &d->arch.resume_ctx;
> +
> + if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend )
> + {
> + dprintk(XENLOG_WARNING,
> + "%pd: Invalid domain state for resume: is_shutting_down=%u, shutdown_code=%u\n",
> + d, d->is_shutting_down, d->shutdown_code);
> + return -EINVAL;
> + }
> +
> + /*
> + * It is still possible to call domain_shutdown() with a suspend reason
> + * via some hypercalls, such as SCHEDOP_shutdown or SCHEDOP_remote_shutdown.
> + * In these cases, the resume context will be empty.
> + * This is not expected to cause any issues, so we just notify about the
> + * situation and return without error, allowing the existing logic to
> + * proceed as expected.
> + */
> + if ( !ctx->wake_cpu )
> + {
> + dprintk(XENLOG_INFO, "%pd: Wake CPU pointer context was not provided\n",
> + d);
> + return 0;
> + }
> +
> + rc = vpsci_vcpu_up_prepare(ctx->wake_cpu , ctx->ep, ctx->cid);
> + if ( !rc )
> + memset(ctx, 0, sizeof(*ctx));
> +
> + return rc;
> +}
> +
> static int is_guest_pv32_psr(uint32_t psr)
> {
> switch (psr & PSR_MODE_MASK)
> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> index 758ad807e4..66b1246892 100644
> --- a/xen/arch/arm/include/asm/domain.h
> +++ b/xen/arch/arm/include/asm/domain.h
> @@ -5,6 +5,7 @@
> #include <xen/timer.h>
> #include <asm/page.h>
> #include <asm/p2m.h>
> +#include <asm/suspend.h>
> #include <asm/vfp.h>
> #include <asm/mmio.h>
> #include <asm/gic.h>
> @@ -126,6 +127,7 @@ struct arch_domain
> void *sci_data;
> #endif
>
> + struct resume_info resume_ctx;
> } __cacheline_aligned;
>
> struct arch_vcpu
> diff --git a/xen/arch/arm/include/asm/perfc_defn.h b/xen/arch/arm/include/asm/perfc_defn.h
> index effd25b69e..8dfcac7e3b 100644
> --- a/xen/arch/arm/include/asm/perfc_defn.h
> +++ b/xen/arch/arm/include/asm/perfc_defn.h
> @@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset, "vpsci: system_reset")
> PERFCOUNTER(vpsci_cpu_suspend, "vpsci: cpu_suspend")
> PERFCOUNTER(vpsci_cpu_affinity_info, "vpsci: cpu_affinity_info")
> PERFCOUNTER(vpsci_features, "vpsci: features")
> +PERFCOUNTER(vpsci_system_suspend, "vpsci: system_suspend")
>
> PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
>
> diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
> index 4780972621..48a93e6b79 100644
> --- a/xen/arch/arm/include/asm/psci.h
> +++ b/xen/arch/arm/include/asm/psci.h
> @@ -47,10 +47,12 @@ void call_psci_system_reset(void);
> #define PSCI_0_2_FN32_SYSTEM_OFF PSCI_0_2_FN32(8)
> #define PSCI_0_2_FN32_SYSTEM_RESET PSCI_0_2_FN32(9)
> #define PSCI_1_0_FN32_PSCI_FEATURES PSCI_0_2_FN32(10)
> +#define PSCI_1_0_FN32_SYSTEM_SUSPEND PSCI_0_2_FN32(14)
>
> #define PSCI_0_2_FN64_CPU_SUSPEND PSCI_0_2_FN64(1)
> #define PSCI_0_2_FN64_CPU_ON PSCI_0_2_FN64(3)
> #define PSCI_0_2_FN64_AFFINITY_INFO PSCI_0_2_FN64(4)
> +#define PSCI_1_0_FN64_SYSTEM_SUSPEND PSCI_0_2_FN64(14)
>
> /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
> #define PSCI_0_2_AFFINITY_LEVEL_ON 0
> diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
> new file mode 100644
> index 0000000000..313d03ea59
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/suspend.h
> @@ -0,0 +1,27 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef ARM_SUSPEND_H
> +#define ARM_SUSPEND_H
> +
> +struct domain;
> +struct vcpu;
> +
> +struct resume_info {
> + register_t ep;
> + register_t cid;
> + struct vcpu *wake_cpu;
> +};
> +
> +int arch_domain_resume(struct domain *d);
> +
> +#endif /* ARM_SUSPEND_H */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/include/asm/vpsci.h b/xen/arch/arm/include/asm/vpsci.h
> index 0cca5e6830..d790ab3715 100644
> --- a/xen/arch/arm/include/asm/vpsci.h
> +++ b/xen/arch/arm/include/asm/vpsci.h
> @@ -23,12 +23,15 @@
> #include <asm/psci.h>
>
> /* Number of function implemented by virtual PSCI (only 0.2 or later) */
> -#define VPSCI_NR_FUNCS 12
> +#define VPSCI_NR_FUNCS 14
>
> /* Functions handle PSCI calls from the guests */
> bool do_vpsci_0_1_call(struct cpu_user_regs *regs, uint32_t fid);
> bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid);
>
> +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
> + register_t context_id);
> +
> #endif /* __ASM_VPSCI_H__ */
>
> /*
> diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
> index 7ba9ccd94b..c4d616ec68 100644
> --- a/xen/arch/arm/vpsci.c
> +++ b/xen/arch/arm/vpsci.c
> @@ -10,32 +10,16 @@
>
> #include <public/sched.h>
>
> -static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> - register_t context_id)
> +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
> + register_t context_id)
> {
> - struct vcpu *v;
> - struct domain *d = current->domain;
> - struct vcpu_guest_context *ctxt;
> int rc;
> + struct domain *d = v->domain;
> bool is_thumb = entry_point & 1;
> - register_t vcpuid;
> -
> - vcpuid = vaffinity_to_vcpuid(target_cpu);
> -
> - if ( (v = domain_vcpu(d, vcpuid)) == NULL )
> - return PSCI_INVALID_PARAMETERS;
> -
> - /* THUMB set is not allowed with 64-bit domain */
> - if ( is_64bit_domain(d) && is_thumb )
> - return PSCI_INVALID_ADDRESS;
> -
> - if ( !test_bit(_VPF_down, &v->pause_flags) )
> - return PSCI_ALREADY_ON;
> + struct vcpu_guest_context *ctxt;
>
> if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
> - return PSCI_DENIED;
> -
> - vgic_clear_pending_irqs(v);
> + return -ENOMEM;
>
> memset(ctxt, 0, sizeof(*ctxt));
> ctxt->user_regs.pc64 = (u64) entry_point;
> @@ -76,8 +60,37 @@ static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> free_vcpu_guest_context(ctxt);
>
> if ( rc < 0 )
> + return rc;
> +
> + return 0;
> +}
> +
> +static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> + register_t context_id)
> +{
> + struct vcpu *v;
> + struct domain *d = current->domain;
> + int rc;
> + bool is_thumb = entry_point & 1;
> + register_t vcpuid;
> +
> + vcpuid = vaffinity_to_vcpuid(target_cpu);
> +
> + if ( (v = domain_vcpu(d, vcpuid)) == NULL )
> + return PSCI_INVALID_PARAMETERS;
> +
> + /* THUMB set is not allowed with 64-bit domain */
> + if ( is_64bit_domain(d) && is_thumb )
> + return PSCI_INVALID_ADDRESS;
> +
> + if ( !test_bit(_VPF_down, &v->pause_flags) )
> + return PSCI_ALREADY_ON;
> +
> + rc = vpsci_vcpu_up_prepare(v, entry_point, context_id);
> + if ( rc )
> return PSCI_DENIED;
>
> + vgic_clear_pending_irqs(v);
> vcpu_wake(v);
>
> return PSCI_SUCCESS;
> @@ -197,6 +210,48 @@ static void do_psci_0_2_system_reset(void)
> domain_shutdown(d,SHUTDOWN_reboot);
> }
>
> +static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
> +{
> + int32_t rc;
> + struct vcpu *v;
> + struct domain *d = current->domain;
> + bool is_thumb = epoint & 1;
> +
> + /* THUMB set is not allowed with 64-bit domain */
> + if ( is_64bit_domain(d) && is_thumb )
> + return PSCI_INVALID_ADDRESS;
> +
> + /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
> + if ( is_hardware_domain(d) )
> + return PSCI_NOT_SUPPORTED;
> +
> + /* Ensure that all CPUs other than the calling one are offline */
> + domain_lock(d);
> + for_each_vcpu ( d, v )
> + {
> + if ( v != current && is_vcpu_online(v) )
> + {
> + domain_unlock(d);
> + return PSCI_DENIED;
> + }
> + }
> + domain_unlock(d);
> +
> + rc = domain_shutdown(d, SHUTDOWN_suspend);
> + if ( rc )
> + return PSCI_DENIED;
> +
> + d->arch.resume_ctx.ep = epoint;
> + d->arch.resume_ctx.cid = cid;
> + d->arch.resume_ctx.wake_cpu = current;
> +
> + gprintk(XENLOG_DEBUG,
> + "SYSTEM_SUSPEND requested, epoint=%#"PRIregister", cid=%#"PRIregister"\n",
> + epoint, cid);
> +
> + return rc;
> +}
> +
> static int32_t do_psci_1_0_features(uint32_t psci_func_id)
> {
> /* /!\ Ordered by function ID and not name */
> @@ -214,6 +269,8 @@ static int32_t do_psci_1_0_features(uint32_t psci_func_id)
> case PSCI_0_2_FN32_SYSTEM_OFF:
> case PSCI_0_2_FN32_SYSTEM_RESET:
> case PSCI_1_0_FN32_PSCI_FEATURES:
> + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> case ARM_SMCCC_VERSION_FID:
> return 0;
> default:
> @@ -344,6 +401,23 @@ bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid)
> return true;
> }
>
> + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> + {
> + register_t epoint = PSCI_ARG(regs, 1);
> + register_t cid = PSCI_ARG(regs, 2);
> +
> + if ( fid == PSCI_1_0_FN32_SYSTEM_SUSPEND )
> + {
> + epoint &= GENMASK(31, 0);
> + cid &= GENMASK(31, 0);
> + }
> +
> + perfc_incr(vpsci_system_suspend);
> + PSCI_SET_RESULT(regs, do_psci_1_0_system_suspend(epoint, cid));
> + return true;
> + }
> +
> default:
> return false;
> }
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 93c71bc766..09ad0a26ee 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -26,6 +26,7 @@
> #include <xen/hypercall.h>
> #include <xen/delay.h>
> #include <xen/shutdown.h>
> +#include <xen/suspend.h>
> #include <xen/percpu.h>
> #include <xen/multicall.h>
> #include <xen/rcupdate.h>
> @@ -1374,6 +1375,7 @@ int domain_shutdown(struct domain *d, u8 reason)
> void domain_resume(struct domain *d)
> {
> struct vcpu *v;
> + int rc;
>
> /*
> * Some code paths assume that shutdown status does not get reset under
> @@ -1383,6 +1385,13 @@ void domain_resume(struct domain *d)
>
> spin_lock(&d->shutdown_lock);
>
> + rc = arch_domain_resume(d);
> + if ( rc )
> + {
> + printk("%pd: Failed to resume domain (ret %d)\n", d, rc);
I am wondering about this error path... Domain clearly can't be resumed
anymore. Should we crash it in this case? But domain is already shut
down, so domain_crash() would do nothing.
Probably it is better to ensure that arch_domain_resume() will not
return an error by doing all required checks beforehand. Actually you
already doing this. So how we can get an error realistically?
> + goto fail;
> + }
> +
> d->is_shutting_down = d->is_shut_down = 0;
> d->shutdown_code = SHUTDOWN_CODE_INVALID;
>
> @@ -1393,6 +1402,7 @@ void domain_resume(struct domain *d)
> v->paused_for_shutdown = 0;
> }
>
> + fail:
> spin_unlock(&d->shutdown_lock);
>
> domain_unpause(d);
> diff --git a/xen/include/xen/suspend.h b/xen/include/xen/suspend.h
> new file mode 100644
> index 0000000000..528879c2a9
> --- /dev/null
> +++ b/xen/include/xen/suspend.h
> @@ -0,0 +1,25 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef XEN_SUSPEND_H
> +#define XEN_SUSPEND_H
> +
> +#if __has_include(<asm/suspend.h>)
> +#include <asm/suspend.h>
> +#else
> +static inline int arch_domain_resume(struct domain *d)
> +{
> + return 0;
> +}
> +#endif
> +
> +#endif /* XEN_SUSPEND_H */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
--
WBR, Volodymyr
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
2026-03-18 4:18 ` Volodymyr Babchuk
@ 2026-03-18 6:35 ` Mykola Kvach
2026-03-19 1:14 ` Volodymyr Babchuk
0 siblings, 1 reply; 15+ messages in thread
From: Mykola Kvach @ 2026-03-18 6:35 UTC (permalink / raw)
To: Volodymyr Babchuk
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné
Hi Volodymyr,
Thank you for the review.
On Wed, Mar 18, 2026 at 6:18 AM Volodymyr Babchuk
<Volodymyr_Babchuk@epam.com> wrote:
>
> Hi Mykola.
>
> Mykola Kvach <xakep.amatop@gmail.com> writes:
>
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > Add support for the PSCI SYSTEM_SUSPEND function in the vPSCI interface,
> > allowing guests to request suspend via the PSCI v1.0+ SYSTEM_SUSPEND call
> > (both 32-bit and 64-bit variants).
> >
> > Implementation details:
> > - Add SYSTEM_SUSPEND function IDs to PSCI definitions
> > - Trap and handle SYSTEM_SUSPEND in vPSCI
> > - Allow only non-hardware domains to invoke SYSTEM_SUSPEND; return
> > PSCI_NOT_SUPPORTED for the hardware domain to avoid halting the system
> > in hwdom_shutdown() via domain_shutdown
> > - Require all secondary VCPUs of the calling domain to be offline before
> > suspend, as mandated by the PSCI specification
> >
> > The arch_domain_resume() function is an architecture-specific hook that is
> > invoked during domain resume to perform any necessary setup or restoration
> > steps required by the platform. arch_domain_resume() stays int to propagate
> > errno-style detail into common logging; preserving the integer keeps the
> > reason visible and leaves room for future arch-specific failures or richer
> > handling.
> >
> > The new vpsci_vcpu_up_prepare() helper is called on the resume path to set up
> > the vCPU context (such as entry point, some system regs and context ID) before
> > resuming a suspended guest. This keeps ARM/vPSCI-specific logic out of common
> > code and avoids intrusive changes to the generic resume flow.
> >
> > Usage:
> >
> > For Linux-based guests, suspend can be initiated with:
> > echo mem > /sys/power/state
> > or via:
> > systemctl suspend
> >
> > Resuming the guest is performed from control domain using:
> > xl resume <domain>
> >
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Changes in V16:
> > - Refactor error handling in domain_resume: move logging to generic code,
> > use explicit return code checking.
> > - Make context clearing conditional on success in arch_domain_resume.
> > - The 'int' return type is retained for arch_domain_resume for consistency
> > with other arch hooks and to allow for specific negative error codes.
> > ---
> > xen/arch/arm/domain.c | 39 +++++++++
> > xen/arch/arm/include/asm/domain.h | 2 +
> > xen/arch/arm/include/asm/perfc_defn.h | 1 +
> > xen/arch/arm/include/asm/psci.h | 2 +
> > xen/arch/arm/include/asm/suspend.h | 27 ++++++
> > xen/arch/arm/include/asm/vpsci.h | 5 +-
> > xen/arch/arm/vpsci.c | 116 +++++++++++++++++++++-----
> > xen/common/domain.c | 10 +++
> > xen/include/xen/suspend.h | 25 ++++++
> > 9 files changed, 205 insertions(+), 22 deletions(-)
> > create mode 100644 xen/arch/arm/include/asm/suspend.h
> > create mode 100644 xen/include/xen/suspend.h
> >
> > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> > index 47973f99d9..f903e7d4f0 100644
> > --- a/xen/arch/arm/domain.c
> > +++ b/xen/arch/arm/domain.c
> > @@ -12,6 +12,8 @@
> > #include <xen/softirq.h>
> > #include <xen/wait.h>
> >
> > +#include <public/sched.h>
> > +
> > #include <asm/arm64/sve.h>
> > #include <asm/cpuerrata.h>
> > #include <asm/cpufeature.h>
> > @@ -24,10 +26,12 @@
> > #include <asm/platform.h>
> > #include <asm/procinfo.h>
> > #include <asm/regs.h>
> > +#include <asm/suspend.h>
> > #include <asm/firmware/sci.h>
> > #include <asm/tee/tee.h>
> > #include <asm/vfp.h>
> > #include <asm/vgic.h>
> > +#include <asm/vpsci.h>
> > #include <asm/vtimer.h>
> >
> > #include "vpci.h"
> > @@ -851,6 +855,41 @@ void arch_domain_creation_finished(struct domain *d)
> > p2m_domain_creation_finished(d);
> > }
> >
> > +int arch_domain_resume(struct domain *d)
> > +{
> > + int rc;
> > + struct resume_info *ctx = &d->arch.resume_ctx;
> > +
> > + if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend )
> > + {
> > + dprintk(XENLOG_WARNING,
> > + "%pd: Invalid domain state for resume: is_shutting_down=%u, shutdown_code=%u\n",
> > + d, d->is_shutting_down, d->shutdown_code);
> > + return -EINVAL;
> > + }
> > +
> > + /*
> > + * It is still possible to call domain_shutdown() with a suspend reason
> > + * via some hypercalls, such as SCHEDOP_shutdown or SCHEDOP_remote_shutdown.
> > + * In these cases, the resume context will be empty.
> > + * This is not expected to cause any issues, so we just notify about the
> > + * situation and return without error, allowing the existing logic to
> > + * proceed as expected.
> > + */
> > + if ( !ctx->wake_cpu )
> > + {
> > + dprintk(XENLOG_INFO, "%pd: Wake CPU pointer context was not provided\n",
> > + d);
> > + return 0;
> > + }
> > +
> > + rc = vpsci_vcpu_up_prepare(ctx->wake_cpu , ctx->ep, ctx->cid);
> > + if ( !rc )
> > + memset(ctx, 0, sizeof(*ctx));
> > +
> > + return rc;
> > +}
> > +
> > static int is_guest_pv32_psr(uint32_t psr)
> > {
> > switch (psr & PSR_MODE_MASK)
> > diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> > index 758ad807e4..66b1246892 100644
> > --- a/xen/arch/arm/include/asm/domain.h
> > +++ b/xen/arch/arm/include/asm/domain.h
> > @@ -5,6 +5,7 @@
> > #include <xen/timer.h>
> > #include <asm/page.h>
> > #include <asm/p2m.h>
> > +#include <asm/suspend.h>
> > #include <asm/vfp.h>
> > #include <asm/mmio.h>
> > #include <asm/gic.h>
> > @@ -126,6 +127,7 @@ struct arch_domain
> > void *sci_data;
> > #endif
> >
> > + struct resume_info resume_ctx;
> > } __cacheline_aligned;
> >
> > struct arch_vcpu
> > diff --git a/xen/arch/arm/include/asm/perfc_defn.h b/xen/arch/arm/include/asm/perfc_defn.h
> > index effd25b69e..8dfcac7e3b 100644
> > --- a/xen/arch/arm/include/asm/perfc_defn.h
> > +++ b/xen/arch/arm/include/asm/perfc_defn.h
> > @@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset, "vpsci: system_reset")
> > PERFCOUNTER(vpsci_cpu_suspend, "vpsci: cpu_suspend")
> > PERFCOUNTER(vpsci_cpu_affinity_info, "vpsci: cpu_affinity_info")
> > PERFCOUNTER(vpsci_features, "vpsci: features")
> > +PERFCOUNTER(vpsci_system_suspend, "vpsci: system_suspend")
> >
> > PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
> >
> > diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
> > index 4780972621..48a93e6b79 100644
> > --- a/xen/arch/arm/include/asm/psci.h
> > +++ b/xen/arch/arm/include/asm/psci.h
> > @@ -47,10 +47,12 @@ void call_psci_system_reset(void);
> > #define PSCI_0_2_FN32_SYSTEM_OFF PSCI_0_2_FN32(8)
> > #define PSCI_0_2_FN32_SYSTEM_RESET PSCI_0_2_FN32(9)
> > #define PSCI_1_0_FN32_PSCI_FEATURES PSCI_0_2_FN32(10)
> > +#define PSCI_1_0_FN32_SYSTEM_SUSPEND PSCI_0_2_FN32(14)
> >
> > #define PSCI_0_2_FN64_CPU_SUSPEND PSCI_0_2_FN64(1)
> > #define PSCI_0_2_FN64_CPU_ON PSCI_0_2_FN64(3)
> > #define PSCI_0_2_FN64_AFFINITY_INFO PSCI_0_2_FN64(4)
> > +#define PSCI_1_0_FN64_SYSTEM_SUSPEND PSCI_0_2_FN64(14)
> >
> > /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
> > #define PSCI_0_2_AFFINITY_LEVEL_ON 0
> > diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
> > new file mode 100644
> > index 0000000000..313d03ea59
> > --- /dev/null
> > +++ b/xen/arch/arm/include/asm/suspend.h
> > @@ -0,0 +1,27 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#ifndef ARM_SUSPEND_H
> > +#define ARM_SUSPEND_H
> > +
> > +struct domain;
> > +struct vcpu;
> > +
> > +struct resume_info {
> > + register_t ep;
> > + register_t cid;
> > + struct vcpu *wake_cpu;
> > +};
> > +
> > +int arch_domain_resume(struct domain *d);
> > +
> > +#endif /* ARM_SUSPEND_H */
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * tab-width: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > diff --git a/xen/arch/arm/include/asm/vpsci.h b/xen/arch/arm/include/asm/vpsci.h
> > index 0cca5e6830..d790ab3715 100644
> > --- a/xen/arch/arm/include/asm/vpsci.h
> > +++ b/xen/arch/arm/include/asm/vpsci.h
> > @@ -23,12 +23,15 @@
> > #include <asm/psci.h>
> >
> > /* Number of function implemented by virtual PSCI (only 0.2 or later) */
> > -#define VPSCI_NR_FUNCS 12
> > +#define VPSCI_NR_FUNCS 14
> >
> > /* Functions handle PSCI calls from the guests */
> > bool do_vpsci_0_1_call(struct cpu_user_regs *regs, uint32_t fid);
> > bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid);
> >
> > +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
> > + register_t context_id);
> > +
> > #endif /* __ASM_VPSCI_H__ */
> >
> > /*
> > diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
> > index 7ba9ccd94b..c4d616ec68 100644
> > --- a/xen/arch/arm/vpsci.c
> > +++ b/xen/arch/arm/vpsci.c
> > @@ -10,32 +10,16 @@
> >
> > #include <public/sched.h>
> >
> > -static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> > - register_t context_id)
> > +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
> > + register_t context_id)
> > {
> > - struct vcpu *v;
> > - struct domain *d = current->domain;
> > - struct vcpu_guest_context *ctxt;
> > int rc;
> > + struct domain *d = v->domain;
> > bool is_thumb = entry_point & 1;
> > - register_t vcpuid;
> > -
> > - vcpuid = vaffinity_to_vcpuid(target_cpu);
> > -
> > - if ( (v = domain_vcpu(d, vcpuid)) == NULL )
> > - return PSCI_INVALID_PARAMETERS;
> > -
> > - /* THUMB set is not allowed with 64-bit domain */
> > - if ( is_64bit_domain(d) && is_thumb )
> > - return PSCI_INVALID_ADDRESS;
> > -
> > - if ( !test_bit(_VPF_down, &v->pause_flags) )
> > - return PSCI_ALREADY_ON;
> > + struct vcpu_guest_context *ctxt;
> >
> > if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
> > - return PSCI_DENIED;
> > -
> > - vgic_clear_pending_irqs(v);
> > + return -ENOMEM;
> >
> > memset(ctxt, 0, sizeof(*ctxt));
> > ctxt->user_regs.pc64 = (u64) entry_point;
> > @@ -76,8 +60,37 @@ static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> > free_vcpu_guest_context(ctxt);
> >
> > if ( rc < 0 )
> > + return rc;
> > +
> > + return 0;
> > +}
> > +
> > +static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> > + register_t context_id)
> > +{
> > + struct vcpu *v;
> > + struct domain *d = current->domain;
> > + int rc;
> > + bool is_thumb = entry_point & 1;
> > + register_t vcpuid;
> > +
> > + vcpuid = vaffinity_to_vcpuid(target_cpu);
> > +
> > + if ( (v = domain_vcpu(d, vcpuid)) == NULL )
> > + return PSCI_INVALID_PARAMETERS;
> > +
> > + /* THUMB set is not allowed with 64-bit domain */
> > + if ( is_64bit_domain(d) && is_thumb )
> > + return PSCI_INVALID_ADDRESS;
> > +
> > + if ( !test_bit(_VPF_down, &v->pause_flags) )
> > + return PSCI_ALREADY_ON;
> > +
> > + rc = vpsci_vcpu_up_prepare(v, entry_point, context_id);
> > + if ( rc )
> > return PSCI_DENIED;
> >
> > + vgic_clear_pending_irqs(v);
> > vcpu_wake(v);
> >
> > return PSCI_SUCCESS;
> > @@ -197,6 +210,48 @@ static void do_psci_0_2_system_reset(void)
> > domain_shutdown(d,SHUTDOWN_reboot);
> > }
> >
> > +static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
> > +{
> > + int32_t rc;
> > + struct vcpu *v;
> > + struct domain *d = current->domain;
> > + bool is_thumb = epoint & 1;
> > +
> > + /* THUMB set is not allowed with 64-bit domain */
> > + if ( is_64bit_domain(d) && is_thumb )
> > + return PSCI_INVALID_ADDRESS;
> > +
> > + /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
> > + if ( is_hardware_domain(d) )
> > + return PSCI_NOT_SUPPORTED;
> > +
> > + /* Ensure that all CPUs other than the calling one are offline */
> > + domain_lock(d);
> > + for_each_vcpu ( d, v )
> > + {
> > + if ( v != current && is_vcpu_online(v) )
> > + {
> > + domain_unlock(d);
> > + return PSCI_DENIED;
> > + }
> > + }
> > + domain_unlock(d);
> > +
> > + rc = domain_shutdown(d, SHUTDOWN_suspend);
> > + if ( rc )
> > + return PSCI_DENIED;
> > +
> > + d->arch.resume_ctx.ep = epoint;
> > + d->arch.resume_ctx.cid = cid;
> > + d->arch.resume_ctx.wake_cpu = current;
> > +
> > + gprintk(XENLOG_DEBUG,
> > + "SYSTEM_SUSPEND requested, epoint=%#"PRIregister", cid=%#"PRIregister"\n",
> > + epoint, cid);
> > +
> > + return rc;
> > +}
> > +
> > static int32_t do_psci_1_0_features(uint32_t psci_func_id)
> > {
> > /* /!\ Ordered by function ID and not name */
> > @@ -214,6 +269,8 @@ static int32_t do_psci_1_0_features(uint32_t psci_func_id)
> > case PSCI_0_2_FN32_SYSTEM_OFF:
> > case PSCI_0_2_FN32_SYSTEM_RESET:
> > case PSCI_1_0_FN32_PSCI_FEATURES:
> > + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> > + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> > case ARM_SMCCC_VERSION_FID:
> > return 0;
> > default:
> > @@ -344,6 +401,23 @@ bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid)
> > return true;
> > }
> >
> > + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> > + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> > + {
> > + register_t epoint = PSCI_ARG(regs, 1);
> > + register_t cid = PSCI_ARG(regs, 2);
> > +
> > + if ( fid == PSCI_1_0_FN32_SYSTEM_SUSPEND )
> > + {
> > + epoint &= GENMASK(31, 0);
> > + cid &= GENMASK(31, 0);
> > + }
> > +
> > + perfc_incr(vpsci_system_suspend);
> > + PSCI_SET_RESULT(regs, do_psci_1_0_system_suspend(epoint, cid));
> > + return true;
> > + }
> > +
> > default:
> > return false;
> > }
> > diff --git a/xen/common/domain.c b/xen/common/domain.c
> > index 93c71bc766..09ad0a26ee 100644
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -26,6 +26,7 @@
> > #include <xen/hypercall.h>
> > #include <xen/delay.h>
> > #include <xen/shutdown.h>
> > +#include <xen/suspend.h>
> > #include <xen/percpu.h>
> > #include <xen/multicall.h>
> > #include <xen/rcupdate.h>
> > @@ -1374,6 +1375,7 @@ int domain_shutdown(struct domain *d, u8 reason)
> > void domain_resume(struct domain *d)
> > {
> > struct vcpu *v;
> > + int rc;
> >
> > /*
> > * Some code paths assume that shutdown status does not get reset under
> > @@ -1383,6 +1385,13 @@ void domain_resume(struct domain *d)
> >
> > spin_lock(&d->shutdown_lock);
> >
> > + rc = arch_domain_resume(d);
> > + if ( rc )
> > + {
> > + printk("%pd: Failed to resume domain (ret %d)\n", d, rc);
>
> I am wondering about this error path... Domain clearly can't be resumed
> anymore. Should we crash it in this case? But domain is already shut
> down, so domain_crash() would do nothing.
>
> Probably it is better to ensure that arch_domain_resume() will not
> return an error by doing all required checks beforehand. Actually you
> already doing this. So how we can get an error realistically?
vpsci_vcpu_up_prepare() can still fail on the resume path, currently due
to alloc_vcpu_guest_context() or arch_set_info_guest().
So I agree with your point: it would be better to move all failure-prone
preparation and validation to the suspend path, so that
arch_domain_resume() does not need to return an error in normal
operation.
The approach I see is:
- allocate and populate the guest context during suspend;
- split arch_set_info_guest() into two parts, something like
arch_validate_vcpu_guest_context() and
arch_apply_vcpu_guest_context();
- call only the validation step during suspend;
- on resume, only apply the already validated guest context and then
free it.
This should make the resume path effectively non-failing and avoid the
questionable error handling there.
Does this sound like the right direction?
Best regards,
Mykola
>
> > + goto fail;
> > + }
> > +
> > d->is_shutting_down = d->is_shut_down = 0;
> > d->shutdown_code = SHUTDOWN_CODE_INVALID;
> >
> > @@ -1393,6 +1402,7 @@ void domain_resume(struct domain *d)
> > v->paused_for_shutdown = 0;
> > }
> >
> > + fail:
> > spin_unlock(&d->shutdown_lock);
> >
> > domain_unpause(d);
> > diff --git a/xen/include/xen/suspend.h b/xen/include/xen/suspend.h
> > new file mode 100644
> > index 0000000000..528879c2a9
> > --- /dev/null
> > +++ b/xen/include/xen/suspend.h
> > @@ -0,0 +1,25 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#ifndef XEN_SUSPEND_H
> > +#define XEN_SUSPEND_H
> > +
> > +#if __has_include(<asm/suspend.h>)
> > +#include <asm/suspend.h>
> > +#else
> > +static inline int arch_domain_resume(struct domain *d)
> > +{
> > + return 0;
> > +}
> > +#endif
> > +
> > +#endif /* XEN_SUSPEND_H */
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * tab-width: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
>
> --
> WBR, Volodymyr
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
2026-03-18 6:35 ` Mykola Kvach
@ 2026-03-19 1:14 ` Volodymyr Babchuk
0 siblings, 0 replies; 15+ messages in thread
From: Volodymyr Babchuk @ 2026-03-19 1:14 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné
Hi Mykola,
Mykola Kvach <xakep.amatop@gmail.com> writes:
> Hi Volodymyr,
>
> Thank you for the review.
>
> On Wed, Mar 18, 2026 at 6:18 AM Volodymyr Babchuk
> <Volodymyr_Babchuk@epam.com> wrote:
>>
>> Hi Mykola.
>>
>> Mykola Kvach <xakep.amatop@gmail.com> writes:
>>
>> > From: Mykola Kvach <mykola_kvach@epam.com>
>> >
>> > Add support for the PSCI SYSTEM_SUSPEND function in the vPSCI interface,
>> > allowing guests to request suspend via the PSCI v1.0+ SYSTEM_SUSPEND call
>> > (both 32-bit and 64-bit variants).
>> >
>> > Implementation details:
>> > - Add SYSTEM_SUSPEND function IDs to PSCI definitions
>> > - Trap and handle SYSTEM_SUSPEND in vPSCI
>> > - Allow only non-hardware domains to invoke SYSTEM_SUSPEND; return
>> > PSCI_NOT_SUPPORTED for the hardware domain to avoid halting the system
>> > in hwdom_shutdown() via domain_shutdown
>> > - Require all secondary VCPUs of the calling domain to be offline before
>> > suspend, as mandated by the PSCI specification
>> >
>> > The arch_domain_resume() function is an architecture-specific hook that is
>> > invoked during domain resume to perform any necessary setup or restoration
>> > steps required by the platform. arch_domain_resume() stays int to propagate
>> > errno-style detail into common logging; preserving the integer keeps the
>> > reason visible and leaves room for future arch-specific failures or richer
>> > handling.
>> >
>> > The new vpsci_vcpu_up_prepare() helper is called on the resume path to set up
>> > the vCPU context (such as entry point, some system regs and context ID) before
>> > resuming a suspended guest. This keeps ARM/vPSCI-specific logic out of common
>> > code and avoids intrusive changes to the generic resume flow.
>> >
>> > Usage:
>> >
>> > For Linux-based guests, suspend can be initiated with:
>> > echo mem > /sys/power/state
>> > or via:
>> > systemctl suspend
>> >
>> > Resuming the guest is performed from control domain using:
>> > xl resume <domain>
>> >
>> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
>> > ---
>> > Changes in V16:
>> > - Refactor error handling in domain_resume: move logging to generic code,
>> > use explicit return code checking.
>> > - Make context clearing conditional on success in arch_domain_resume.
>> > - The 'int' return type is retained for arch_domain_resume for consistency
>> > with other arch hooks and to allow for specific negative error codes.
>> > ---
>> > xen/arch/arm/domain.c | 39 +++++++++
>> > xen/arch/arm/include/asm/domain.h | 2 +
>> > xen/arch/arm/include/asm/perfc_defn.h | 1 +
>> > xen/arch/arm/include/asm/psci.h | 2 +
>> > xen/arch/arm/include/asm/suspend.h | 27 ++++++
>> > xen/arch/arm/include/asm/vpsci.h | 5 +-
>> > xen/arch/arm/vpsci.c | 116 +++++++++++++++++++++-----
>> > xen/common/domain.c | 10 +++
>> > xen/include/xen/suspend.h | 25 ++++++
>> > 9 files changed, 205 insertions(+), 22 deletions(-)
>> > create mode 100644 xen/arch/arm/include/asm/suspend.h
>> > create mode 100644 xen/include/xen/suspend.h
>> >
>> > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> > index 47973f99d9..f903e7d4f0 100644
>> > --- a/xen/arch/arm/domain.c
>> > +++ b/xen/arch/arm/domain.c
>> > @@ -12,6 +12,8 @@
>> > #include <xen/softirq.h>
>> > #include <xen/wait.h>
>> >
>> > +#include <public/sched.h>
>> > +
>> > #include <asm/arm64/sve.h>
>> > #include <asm/cpuerrata.h>
>> > #include <asm/cpufeature.h>
>> > @@ -24,10 +26,12 @@
>> > #include <asm/platform.h>
>> > #include <asm/procinfo.h>
>> > #include <asm/regs.h>
>> > +#include <asm/suspend.h>
>> > #include <asm/firmware/sci.h>
>> > #include <asm/tee/tee.h>
>> > #include <asm/vfp.h>
>> > #include <asm/vgic.h>
>> > +#include <asm/vpsci.h>
>> > #include <asm/vtimer.h>
>> >
>> > #include "vpci.h"
>> > @@ -851,6 +855,41 @@ void arch_domain_creation_finished(struct domain *d)
>> > p2m_domain_creation_finished(d);
>> > }
>> >
>> > +int arch_domain_resume(struct domain *d)
>> > +{
>> > + int rc;
>> > + struct resume_info *ctx = &d->arch.resume_ctx;
>> > +
>> > + if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend )
>> > + {
>> > + dprintk(XENLOG_WARNING,
>> > + "%pd: Invalid domain state for resume: is_shutting_down=%u, shutdown_code=%u\n",
>> > + d, d->is_shutting_down, d->shutdown_code);
>> > + return -EINVAL;
>> > + }
>> > +
>> > + /*
>> > + * It is still possible to call domain_shutdown() with a suspend reason
>> > + * via some hypercalls, such as SCHEDOP_shutdown or SCHEDOP_remote_shutdown.
>> > + * In these cases, the resume context will be empty.
>> > + * This is not expected to cause any issues, so we just notify about the
>> > + * situation and return without error, allowing the existing logic to
>> > + * proceed as expected.
>> > + */
>> > + if ( !ctx->wake_cpu )
>> > + {
>> > + dprintk(XENLOG_INFO, "%pd: Wake CPU pointer context was not provided\n",
>> > + d);
>> > + return 0;
>> > + }
>> > +
>> > + rc = vpsci_vcpu_up_prepare(ctx->wake_cpu , ctx->ep, ctx->cid);
>> > + if ( !rc )
>> > + memset(ctx, 0, sizeof(*ctx));
>> > +
>> > + return rc;
>> > +}
>> > +
>> > static int is_guest_pv32_psr(uint32_t psr)
>> > {
>> > switch (psr & PSR_MODE_MASK)
>> > diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
>> > index 758ad807e4..66b1246892 100644
>> > --- a/xen/arch/arm/include/asm/domain.h
>> > +++ b/xen/arch/arm/include/asm/domain.h
>> > @@ -5,6 +5,7 @@
>> > #include <xen/timer.h>
>> > #include <asm/page.h>
>> > #include <asm/p2m.h>
>> > +#include <asm/suspend.h>
>> > #include <asm/vfp.h>
>> > #include <asm/mmio.h>
>> > #include <asm/gic.h>
>> > @@ -126,6 +127,7 @@ struct arch_domain
>> > void *sci_data;
>> > #endif
>> >
>> > + struct resume_info resume_ctx;
>> > } __cacheline_aligned;
>> >
>> > struct arch_vcpu
>> > diff --git a/xen/arch/arm/include/asm/perfc_defn.h b/xen/arch/arm/include/asm/perfc_defn.h
>> > index effd25b69e..8dfcac7e3b 100644
>> > --- a/xen/arch/arm/include/asm/perfc_defn.h
>> > +++ b/xen/arch/arm/include/asm/perfc_defn.h
>> > @@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset, "vpsci: system_reset")
>> > PERFCOUNTER(vpsci_cpu_suspend, "vpsci: cpu_suspend")
>> > PERFCOUNTER(vpsci_cpu_affinity_info, "vpsci: cpu_affinity_info")
>> > PERFCOUNTER(vpsci_features, "vpsci: features")
>> > +PERFCOUNTER(vpsci_system_suspend, "vpsci: system_suspend")
>> >
>> > PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
>> >
>> > diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
>> > index 4780972621..48a93e6b79 100644
>> > --- a/xen/arch/arm/include/asm/psci.h
>> > +++ b/xen/arch/arm/include/asm/psci.h
>> > @@ -47,10 +47,12 @@ void call_psci_system_reset(void);
>> > #define PSCI_0_2_FN32_SYSTEM_OFF PSCI_0_2_FN32(8)
>> > #define PSCI_0_2_FN32_SYSTEM_RESET PSCI_0_2_FN32(9)
>> > #define PSCI_1_0_FN32_PSCI_FEATURES PSCI_0_2_FN32(10)
>> > +#define PSCI_1_0_FN32_SYSTEM_SUSPEND PSCI_0_2_FN32(14)
>> >
>> > #define PSCI_0_2_FN64_CPU_SUSPEND PSCI_0_2_FN64(1)
>> > #define PSCI_0_2_FN64_CPU_ON PSCI_0_2_FN64(3)
>> > #define PSCI_0_2_FN64_AFFINITY_INFO PSCI_0_2_FN64(4)
>> > +#define PSCI_1_0_FN64_SYSTEM_SUSPEND PSCI_0_2_FN64(14)
>> >
>> > /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
>> > #define PSCI_0_2_AFFINITY_LEVEL_ON 0
>> > diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
>> > new file mode 100644
>> > index 0000000000..313d03ea59
>> > --- /dev/null
>> > +++ b/xen/arch/arm/include/asm/suspend.h
>> > @@ -0,0 +1,27 @@
>> > +/* SPDX-License-Identifier: GPL-2.0-only */
>> > +
>> > +#ifndef ARM_SUSPEND_H
>> > +#define ARM_SUSPEND_H
>> > +
>> > +struct domain;
>> > +struct vcpu;
>> > +
>> > +struct resume_info {
>> > + register_t ep;
>> > + register_t cid;
>> > + struct vcpu *wake_cpu;
>> > +};
>> > +
>> > +int arch_domain_resume(struct domain *d);
>> > +
>> > +#endif /* ARM_SUSPEND_H */
>> > +
>> > +/*
>> > + * Local variables:
>> > + * mode: C
>> > + * c-file-style: "BSD"
>> > + * c-basic-offset: 4
>> > + * tab-width: 4
>> > + * indent-tabs-mode: nil
>> > + * End:
>> > + */
>> > diff --git a/xen/arch/arm/include/asm/vpsci.h b/xen/arch/arm/include/asm/vpsci.h
>> > index 0cca5e6830..d790ab3715 100644
>> > --- a/xen/arch/arm/include/asm/vpsci.h
>> > +++ b/xen/arch/arm/include/asm/vpsci.h
>> > @@ -23,12 +23,15 @@
>> > #include <asm/psci.h>
>> >
>> > /* Number of function implemented by virtual PSCI (only 0.2 or later) */
>> > -#define VPSCI_NR_FUNCS 12
>> > +#define VPSCI_NR_FUNCS 14
>> >
>> > /* Functions handle PSCI calls from the guests */
>> > bool do_vpsci_0_1_call(struct cpu_user_regs *regs, uint32_t fid);
>> > bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid);
>> >
>> > +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
>> > + register_t context_id);
>> > +
>> > #endif /* __ASM_VPSCI_H__ */
>> >
>> > /*
>> > diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
>> > index 7ba9ccd94b..c4d616ec68 100644
>> > --- a/xen/arch/arm/vpsci.c
>> > +++ b/xen/arch/arm/vpsci.c
>> > @@ -10,32 +10,16 @@
>> >
>> > #include <public/sched.h>
>> >
>> > -static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
>> > - register_t context_id)
>> > +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
>> > + register_t context_id)
>> > {
>> > - struct vcpu *v;
>> > - struct domain *d = current->domain;
>> > - struct vcpu_guest_context *ctxt;
>> > int rc;
>> > + struct domain *d = v->domain;
>> > bool is_thumb = entry_point & 1;
>> > - register_t vcpuid;
>> > -
>> > - vcpuid = vaffinity_to_vcpuid(target_cpu);
>> > -
>> > - if ( (v = domain_vcpu(d, vcpuid)) == NULL )
>> > - return PSCI_INVALID_PARAMETERS;
>> > -
>> > - /* THUMB set is not allowed with 64-bit domain */
>> > - if ( is_64bit_domain(d) && is_thumb )
>> > - return PSCI_INVALID_ADDRESS;
>> > -
>> > - if ( !test_bit(_VPF_down, &v->pause_flags) )
>> > - return PSCI_ALREADY_ON;
>> > + struct vcpu_guest_context *ctxt;
>> >
>> > if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
>> > - return PSCI_DENIED;
>> > -
>> > - vgic_clear_pending_irqs(v);
>> > + return -ENOMEM;
>> >
>> > memset(ctxt, 0, sizeof(*ctxt));
>> > ctxt->user_regs.pc64 = (u64) entry_point;
>> > @@ -76,8 +60,37 @@ static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
>> > free_vcpu_guest_context(ctxt);
>> >
>> > if ( rc < 0 )
>> > + return rc;
>> > +
>> > + return 0;
>> > +}
>> > +
>> > +static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
>> > + register_t context_id)
>> > +{
>> > + struct vcpu *v;
>> > + struct domain *d = current->domain;
>> > + int rc;
>> > + bool is_thumb = entry_point & 1;
>> > + register_t vcpuid;
>> > +
>> > + vcpuid = vaffinity_to_vcpuid(target_cpu);
>> > +
>> > + if ( (v = domain_vcpu(d, vcpuid)) == NULL )
>> > + return PSCI_INVALID_PARAMETERS;
>> > +
>> > + /* THUMB set is not allowed with 64-bit domain */
>> > + if ( is_64bit_domain(d) && is_thumb )
>> > + return PSCI_INVALID_ADDRESS;
>> > +
>> > + if ( !test_bit(_VPF_down, &v->pause_flags) )
>> > + return PSCI_ALREADY_ON;
>> > +
>> > + rc = vpsci_vcpu_up_prepare(v, entry_point, context_id);
>> > + if ( rc )
>> > return PSCI_DENIED;
>> >
>> > + vgic_clear_pending_irqs(v);
>> > vcpu_wake(v);
>> >
>> > return PSCI_SUCCESS;
>> > @@ -197,6 +210,48 @@ static void do_psci_0_2_system_reset(void)
>> > domain_shutdown(d,SHUTDOWN_reboot);
>> > }
>> >
>> > +static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
>> > +{
>> > + int32_t rc;
>> > + struct vcpu *v;
>> > + struct domain *d = current->domain;
>> > + bool is_thumb = epoint & 1;
>> > +
>> > + /* THUMB set is not allowed with 64-bit domain */
>> > + if ( is_64bit_domain(d) && is_thumb )
>> > + return PSCI_INVALID_ADDRESS;
>> > +
>> > + /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
>> > + if ( is_hardware_domain(d) )
>> > + return PSCI_NOT_SUPPORTED;
>> > +
>> > + /* Ensure that all CPUs other than the calling one are offline */
>> > + domain_lock(d);
>> > + for_each_vcpu ( d, v )
>> > + {
>> > + if ( v != current && is_vcpu_online(v) )
>> > + {
>> > + domain_unlock(d);
>> > + return PSCI_DENIED;
>> > + }
>> > + }
>> > + domain_unlock(d);
>> > +
>> > + rc = domain_shutdown(d, SHUTDOWN_suspend);
>> > + if ( rc )
>> > + return PSCI_DENIED;
>> > +
>> > + d->arch.resume_ctx.ep = epoint;
>> > + d->arch.resume_ctx.cid = cid;
>> > + d->arch.resume_ctx.wake_cpu = current;
>> > +
>> > + gprintk(XENLOG_DEBUG,
>> > + "SYSTEM_SUSPEND requested, epoint=%#"PRIregister", cid=%#"PRIregister"\n",
>> > + epoint, cid);
>> > +
>> > + return rc;
>> > +}
>> > +
>> > static int32_t do_psci_1_0_features(uint32_t psci_func_id)
>> > {
>> > /* /!\ Ordered by function ID and not name */
>> > @@ -214,6 +269,8 @@ static int32_t do_psci_1_0_features(uint32_t psci_func_id)
>> > case PSCI_0_2_FN32_SYSTEM_OFF:
>> > case PSCI_0_2_FN32_SYSTEM_RESET:
>> > case PSCI_1_0_FN32_PSCI_FEATURES:
>> > + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
>> > + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
>> > case ARM_SMCCC_VERSION_FID:
>> > return 0;
>> > default:
>> > @@ -344,6 +401,23 @@ bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid)
>> > return true;
>> > }
>> >
>> > + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
>> > + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
>> > + {
>> > + register_t epoint = PSCI_ARG(regs, 1);
>> > + register_t cid = PSCI_ARG(regs, 2);
>> > +
>> > + if ( fid == PSCI_1_0_FN32_SYSTEM_SUSPEND )
>> > + {
>> > + epoint &= GENMASK(31, 0);
>> > + cid &= GENMASK(31, 0);
>> > + }
>> > +
>> > + perfc_incr(vpsci_system_suspend);
>> > + PSCI_SET_RESULT(regs, do_psci_1_0_system_suspend(epoint, cid));
>> > + return true;
>> > + }
>> > +
>> > default:
>> > return false;
>> > }
>> > diff --git a/xen/common/domain.c b/xen/common/domain.c
>> > index 93c71bc766..09ad0a26ee 100644
>> > --- a/xen/common/domain.c
>> > +++ b/xen/common/domain.c
>> > @@ -26,6 +26,7 @@
>> > #include <xen/hypercall.h>
>> > #include <xen/delay.h>
>> > #include <xen/shutdown.h>
>> > +#include <xen/suspend.h>
>> > #include <xen/percpu.h>
>> > #include <xen/multicall.h>
>> > #include <xen/rcupdate.h>
>> > @@ -1374,6 +1375,7 @@ int domain_shutdown(struct domain *d, u8 reason)
>> > void domain_resume(struct domain *d)
>> > {
>> > struct vcpu *v;
>> > + int rc;
>> >
>> > /*
>> > * Some code paths assume that shutdown status does not get reset under
>> > @@ -1383,6 +1385,13 @@ void domain_resume(struct domain *d)
>> >
>> > spin_lock(&d->shutdown_lock);
>> >
>> > + rc = arch_domain_resume(d);
>> > + if ( rc )
>> > + {
>> > + printk("%pd: Failed to resume domain (ret %d)\n", d, rc);
>>
>> I am wondering about this error path... Domain clearly can't be resumed
>> anymore. Should we crash it in this case? But domain is already shut
>> down, so domain_crash() would do nothing.
>>
>> Probably it is better to ensure that arch_domain_resume() will not
>> return an error by doing all required checks beforehand. Actually you
>> already doing this. So how we can get an error realistically?
>
> vpsci_vcpu_up_prepare() can still fail on the resume path, currently due
> to alloc_vcpu_guest_context() or arch_set_info_guest().
>
> So I agree with your point: it would be better to move all failure-prone
> preparation and validation to the suspend path, so that
> arch_domain_resume() does not need to return an error in normal
> operation.
>
> The approach I see is:
> - allocate and populate the guest context during suspend;
> - split arch_set_info_guest() into two parts, something like
> arch_validate_vcpu_guest_context() and
> arch_apply_vcpu_guest_context();
> - call only the validation step during suspend;
> - on resume, only apply the already validated guest context and then
> free it.
>
> This should make the resume path effectively non-failing and avoid the
> questionable error handling there.
>
> Does this sound like the right direction?
Yes, sounds good for me.
--
WBR, Volodymyr
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
2025-12-12 13:18 ` [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests Mykola Kvach
` (2 preceding siblings ...)
2026-03-18 4:18 ` Volodymyr Babchuk
@ 2026-03-19 9:36 ` Orzel, Michal
2026-03-19 10:14 ` Mykola Kvach
3 siblings, 1 reply; 15+ messages in thread
From: Orzel, Michal @ 2026-03-19 9:36 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Jan Beulich,
Roger Pau Monné
On 12/12/2025 14:18, Mykola Kvach wrote:
> From: Mykola Kvach <mykola_kvach@epam.com>
>
> Add support for the PSCI SYSTEM_SUSPEND function in the vPSCI interface,
> allowing guests to request suspend via the PSCI v1.0+ SYSTEM_SUSPEND call
> (both 32-bit and 64-bit variants).
>
> Implementation details:
> - Add SYSTEM_SUSPEND function IDs to PSCI definitions
> - Trap and handle SYSTEM_SUSPEND in vPSCI
> - Allow only non-hardware domains to invoke SYSTEM_SUSPEND; return
> PSCI_NOT_SUPPORTED for the hardware domain to avoid halting the system
> in hwdom_shutdown() via domain_shutdown
> - Require all secondary VCPUs of the calling domain to be offline before
> suspend, as mandated by the PSCI specification
>
> The arch_domain_resume() function is an architecture-specific hook that is
> invoked during domain resume to perform any necessary setup or restoration
> steps required by the platform. arch_domain_resume() stays int to propagate
> errno-style detail into common logging; preserving the integer keeps the
> reason visible and leaves room for future arch-specific failures or richer
> handling.
>
> The new vpsci_vcpu_up_prepare() helper is called on the resume path to set up
> the vCPU context (such as entry point, some system regs and context ID) before
> resuming a suspended guest. This keeps ARM/vPSCI-specific logic out of common
> code and avoids intrusive changes to the generic resume flow.
>
> Usage:
>
> For Linux-based guests, suspend can be initiated with:
> echo mem > /sys/power/state
> or via:
> systemctl suspend
>
> Resuming the guest is performed from control domain using:
> xl resume <domain>
>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in V16:
> - Refactor error handling in domain_resume: move logging to generic code,
> use explicit return code checking.
> - Make context clearing conditional on success in arch_domain_resume.
> - The 'int' return type is retained for arch_domain_resume for consistency
> with other arch hooks and to allow for specific negative error codes.
> ---
> xen/arch/arm/domain.c | 39 +++++++++
> xen/arch/arm/include/asm/domain.h | 2 +
> xen/arch/arm/include/asm/perfc_defn.h | 1 +
> xen/arch/arm/include/asm/psci.h | 2 +
> xen/arch/arm/include/asm/suspend.h | 27 ++++++
> xen/arch/arm/include/asm/vpsci.h | 5 +-
> xen/arch/arm/vpsci.c | 116 +++++++++++++++++++++-----
> xen/common/domain.c | 10 +++
> xen/include/xen/suspend.h | 25 ++++++
> 9 files changed, 205 insertions(+), 22 deletions(-)
> create mode 100644 xen/arch/arm/include/asm/suspend.h
> create mode 100644 xen/include/xen/suspend.h
>
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 47973f99d9..f903e7d4f0 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -12,6 +12,8 @@
> #include <xen/softirq.h>
> #include <xen/wait.h>
>
> +#include <public/sched.h>
> +
> #include <asm/arm64/sve.h>
> #include <asm/cpuerrata.h>
> #include <asm/cpufeature.h>
> @@ -24,10 +26,12 @@
> #include <asm/platform.h>
> #include <asm/procinfo.h>
> #include <asm/regs.h>
> +#include <asm/suspend.h>
> #include <asm/firmware/sci.h>
> #include <asm/tee/tee.h>
> #include <asm/vfp.h>
> #include <asm/vgic.h>
> +#include <asm/vpsci.h>
> #include <asm/vtimer.h>
>
> #include "vpci.h"
> @@ -851,6 +855,41 @@ void arch_domain_creation_finished(struct domain *d)
> p2m_domain_creation_finished(d);
> }
>
> +int arch_domain_resume(struct domain *d)
> +{
> + int rc;
> + struct resume_info *ctx = &d->arch.resume_ctx;
> +
> + if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend )
> + {
> + dprintk(XENLOG_WARNING,
> + "%pd: Invalid domain state for resume: is_shutting_down=%u, shutdown_code=%u\n",
> + d, d->is_shutting_down, d->shutdown_code);
> + return -EINVAL;
> + }
> +
> + /*
> + * It is still possible to call domain_shutdown() with a suspend reason
> + * via some hypercalls, such as SCHEDOP_shutdown or SCHEDOP_remote_shutdown.
> + * In these cases, the resume context will be empty.
> + * This is not expected to cause any issues, so we just notify about the
> + * situation and return without error, allowing the existing logic to
> + * proceed as expected.
> + */
> + if ( !ctx->wake_cpu )
> + {
> + dprintk(XENLOG_INFO, "%pd: Wake CPU pointer context was not provided\n",
> + d);
> + return 0;
> + }
> +
> + rc = vpsci_vcpu_up_prepare(ctx->wake_cpu , ctx->ep, ctx->cid);
> + if ( !rc )
> + memset(ctx, 0, sizeof(*ctx));
> +
> + return rc;
> +}
> +
> static int is_guest_pv32_psr(uint32_t psr)
> {
> switch (psr & PSR_MODE_MASK)
> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> index 758ad807e4..66b1246892 100644
> --- a/xen/arch/arm/include/asm/domain.h
> +++ b/xen/arch/arm/include/asm/domain.h
> @@ -5,6 +5,7 @@
> #include <xen/timer.h>
> #include <asm/page.h>
> #include <asm/p2m.h>
> +#include <asm/suspend.h>
> #include <asm/vfp.h>
> #include <asm/mmio.h>
> #include <asm/gic.h>
> @@ -126,6 +127,7 @@ struct arch_domain
> void *sci_data;
> #endif
>
> + struct resume_info resume_ctx;
> } __cacheline_aligned;
>
> struct arch_vcpu
> diff --git a/xen/arch/arm/include/asm/perfc_defn.h b/xen/arch/arm/include/asm/perfc_defn.h
> index effd25b69e..8dfcac7e3b 100644
> --- a/xen/arch/arm/include/asm/perfc_defn.h
> +++ b/xen/arch/arm/include/asm/perfc_defn.h
> @@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset, "vpsci: system_reset")
> PERFCOUNTER(vpsci_cpu_suspend, "vpsci: cpu_suspend")
> PERFCOUNTER(vpsci_cpu_affinity_info, "vpsci: cpu_affinity_info")
> PERFCOUNTER(vpsci_features, "vpsci: features")
> +PERFCOUNTER(vpsci_system_suspend, "vpsci: system_suspend")
>
> PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
>
> diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
> index 4780972621..48a93e6b79 100644
> --- a/xen/arch/arm/include/asm/psci.h
> +++ b/xen/arch/arm/include/asm/psci.h
> @@ -47,10 +47,12 @@ void call_psci_system_reset(void);
> #define PSCI_0_2_FN32_SYSTEM_OFF PSCI_0_2_FN32(8)
> #define PSCI_0_2_FN32_SYSTEM_RESET PSCI_0_2_FN32(9)
> #define PSCI_1_0_FN32_PSCI_FEATURES PSCI_0_2_FN32(10)
> +#define PSCI_1_0_FN32_SYSTEM_SUSPEND PSCI_0_2_FN32(14)
>
> #define PSCI_0_2_FN64_CPU_SUSPEND PSCI_0_2_FN64(1)
> #define PSCI_0_2_FN64_CPU_ON PSCI_0_2_FN64(3)
> #define PSCI_0_2_FN64_AFFINITY_INFO PSCI_0_2_FN64(4)
> +#define PSCI_1_0_FN64_SYSTEM_SUSPEND PSCI_0_2_FN64(14)
>
> /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
> #define PSCI_0_2_AFFINITY_LEVEL_ON 0
> diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
> new file mode 100644
> index 0000000000..313d03ea59
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/suspend.h
> @@ -0,0 +1,27 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef ARM_SUSPEND_H
> +#define ARM_SUSPEND_H
> +
> +struct domain;
> +struct vcpu;
> +
> +struct resume_info {
> + register_t ep;
> + register_t cid;
> + struct vcpu *wake_cpu;
> +};
> +
> +int arch_domain_resume(struct domain *d);
> +
> +#endif /* ARM_SUSPEND_H */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/include/asm/vpsci.h b/xen/arch/arm/include/asm/vpsci.h
> index 0cca5e6830..d790ab3715 100644
> --- a/xen/arch/arm/include/asm/vpsci.h
> +++ b/xen/arch/arm/include/asm/vpsci.h
> @@ -23,12 +23,15 @@
> #include <asm/psci.h>
>
> /* Number of function implemented by virtual PSCI (only 0.2 or later) */
> -#define VPSCI_NR_FUNCS 12
> +#define VPSCI_NR_FUNCS 14
>
> /* Functions handle PSCI calls from the guests */
> bool do_vpsci_0_1_call(struct cpu_user_regs *regs, uint32_t fid);
> bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid);
>
> +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
> + register_t context_id);
> +
> #endif /* __ASM_VPSCI_H__ */
>
> /*
> diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
> index 7ba9ccd94b..c4d616ec68 100644
> --- a/xen/arch/arm/vpsci.c
> +++ b/xen/arch/arm/vpsci.c
> @@ -10,32 +10,16 @@
>
> #include <public/sched.h>
>
> -static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> - register_t context_id)
> +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
> + register_t context_id)
> {
> - struct vcpu *v;
> - struct domain *d = current->domain;
> - struct vcpu_guest_context *ctxt;
> int rc;
> + struct domain *d = v->domain;
> bool is_thumb = entry_point & 1;
> - register_t vcpuid;
> -
> - vcpuid = vaffinity_to_vcpuid(target_cpu);
> -
> - if ( (v = domain_vcpu(d, vcpuid)) == NULL )
> - return PSCI_INVALID_PARAMETERS;
> -
> - /* THUMB set is not allowed with 64-bit domain */
> - if ( is_64bit_domain(d) && is_thumb )
> - return PSCI_INVALID_ADDRESS;
> -
> - if ( !test_bit(_VPF_down, &v->pause_flags) )
> - return PSCI_ALREADY_ON;
> + struct vcpu_guest_context *ctxt;
>
> if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
> - return PSCI_DENIED;
> -
> - vgic_clear_pending_irqs(v);
> + return -ENOMEM;
>
> memset(ctxt, 0, sizeof(*ctxt));
> ctxt->user_regs.pc64 = (u64) entry_point;
> @@ -76,8 +60,37 @@ static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> free_vcpu_guest_context(ctxt);
>
> if ( rc < 0 )
> + return rc;
> +
> + return 0;
> +}
> +
> +static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> + register_t context_id)
> +{
> + struct vcpu *v;
> + struct domain *d = current->domain;
> + int rc;
> + bool is_thumb = entry_point & 1;
> + register_t vcpuid;
> +
> + vcpuid = vaffinity_to_vcpuid(target_cpu);
> +
> + if ( (v = domain_vcpu(d, vcpuid)) == NULL )
> + return PSCI_INVALID_PARAMETERS;
> +
> + /* THUMB set is not allowed with 64-bit domain */
> + if ( is_64bit_domain(d) && is_thumb )
> + return PSCI_INVALID_ADDRESS;
> +
> + if ( !test_bit(_VPF_down, &v->pause_flags) )
> + return PSCI_ALREADY_ON;
> +
> + rc = vpsci_vcpu_up_prepare(v, entry_point, context_id);
> + if ( rc )
> return PSCI_DENIED;
>
> + vgic_clear_pending_irqs(v);
> vcpu_wake(v);
>
> return PSCI_SUCCESS;
> @@ -197,6 +210,48 @@ static void do_psci_0_2_system_reset(void)
> domain_shutdown(d,SHUTDOWN_reboot);
> }
>
> +static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
> +{
> + int32_t rc;
> + struct vcpu *v;
> + struct domain *d = current->domain;
> + bool is_thumb = epoint & 1;
> +
> + /* THUMB set is not allowed with 64-bit domain */
> + if ( is_64bit_domain(d) && is_thumb )
> + return PSCI_INVALID_ADDRESS;
> +
> + /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
> + if ( is_hardware_domain(d) )
> + return PSCI_NOT_SUPPORTED;
> +
> + /* Ensure that all CPUs other than the calling one are offline */
> + domain_lock(d);
> + for_each_vcpu ( d, v )
> + {
> + if ( v != current && is_vcpu_online(v) )
> + {
> + domain_unlock(d);
> + return PSCI_DENIED;
> + }
> + }
> + domain_unlock(d);
> +
> + rc = domain_shutdown(d, SHUTDOWN_suspend);
There is a theoretical race between this and ...
> + if ( rc )
> + return PSCI_DENIED;
> +
> + d->arch.resume_ctx.ep = epoint;
> + d->arch.resume_ctx.cid = cid;
> + d->arch.resume_ctx.wake_cpu = current;
setting up resume context. It is practically impossible but still we could move
setting up the context before domain_shutdown and clearing it on failure to be
race free. The race is about small window between domain_shutdown releasing the
shutdown_lock and other CPU calling domain_resume, taking the lock and running
arch_domain_resume before resume_ctx has been populated.
~Michal
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
2026-03-19 9:36 ` Orzel, Michal
@ 2026-03-19 10:14 ` Mykola Kvach
2026-03-19 11:22 ` Orzel, Michal
0 siblings, 1 reply; 15+ messages in thread
From: Mykola Kvach @ 2026-03-19 10:14 UTC (permalink / raw)
To: Orzel, Michal
Cc: xen-devel, Mykola Kvach, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné
On Thu, Mar 19, 2026 at 11:37 AM Orzel, Michal <michal.orzel@amd.com> wrote:
>
>
>
> On 12/12/2025 14:18, Mykola Kvach wrote:
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > Add support for the PSCI SYSTEM_SUSPEND function in the vPSCI interface,
> > allowing guests to request suspend via the PSCI v1.0+ SYSTEM_SUSPEND call
> > (both 32-bit and 64-bit variants).
> >
> > Implementation details:
> > - Add SYSTEM_SUSPEND function IDs to PSCI definitions
> > - Trap and handle SYSTEM_SUSPEND in vPSCI
> > - Allow only non-hardware domains to invoke SYSTEM_SUSPEND; return
> > PSCI_NOT_SUPPORTED for the hardware domain to avoid halting the system
> > in hwdom_shutdown() via domain_shutdown
> > - Require all secondary VCPUs of the calling domain to be offline before
> > suspend, as mandated by the PSCI specification
> >
> > The arch_domain_resume() function is an architecture-specific hook that is
> > invoked during domain resume to perform any necessary setup or restoration
> > steps required by the platform. arch_domain_resume() stays int to propagate
> > errno-style detail into common logging; preserving the integer keeps the
> > reason visible and leaves room for future arch-specific failures or richer
> > handling.
> >
> > The new vpsci_vcpu_up_prepare() helper is called on the resume path to set up
> > the vCPU context (such as entry point, some system regs and context ID) before
> > resuming a suspended guest. This keeps ARM/vPSCI-specific logic out of common
> > code and avoids intrusive changes to the generic resume flow.
> >
> > Usage:
> >
> > For Linux-based guests, suspend can be initiated with:
> > echo mem > /sys/power/state
> > or via:
> > systemctl suspend
> >
> > Resuming the guest is performed from control domain using:
> > xl resume <domain>
> >
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Changes in V16:
> > - Refactor error handling in domain_resume: move logging to generic code,
> > use explicit return code checking.
> > - Make context clearing conditional on success in arch_domain_resume.
> > - The 'int' return type is retained for arch_domain_resume for consistency
> > with other arch hooks and to allow for specific negative error codes.
> > ---
> > xen/arch/arm/domain.c | 39 +++++++++
> > xen/arch/arm/include/asm/domain.h | 2 +
> > xen/arch/arm/include/asm/perfc_defn.h | 1 +
> > xen/arch/arm/include/asm/psci.h | 2 +
> > xen/arch/arm/include/asm/suspend.h | 27 ++++++
> > xen/arch/arm/include/asm/vpsci.h | 5 +-
> > xen/arch/arm/vpsci.c | 116 +++++++++++++++++++++-----
> > xen/common/domain.c | 10 +++
> > xen/include/xen/suspend.h | 25 ++++++
> > 9 files changed, 205 insertions(+), 22 deletions(-)
> > create mode 100644 xen/arch/arm/include/asm/suspend.h
> > create mode 100644 xen/include/xen/suspend.h
> >
> > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> > index 47973f99d9..f903e7d4f0 100644
> > --- a/xen/arch/arm/domain.c
> > +++ b/xen/arch/arm/domain.c
> > @@ -12,6 +12,8 @@
> > #include <xen/softirq.h>
> > #include <xen/wait.h>
> >
> > +#include <public/sched.h>
> > +
> > #include <asm/arm64/sve.h>
> > #include <asm/cpuerrata.h>
> > #include <asm/cpufeature.h>
> > @@ -24,10 +26,12 @@
> > #include <asm/platform.h>
> > #include <asm/procinfo.h>
> > #include <asm/regs.h>
> > +#include <asm/suspend.h>
> > #include <asm/firmware/sci.h>
> > #include <asm/tee/tee.h>
> > #include <asm/vfp.h>
> > #include <asm/vgic.h>
> > +#include <asm/vpsci.h>
> > #include <asm/vtimer.h>
> >
> > #include "vpci.h"
> > @@ -851,6 +855,41 @@ void arch_domain_creation_finished(struct domain *d)
> > p2m_domain_creation_finished(d);
> > }
> >
> > +int arch_domain_resume(struct domain *d)
> > +{
> > + int rc;
> > + struct resume_info *ctx = &d->arch.resume_ctx;
> > +
> > + if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend )
> > + {
> > + dprintk(XENLOG_WARNING,
> > + "%pd: Invalid domain state for resume: is_shutting_down=%u, shutdown_code=%u\n",
> > + d, d->is_shutting_down, d->shutdown_code);
> > + return -EINVAL;
> > + }
> > +
> > + /*
> > + * It is still possible to call domain_shutdown() with a suspend reason
> > + * via some hypercalls, such as SCHEDOP_shutdown or SCHEDOP_remote_shutdown.
> > + * In these cases, the resume context will be empty.
> > + * This is not expected to cause any issues, so we just notify about the
> > + * situation and return without error, allowing the existing logic to
> > + * proceed as expected.
> > + */
> > + if ( !ctx->wake_cpu )
> > + {
> > + dprintk(XENLOG_INFO, "%pd: Wake CPU pointer context was not provided\n",
> > + d);
> > + return 0;
> > + }
> > +
> > + rc = vpsci_vcpu_up_prepare(ctx->wake_cpu , ctx->ep, ctx->cid);
> > + if ( !rc )
> > + memset(ctx, 0, sizeof(*ctx));
> > +
> > + return rc;
> > +}
> > +
> > static int is_guest_pv32_psr(uint32_t psr)
> > {
> > switch (psr & PSR_MODE_MASK)
> > diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> > index 758ad807e4..66b1246892 100644
> > --- a/xen/arch/arm/include/asm/domain.h
> > +++ b/xen/arch/arm/include/asm/domain.h
> > @@ -5,6 +5,7 @@
> > #include <xen/timer.h>
> > #include <asm/page.h>
> > #include <asm/p2m.h>
> > +#include <asm/suspend.h>
> > #include <asm/vfp.h>
> > #include <asm/mmio.h>
> > #include <asm/gic.h>
> > @@ -126,6 +127,7 @@ struct arch_domain
> > void *sci_data;
> > #endif
> >
> > + struct resume_info resume_ctx;
> > } __cacheline_aligned;
> >
> > struct arch_vcpu
> > diff --git a/xen/arch/arm/include/asm/perfc_defn.h b/xen/arch/arm/include/asm/perfc_defn.h
> > index effd25b69e..8dfcac7e3b 100644
> > --- a/xen/arch/arm/include/asm/perfc_defn.h
> > +++ b/xen/arch/arm/include/asm/perfc_defn.h
> > @@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset, "vpsci: system_reset")
> > PERFCOUNTER(vpsci_cpu_suspend, "vpsci: cpu_suspend")
> > PERFCOUNTER(vpsci_cpu_affinity_info, "vpsci: cpu_affinity_info")
> > PERFCOUNTER(vpsci_features, "vpsci: features")
> > +PERFCOUNTER(vpsci_system_suspend, "vpsci: system_suspend")
> >
> > PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
> >
> > diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
> > index 4780972621..48a93e6b79 100644
> > --- a/xen/arch/arm/include/asm/psci.h
> > +++ b/xen/arch/arm/include/asm/psci.h
> > @@ -47,10 +47,12 @@ void call_psci_system_reset(void);
> > #define PSCI_0_2_FN32_SYSTEM_OFF PSCI_0_2_FN32(8)
> > #define PSCI_0_2_FN32_SYSTEM_RESET PSCI_0_2_FN32(9)
> > #define PSCI_1_0_FN32_PSCI_FEATURES PSCI_0_2_FN32(10)
> > +#define PSCI_1_0_FN32_SYSTEM_SUSPEND PSCI_0_2_FN32(14)
> >
> > #define PSCI_0_2_FN64_CPU_SUSPEND PSCI_0_2_FN64(1)
> > #define PSCI_0_2_FN64_CPU_ON PSCI_0_2_FN64(3)
> > #define PSCI_0_2_FN64_AFFINITY_INFO PSCI_0_2_FN64(4)
> > +#define PSCI_1_0_FN64_SYSTEM_SUSPEND PSCI_0_2_FN64(14)
> >
> > /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
> > #define PSCI_0_2_AFFINITY_LEVEL_ON 0
> > diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
> > new file mode 100644
> > index 0000000000..313d03ea59
> > --- /dev/null
> > +++ b/xen/arch/arm/include/asm/suspend.h
> > @@ -0,0 +1,27 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#ifndef ARM_SUSPEND_H
> > +#define ARM_SUSPEND_H
> > +
> > +struct domain;
> > +struct vcpu;
> > +
> > +struct resume_info {
> > + register_t ep;
> > + register_t cid;
> > + struct vcpu *wake_cpu;
> > +};
> > +
> > +int arch_domain_resume(struct domain *d);
> > +
> > +#endif /* ARM_SUSPEND_H */
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * tab-width: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > diff --git a/xen/arch/arm/include/asm/vpsci.h b/xen/arch/arm/include/asm/vpsci.h
> > index 0cca5e6830..d790ab3715 100644
> > --- a/xen/arch/arm/include/asm/vpsci.h
> > +++ b/xen/arch/arm/include/asm/vpsci.h
> > @@ -23,12 +23,15 @@
> > #include <asm/psci.h>
> >
> > /* Number of function implemented by virtual PSCI (only 0.2 or later) */
> > -#define VPSCI_NR_FUNCS 12
> > +#define VPSCI_NR_FUNCS 14
> >
> > /* Functions handle PSCI calls from the guests */
> > bool do_vpsci_0_1_call(struct cpu_user_regs *regs, uint32_t fid);
> > bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid);
> >
> > +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
> > + register_t context_id);
> > +
> > #endif /* __ASM_VPSCI_H__ */
> >
> > /*
> > diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
> > index 7ba9ccd94b..c4d616ec68 100644
> > --- a/xen/arch/arm/vpsci.c
> > +++ b/xen/arch/arm/vpsci.c
> > @@ -10,32 +10,16 @@
> >
> > #include <public/sched.h>
> >
> > -static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> > - register_t context_id)
> > +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
> > + register_t context_id)
> > {
> > - struct vcpu *v;
> > - struct domain *d = current->domain;
> > - struct vcpu_guest_context *ctxt;
> > int rc;
> > + struct domain *d = v->domain;
> > bool is_thumb = entry_point & 1;
> > - register_t vcpuid;
> > -
> > - vcpuid = vaffinity_to_vcpuid(target_cpu);
> > -
> > - if ( (v = domain_vcpu(d, vcpuid)) == NULL )
> > - return PSCI_INVALID_PARAMETERS;
> > -
> > - /* THUMB set is not allowed with 64-bit domain */
> > - if ( is_64bit_domain(d) && is_thumb )
> > - return PSCI_INVALID_ADDRESS;
> > -
> > - if ( !test_bit(_VPF_down, &v->pause_flags) )
> > - return PSCI_ALREADY_ON;
> > + struct vcpu_guest_context *ctxt;
> >
> > if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
> > - return PSCI_DENIED;
> > -
> > - vgic_clear_pending_irqs(v);
> > + return -ENOMEM;
> >
> > memset(ctxt, 0, sizeof(*ctxt));
> > ctxt->user_regs.pc64 = (u64) entry_point;
> > @@ -76,8 +60,37 @@ static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> > free_vcpu_guest_context(ctxt);
> >
> > if ( rc < 0 )
> > + return rc;
> > +
> > + return 0;
> > +}
> > +
> > +static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> > + register_t context_id)
> > +{
> > + struct vcpu *v;
> > + struct domain *d = current->domain;
> > + int rc;
> > + bool is_thumb = entry_point & 1;
> > + register_t vcpuid;
> > +
> > + vcpuid = vaffinity_to_vcpuid(target_cpu);
> > +
> > + if ( (v = domain_vcpu(d, vcpuid)) == NULL )
> > + return PSCI_INVALID_PARAMETERS;
> > +
> > + /* THUMB set is not allowed with 64-bit domain */
> > + if ( is_64bit_domain(d) && is_thumb )
> > + return PSCI_INVALID_ADDRESS;
> > +
> > + if ( !test_bit(_VPF_down, &v->pause_flags) )
> > + return PSCI_ALREADY_ON;
> > +
> > + rc = vpsci_vcpu_up_prepare(v, entry_point, context_id);
> > + if ( rc )
> > return PSCI_DENIED;
> >
> > + vgic_clear_pending_irqs(v);
> > vcpu_wake(v);
> >
> > return PSCI_SUCCESS;
> > @@ -197,6 +210,48 @@ static void do_psci_0_2_system_reset(void)
> > domain_shutdown(d,SHUTDOWN_reboot);
> > }
> >
> > +static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
> > +{
> > + int32_t rc;
> > + struct vcpu *v;
> > + struct domain *d = current->domain;
> > + bool is_thumb = epoint & 1;
> > +
> > + /* THUMB set is not allowed with 64-bit domain */
> > + if ( is_64bit_domain(d) && is_thumb )
> > + return PSCI_INVALID_ADDRESS;
> > +
> > + /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
> > + if ( is_hardware_domain(d) )
> > + return PSCI_NOT_SUPPORTED;
> > +
> > + /* Ensure that all CPUs other than the calling one are offline */
> > + domain_lock(d);
> > + for_each_vcpu ( d, v )
> > + {
> > + if ( v != current && is_vcpu_online(v) )
> > + {
> > + domain_unlock(d);
> > + return PSCI_DENIED;
> > + }
> > + }
> > + domain_unlock(d);
> > +
> > + rc = domain_shutdown(d, SHUTDOWN_suspend);
> There is a theoretical race between this and ...
> > + if ( rc )
> > + return PSCI_DENIED;
> > +
> > + d->arch.resume_ctx.ep = epoint;
> > + d->arch.resume_ctx.cid = cid;
> > + d->arch.resume_ctx.wake_cpu = current;
> setting up resume context. It is practically impossible but still we could move
> setting up the context before domain_shutdown and clearing it on failure to be
> race free. The race is about small window between domain_shutdown releasing the
> shutdown_lock and other CPU calling domain_resume, taking the lock and running
> arch_domain_resume before resume_ctx has been populated.
I think the current flow is already serialized, so I don't believe
arch_domain_resume() can observe an uninitialized resume_ctx here.
A concurrent domain_resume() does not take shutdown_lock first. It starts
with domain_pause(d), and that path is synchronous: it walks all domain
vCPUs and uses vcpu_sleep_sync().
In the SYSTEM_SUSPEND path, the calling vCPU is still running after
domain_shutdown() returns, and it populates resume_ctx before leaving the
hypercall path. So if another CPU tries to resume the domain in that
window, it should block in domain_pause() until that vCPU stops running.
Only after that can it take shutdown_lock and proceed to
arch_domain_resume().
So, as far as I can see, arch_domain_resume() should not be able to
observe an uninitialized resume_ctx in the current flow.
Also, moving resume_ctx setup before domain_shutdown() would slightly
complicate the failure path, since we would then need to explicitly
clear/invalidate the pre-populated context if domain_shutdown() fails.
So unless we think this race is actually reachable, I would prefer to
keep the current ordering. That said, if you strongly prefer making the
ordering more explicit, I can rework it that way.
Best regards,
Mykola
>
> ~Michal
>
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v16 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
2026-03-19 10:14 ` Mykola Kvach
@ 2026-03-19 11:22 ` Orzel, Michal
0 siblings, 0 replies; 15+ messages in thread
From: Orzel, Michal @ 2026-03-19 11:22 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel, Mykola Kvach, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné
On 19/03/2026 11:14, Mykola Kvach wrote:
> On Thu, Mar 19, 2026 at 11:37 AM Orzel, Michal <michal.orzel@amd.com> wrote:
>>
>>
>>
>> On 12/12/2025 14:18, Mykola Kvach wrote:
>>> From: Mykola Kvach <mykola_kvach@epam.com>
>>>
>>> Add support for the PSCI SYSTEM_SUSPEND function in the vPSCI interface,
>>> allowing guests to request suspend via the PSCI v1.0+ SYSTEM_SUSPEND call
>>> (both 32-bit and 64-bit variants).
>>>
>>> Implementation details:
>>> - Add SYSTEM_SUSPEND function IDs to PSCI definitions
>>> - Trap and handle SYSTEM_SUSPEND in vPSCI
>>> - Allow only non-hardware domains to invoke SYSTEM_SUSPEND; return
>>> PSCI_NOT_SUPPORTED for the hardware domain to avoid halting the system
>>> in hwdom_shutdown() via domain_shutdown
>>> - Require all secondary VCPUs of the calling domain to be offline before
>>> suspend, as mandated by the PSCI specification
>>>
>>> The arch_domain_resume() function is an architecture-specific hook that is
>>> invoked during domain resume to perform any necessary setup or restoration
>>> steps required by the platform. arch_domain_resume() stays int to propagate
>>> errno-style detail into common logging; preserving the integer keeps the
>>> reason visible and leaves room for future arch-specific failures or richer
>>> handling.
>>>
>>> The new vpsci_vcpu_up_prepare() helper is called on the resume path to set up
>>> the vCPU context (such as entry point, some system regs and context ID) before
>>> resuming a suspended guest. This keeps ARM/vPSCI-specific logic out of common
>>> code and avoids intrusive changes to the generic resume flow.
>>>
>>> Usage:
>>>
>>> For Linux-based guests, suspend can be initiated with:
>>> echo mem > /sys/power/state
>>> or via:
>>> systemctl suspend
>>>
>>> Resuming the guest is performed from control domain using:
>>> xl resume <domain>
>>>
>>> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
>>> ---
>>> Changes in V16:
>>> - Refactor error handling in domain_resume: move logging to generic code,
>>> use explicit return code checking.
>>> - Make context clearing conditional on success in arch_domain_resume.
>>> - The 'int' return type is retained for arch_domain_resume for consistency
>>> with other arch hooks and to allow for specific negative error codes.
>>> ---
>>> xen/arch/arm/domain.c | 39 +++++++++
>>> xen/arch/arm/include/asm/domain.h | 2 +
>>> xen/arch/arm/include/asm/perfc_defn.h | 1 +
>>> xen/arch/arm/include/asm/psci.h | 2 +
>>> xen/arch/arm/include/asm/suspend.h | 27 ++++++
>>> xen/arch/arm/include/asm/vpsci.h | 5 +-
>>> xen/arch/arm/vpsci.c | 116 +++++++++++++++++++++-----
>>> xen/common/domain.c | 10 +++
>>> xen/include/xen/suspend.h | 25 ++++++
>>> 9 files changed, 205 insertions(+), 22 deletions(-)
>>> create mode 100644 xen/arch/arm/include/asm/suspend.h
>>> create mode 100644 xen/include/xen/suspend.h
>>>
>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>>> index 47973f99d9..f903e7d4f0 100644
>>> --- a/xen/arch/arm/domain.c
>>> +++ b/xen/arch/arm/domain.c
>>> @@ -12,6 +12,8 @@
>>> #include <xen/softirq.h>
>>> #include <xen/wait.h>
>>>
>>> +#include <public/sched.h>
>>> +
>>> #include <asm/arm64/sve.h>
>>> #include <asm/cpuerrata.h>
>>> #include <asm/cpufeature.h>
>>> @@ -24,10 +26,12 @@
>>> #include <asm/platform.h>
>>> #include <asm/procinfo.h>
>>> #include <asm/regs.h>
>>> +#include <asm/suspend.h>
>>> #include <asm/firmware/sci.h>
>>> #include <asm/tee/tee.h>
>>> #include <asm/vfp.h>
>>> #include <asm/vgic.h>
>>> +#include <asm/vpsci.h>
>>> #include <asm/vtimer.h>
>>>
>>> #include "vpci.h"
>>> @@ -851,6 +855,41 @@ void arch_domain_creation_finished(struct domain *d)
>>> p2m_domain_creation_finished(d);
>>> }
>>>
>>> +int arch_domain_resume(struct domain *d)
>>> +{
>>> + int rc;
>>> + struct resume_info *ctx = &d->arch.resume_ctx;
>>> +
>>> + if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend )
>>> + {
>>> + dprintk(XENLOG_WARNING,
>>> + "%pd: Invalid domain state for resume: is_shutting_down=%u, shutdown_code=%u\n",
>>> + d, d->is_shutting_down, d->shutdown_code);
>>> + return -EINVAL;
>>> + }
>>> +
>>> + /*
>>> + * It is still possible to call domain_shutdown() with a suspend reason
>>> + * via some hypercalls, such as SCHEDOP_shutdown or SCHEDOP_remote_shutdown.
>>> + * In these cases, the resume context will be empty.
>>> + * This is not expected to cause any issues, so we just notify about the
>>> + * situation and return without error, allowing the existing logic to
>>> + * proceed as expected.
>>> + */
>>> + if ( !ctx->wake_cpu )
>>> + {
>>> + dprintk(XENLOG_INFO, "%pd: Wake CPU pointer context was not provided\n",
>>> + d);
>>> + return 0;
>>> + }
>>> +
>>> + rc = vpsci_vcpu_up_prepare(ctx->wake_cpu , ctx->ep, ctx->cid);
>>> + if ( !rc )
>>> + memset(ctx, 0, sizeof(*ctx));
>>> +
>>> + return rc;
>>> +}
>>> +
>>> static int is_guest_pv32_psr(uint32_t psr)
>>> {
>>> switch (psr & PSR_MODE_MASK)
>>> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
>>> index 758ad807e4..66b1246892 100644
>>> --- a/xen/arch/arm/include/asm/domain.h
>>> +++ b/xen/arch/arm/include/asm/domain.h
>>> @@ -5,6 +5,7 @@
>>> #include <xen/timer.h>
>>> #include <asm/page.h>
>>> #include <asm/p2m.h>
>>> +#include <asm/suspend.h>
>>> #include <asm/vfp.h>
>>> #include <asm/mmio.h>
>>> #include <asm/gic.h>
>>> @@ -126,6 +127,7 @@ struct arch_domain
>>> void *sci_data;
>>> #endif
>>>
>>> + struct resume_info resume_ctx;
>>> } __cacheline_aligned;
>>>
>>> struct arch_vcpu
>>> diff --git a/xen/arch/arm/include/asm/perfc_defn.h b/xen/arch/arm/include/asm/perfc_defn.h
>>> index effd25b69e..8dfcac7e3b 100644
>>> --- a/xen/arch/arm/include/asm/perfc_defn.h
>>> +++ b/xen/arch/arm/include/asm/perfc_defn.h
>>> @@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset, "vpsci: system_reset")
>>> PERFCOUNTER(vpsci_cpu_suspend, "vpsci: cpu_suspend")
>>> PERFCOUNTER(vpsci_cpu_affinity_info, "vpsci: cpu_affinity_info")
>>> PERFCOUNTER(vpsci_features, "vpsci: features")
>>> +PERFCOUNTER(vpsci_system_suspend, "vpsci: system_suspend")
>>>
>>> PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
>>>
>>> diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
>>> index 4780972621..48a93e6b79 100644
>>> --- a/xen/arch/arm/include/asm/psci.h
>>> +++ b/xen/arch/arm/include/asm/psci.h
>>> @@ -47,10 +47,12 @@ void call_psci_system_reset(void);
>>> #define PSCI_0_2_FN32_SYSTEM_OFF PSCI_0_2_FN32(8)
>>> #define PSCI_0_2_FN32_SYSTEM_RESET PSCI_0_2_FN32(9)
>>> #define PSCI_1_0_FN32_PSCI_FEATURES PSCI_0_2_FN32(10)
>>> +#define PSCI_1_0_FN32_SYSTEM_SUSPEND PSCI_0_2_FN32(14)
>>>
>>> #define PSCI_0_2_FN64_CPU_SUSPEND PSCI_0_2_FN64(1)
>>> #define PSCI_0_2_FN64_CPU_ON PSCI_0_2_FN64(3)
>>> #define PSCI_0_2_FN64_AFFINITY_INFO PSCI_0_2_FN64(4)
>>> +#define PSCI_1_0_FN64_SYSTEM_SUSPEND PSCI_0_2_FN64(14)
>>>
>>> /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
>>> #define PSCI_0_2_AFFINITY_LEVEL_ON 0
>>> diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
>>> new file mode 100644
>>> index 0000000000..313d03ea59
>>> --- /dev/null
>>> +++ b/xen/arch/arm/include/asm/suspend.h
>>> @@ -0,0 +1,27 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>> +
>>> +#ifndef ARM_SUSPEND_H
>>> +#define ARM_SUSPEND_H
>>> +
>>> +struct domain;
>>> +struct vcpu;
>>> +
>>> +struct resume_info {
>>> + register_t ep;
>>> + register_t cid;
>>> + struct vcpu *wake_cpu;
>>> +};
>>> +
>>> +int arch_domain_resume(struct domain *d);
>>> +
>>> +#endif /* ARM_SUSPEND_H */
>>> +
>>> +/*
>>> + * Local variables:
>>> + * mode: C
>>> + * c-file-style: "BSD"
>>> + * c-basic-offset: 4
>>> + * tab-width: 4
>>> + * indent-tabs-mode: nil
>>> + * End:
>>> + */
>>> diff --git a/xen/arch/arm/include/asm/vpsci.h b/xen/arch/arm/include/asm/vpsci.h
>>> index 0cca5e6830..d790ab3715 100644
>>> --- a/xen/arch/arm/include/asm/vpsci.h
>>> +++ b/xen/arch/arm/include/asm/vpsci.h
>>> @@ -23,12 +23,15 @@
>>> #include <asm/psci.h>
>>>
>>> /* Number of function implemented by virtual PSCI (only 0.2 or later) */
>>> -#define VPSCI_NR_FUNCS 12
>>> +#define VPSCI_NR_FUNCS 14
>>>
>>> /* Functions handle PSCI calls from the guests */
>>> bool do_vpsci_0_1_call(struct cpu_user_regs *regs, uint32_t fid);
>>> bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid);
>>>
>>> +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
>>> + register_t context_id);
>>> +
>>> #endif /* __ASM_VPSCI_H__ */
>>>
>>> /*
>>> diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
>>> index 7ba9ccd94b..c4d616ec68 100644
>>> --- a/xen/arch/arm/vpsci.c
>>> +++ b/xen/arch/arm/vpsci.c
>>> @@ -10,32 +10,16 @@
>>>
>>> #include <public/sched.h>
>>>
>>> -static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
>>> - register_t context_id)
>>> +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
>>> + register_t context_id)
>>> {
>>> - struct vcpu *v;
>>> - struct domain *d = current->domain;
>>> - struct vcpu_guest_context *ctxt;
>>> int rc;
>>> + struct domain *d = v->domain;
>>> bool is_thumb = entry_point & 1;
>>> - register_t vcpuid;
>>> -
>>> - vcpuid = vaffinity_to_vcpuid(target_cpu);
>>> -
>>> - if ( (v = domain_vcpu(d, vcpuid)) == NULL )
>>> - return PSCI_INVALID_PARAMETERS;
>>> -
>>> - /* THUMB set is not allowed with 64-bit domain */
>>> - if ( is_64bit_domain(d) && is_thumb )
>>> - return PSCI_INVALID_ADDRESS;
>>> -
>>> - if ( !test_bit(_VPF_down, &v->pause_flags) )
>>> - return PSCI_ALREADY_ON;
>>> + struct vcpu_guest_context *ctxt;
>>>
>>> if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
>>> - return PSCI_DENIED;
>>> -
>>> - vgic_clear_pending_irqs(v);
>>> + return -ENOMEM;
>>>
>>> memset(ctxt, 0, sizeof(*ctxt));
>>> ctxt->user_regs.pc64 = (u64) entry_point;
>>> @@ -76,8 +60,37 @@ static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
>>> free_vcpu_guest_context(ctxt);
>>>
>>> if ( rc < 0 )
>>> + return rc;
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
>>> + register_t context_id)
>>> +{
>>> + struct vcpu *v;
>>> + struct domain *d = current->domain;
>>> + int rc;
>>> + bool is_thumb = entry_point & 1;
>>> + register_t vcpuid;
>>> +
>>> + vcpuid = vaffinity_to_vcpuid(target_cpu);
>>> +
>>> + if ( (v = domain_vcpu(d, vcpuid)) == NULL )
>>> + return PSCI_INVALID_PARAMETERS;
>>> +
>>> + /* THUMB set is not allowed with 64-bit domain */
>>> + if ( is_64bit_domain(d) && is_thumb )
>>> + return PSCI_INVALID_ADDRESS;
>>> +
>>> + if ( !test_bit(_VPF_down, &v->pause_flags) )
>>> + return PSCI_ALREADY_ON;
>>> +
>>> + rc = vpsci_vcpu_up_prepare(v, entry_point, context_id);
>>> + if ( rc )
>>> return PSCI_DENIED;
>>>
>>> + vgic_clear_pending_irqs(v);
>>> vcpu_wake(v);
>>>
>>> return PSCI_SUCCESS;
>>> @@ -197,6 +210,48 @@ static void do_psci_0_2_system_reset(void)
>>> domain_shutdown(d,SHUTDOWN_reboot);
>>> }
>>>
>>> +static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
>>> +{
>>> + int32_t rc;
>>> + struct vcpu *v;
>>> + struct domain *d = current->domain;
>>> + bool is_thumb = epoint & 1;
>>> +
>>> + /* THUMB set is not allowed with 64-bit domain */
>>> + if ( is_64bit_domain(d) && is_thumb )
>>> + return PSCI_INVALID_ADDRESS;
>>> +
>>> + /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
>>> + if ( is_hardware_domain(d) )
>>> + return PSCI_NOT_SUPPORTED;
>>> +
>>> + /* Ensure that all CPUs other than the calling one are offline */
>>> + domain_lock(d);
>>> + for_each_vcpu ( d, v )
>>> + {
>>> + if ( v != current && is_vcpu_online(v) )
>>> + {
>>> + domain_unlock(d);
>>> + return PSCI_DENIED;
>>> + }
>>> + }
>>> + domain_unlock(d);
>>> +
>>> + rc = domain_shutdown(d, SHUTDOWN_suspend);
>> There is a theoretical race between this and ...
>>> + if ( rc )
>>> + return PSCI_DENIED;
>>> +
>>> + d->arch.resume_ctx.ep = epoint;
>>> + d->arch.resume_ctx.cid = cid;
>>> + d->arch.resume_ctx.wake_cpu = current;
>> setting up resume context. It is practically impossible but still we could move
>> setting up the context before domain_shutdown and clearing it on failure to be
>> race free. The race is about small window between domain_shutdown releasing the
>> shutdown_lock and other CPU calling domain_resume, taking the lock and running
>> arch_domain_resume before resume_ctx has been populated.
>
> I think the current flow is already serialized, so I don't believe
> arch_domain_resume() can observe an uninitialized resume_ctx here.
>
> A concurrent domain_resume() does not take shutdown_lock first. It starts
> with domain_pause(d), and that path is synchronous: it walks all domain
> vCPUs and uses vcpu_sleep_sync().
>
> In the SYSTEM_SUSPEND path, the calling vCPU is still running after
> domain_shutdown() returns, and it populates resume_ctx before leaving the
> hypercall path. So if another CPU tries to resume the domain in that
> window, it should block in domain_pause() until that vCPU stops running.
> Only after that can it take shutdown_lock and proceed to
> arch_domain_resume().
>
> So, as far as I can see, arch_domain_resume() should not be able to
> observe an uninitialized resume_ctx in the current flow.
>
> Also, moving resume_ctx setup before domain_shutdown() would slightly
> complicate the failure path, since we would then need to explicitly
> clear/invalidate the pre-populated context if domain_shutdown() fails.
>
> So unless we think this race is actually reachable, I would prefer to
> keep the current ordering. That said, if you strongly prefer making the
> ordering more explicit, I can rework it that way.
No, you're right. Resuming CPU will block until the suspending vCPU has fully
stopped, so the race is impossible.
~Michal
^ permalink raw reply [flat|nested] 15+ messages in thread