* [PATCH v9 01/13] xen/arm: Add suspend and resume timer helpers
2026-05-12 17:07 [PATCH v9 00/13] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
@ 2026-05-12 17:07 ` Mykola Kvach
2026-05-12 17:07 ` [PATCH v9 02/13] xen/arm: gic-v2: Implement GIC suspend/resume functions Mykola Kvach
` (11 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Mykola Kvach @ 2026-05-12 17:07 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Julien Grall, Luca Fancellu
From: Mirela Simonovic <mirela.simonovic@aggios.com>
Timer interrupts must be disabled while the system is suspended to prevent
spurious wake-ups. Suspending timers in Xen consists of disabling the
physical timer and the hypervisor timer on the current CPU. The virtual
timer does not need explicit handling here, as it is already disabled on
vCPU context switch and its state is restored per-vCPU on the next context
restore.
Resuming consists of raising TIMER_SOFTIRQ, which prompts the generic
timer code to reprogram the hypervisor timer with the correct timeout.
Xen does not use or expose the physical timer, so it remains disabled
across suspend/resume.
Introduce a new helper, disable_phys_hyp_timers(), to encapsulate disabling
of the physical and hypervisor timers.
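The flow above can be sketched as a small host-side model (illustrative only, not Xen code: the sysreg writes are replaced by plain variables, and raising TIMER_SOFTIRQ is modelled as a flag that the generic timer code would later consume to reprogram the hypervisor timer):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the timer control sysregs and the softirq. */
static uint32_t cntp_ctl;          /* models CNTP_CTL_EL0 */
static uint32_t cnthp_ctl;         /* models CNTHP_CTL_EL2 */
static bool timer_softirq_pending; /* models TIMER_SOFTIRQ */

static void disable_phys_hyp_timers(void)
{
    cntp_ctl = 0;  /* physical timer disabled */
    cnthp_ctl = 0; /* hypervisor timer disabled */
}

static void time_suspend(void)
{
    /* Virtual timer is already disabled by the vCPU context switch. */
    disable_phys_hyp_timers();
}

static void time_resume(void)
{
    /*
     * The softirq handler reprograms the hypervisor timer later with
     * the correct timeout; the physical timer stays disabled.
     */
    timer_softirq_pending = true;
}
```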
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes in V7:
- Dropped EL1/EL2 wording; use "physical timer" and "hypervisor timer"
- Renamed helper to disable_phys_hyp_timers() to reflect its actual scope
- Clarified virtual timer handling (disabled on vCPU switch-out, restored
on context restore) and added comments in suspend/resume paths
- Added resume comment explaining which timers are restored by
TIMER_SOFTIRQ
---
xen/arch/arm/include/asm/time.h | 5 ++++
xen/arch/arm/time.c | 44 ++++++++++++++++++++++++++++-----
2 files changed, 43 insertions(+), 6 deletions(-)
diff --git a/xen/arch/arm/include/asm/time.h b/xen/arch/arm/include/asm/time.h
index c194dbb9f5..9313b157ea 100644
--- a/xen/arch/arm/include/asm/time.h
+++ b/xen/arch/arm/include/asm/time.h
@@ -105,6 +105,11 @@ void preinit_xen_time(void);
void force_update_vcpu_system_time(struct vcpu *v);
+#ifdef CONFIG_SYSTEM_SUSPEND
+void time_suspend(void);
+void time_resume(void);
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
#endif /* __ARM_TIME_H__ */
/*
* Local variables:
diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
index 6955b2788f..fff8e4aca6 100644
--- a/xen/arch/arm/time.c
+++ b/xen/arch/arm/time.c
@@ -296,6 +296,14 @@ static void check_timer_irq_cfg(unsigned int irq, const char *which)
static DEFINE_PER_CPU_READ_MOSTLY(struct irqaction, irq_hyp);
static DEFINE_PER_CPU_READ_MOSTLY(struct irqaction, irq_virt);
+/* Disable physical and hypervisor timers on the current CPU */
+static inline void disable_phys_hyp_timers(void)
+{
+ WRITE_SYSREG(0, CNTP_CTL_EL0); /* Physical timer disabled */
+ WRITE_SYSREG(0, CNTHP_CTL_EL2); /* Hypervisor's timer disabled */
+ isb();
+}
+
/* Set up the timer interrupt on this CPU */
void init_timer_interrupt(void)
{
@@ -306,9 +314,7 @@ void init_timer_interrupt(void)
WRITE_SYSREG64(0, CNTVOFF_EL2); /* No VM-specific offset */
/* Do not let the VMs program the physical timer, only read the physical counter */
WRITE_SYSREG(CNTHCTL_EL2_EL1PCTEN, CNTHCTL_EL2);
- WRITE_SYSREG(0, CNTP_CTL_EL0); /* Physical timer disabled */
- WRITE_SYSREG(0, CNTHP_CTL_EL2); /* Hypervisor's timer disabled */
- isb();
+ disable_phys_hyp_timers();
hyp_action->name = "hyptimer";
hyp_action->handler = htimer_interrupt;
@@ -333,9 +339,7 @@ void init_timer_interrupt(void)
*/
static void deinit_timer_interrupt(void)
{
- WRITE_SYSREG(0, CNTP_CTL_EL0); /* Disable physical timer */
- WRITE_SYSREG(0, CNTHP_CTL_EL2); /* Disable hypervisor's timer */
- isb();
+ disable_phys_hyp_timers();
release_irq(timer_irq[TIMER_HYP_PPI], NULL);
release_irq(timer_irq[TIMER_VIRT_PPI], NULL);
@@ -375,6 +379,34 @@ void domain_set_time_offset(struct domain *d, int64_t time_offset_seconds)
/* XXX update guest visible wallclock time */
}
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+void time_suspend(void)
+{
+ /* CNTV already disabled by virt_timer_save() during vcpu context switch. */
+ disable_phys_hyp_timers();
+}
+
+void time_resume(void)
+{
+ /*
+ * Raising TIMER_SOFTIRQ triggers generic timer code to reprogram the
+ * hypervisor timer with the correct timeout (not known here).
+ *
+ * Xen doesn't use or expose the physical timer, so it remains disabled
+ * across suspend/resume.
+ *
+ * The virtual timer state is restored per-vCPU on the next context switch.
+ *
+ * No further action is needed to restore timekeeping after power down,
+ * since the system counter is unaffected. See ARM DDI 0487 L.a, D12.1.2
+ * "The system counter must be implemented in an always-on power domain."
+ */
+ raise_softirq(TIMER_SOFTIRQ);
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
static int cpu_time_callback(struct notifier_block *nfb,
unsigned long action,
void *hcpu)
--
2.43.0
* [PATCH v9 02/13] xen/arm: gic-v2: Implement GIC suspend/resume functions
From: Mykola Kvach @ 2026-05-12 17:07 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk
From: Mirela Simonovic <mirela.simonovic@aggios.com>
System suspend may lead to a state where GIC would be powered down.
Therefore, Xen should save/restore the context of GIC on suspend/resume.
Note that the context consists of states of registers which are
controlled by the hypervisor. Other GIC registers which are accessible
by guests are saved/restored on context switch.
Transient physical SGI pending state (GICD_CPENDSGIRn/GICD_SPENDSGIRn)
is intentionally excluded. CPU-interface active-priority state is also
not restored across suspend/resume. Xen reaches the final suspend path
at a quiescent point, so there is no active-priority execution context
to replay after resume. Enforce this with a runtime check after
disabling the CPU interface: if any implemented GICC_APRn word is still
non-zero, restore GICC_CTLR and abort suspend with -EBUSY.
This does not apply to distributor active state. With GICv2 EOImode==1,
EOIR only drops the interrupt priority; final deactivation is a separate
step. For guest-routed interrupts, Xen can have already EOIed the physical
IRQ while deactivation is still pending on the vGIC/GICV path. Therefore
GICD_ISACTIVER is preserved as architectural in-flight interrupt state.
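The EOImode==1 split described above can be modelled on the host (illustrative only, not Xen code; the struct and function names are made up for the sketch): acknowledging an interrupt sets both the distributor active state and the CPU-interface priority state, EOIR only drops the priority, and deactivation is a separate step.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal model of one physical interrupt's in-flight state. */
struct irq_model {
    bool active_in_dist;  /* models the GICD_ISACTIVER bit */
    bool priority_active; /* models the GICC_APRn state */
};

static void ack(struct irq_model *irq)
{
    irq->active_in_dist = true;
    irq->priority_active = true;
}

static void eoir(struct irq_model *irq)
{
    /* EOImode==1: priority drop only, active state untouched. */
    irq->priority_active = false;
}

static void deactivate(struct irq_model *irq)
{
    /* Separate deactivation step (GICC_DIR, or guest EOI via GICV). */
    irq->active_in_dist = false;
}
```

This is why the suspend path can require GICC_APRn to be quiescent while still saving and restoring GICD_ISACTIVER: after Xen's EOIR, a guest-routed interrupt may legitimately sit in the middle state.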
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in V9:
- Skip saving/restoring GICD_ITARGETSR0..7 because SGI/PPI target
registers hold no state (read-only on MP, RAZ/WI on UP).
- Add a runtime GICC_APRn quiescence check after disabling the CPU
interface, and restore GICC_CTLR before returning -EBUSY.
Changes in V8:
- disable cpu interface + distributor before suspend
- change 0xffffffff to GENMASK;
- cosmetic changes;
Changes in V7:
- Allocate one contiguous memory block for the GICv2 dist suspend context.
- gicv2_resume() no longer unconditionally re-enables the distributor/CPU
interface; it now writes back the saved CTLR values as-is.
- gicv2_alloc_context() now returns 0 on success and panics on failure,
since suspend context allocation is not recoverable.
---
xen/arch/arm/gic-v2.c | 181 +++++++++++++++++++++++++++++++++
xen/arch/arm/gic.c | 29 ++++++
xen/arch/arm/include/asm/gic.h | 12 +++
3 files changed, 222 insertions(+)
diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index 43a379fdda..1970c4feb9 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -1108,6 +1108,178 @@ static int gicv2_iomem_deny_access(struct domain *d)
return iomem_deny_access(d, mfn, mfn + nr - 1);
}
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+/* This struct represents block of 32 IRQs */
+struct irq_block {
+ uint32_t icfgr[2]; /* 2 registers of 16 IRQs each */
+ uint32_t ipriorityr[8];
+ uint32_t isenabler;
+ uint32_t isactiver;
+ uint32_t itargetsr[8];
+};
+
+/* GICv2 registers to be saved/restored on system suspend/resume */
+struct gicv2_context {
+ /* GICC context */
+ struct cpu_ctx {
+ uint32_t ctlr;
+ uint32_t pmr;
+ uint32_t bpr;
+ } cpu;
+
+ /* GICD context */
+ struct dist_ctx {
+ uint32_t ctlr;
+ /* Includes banked SGI/PPI state for the boot CPU. */
+ struct irq_block *irqs;
+ } dist;
+};
+
+static struct gicv2_context gic_ctx;
+
+#define GICV2_NR_APRS 4
+
+static int gicv2_suspend(void)
+{
+ unsigned int i, blocks = DIV_ROUND_UP(gicv2_info.nr_lines, 32);
+
+ /* Save GICC_CTLR configuration. */
+ gic_ctx.cpu.ctlr = readl_gicc(GICC_CTLR);
+
+ /* Quiesce the GIC CPU interface before suspend. */
+ gicv2_cpu_disable();
+
+ /*
+ * Check the active-priority state for the group Xen drives through the
+ * CPU interface. GICC_CTL_ENABLE enables Group 0 without SecurityExtn and
+ * Group 1 in Xen's Non-secure view with SecurityExtn, and in both cases
+ * the relevant state is visible through GICC_APRn. Only zero/non-zero
+ * matters here, not the implementation-defined bit layout.
+ */
+ for ( i = 0; i < GICV2_NR_APRS; i++ )
+ {
+ uint32_t apr = readl_gicc(GICC_APR + i * 4);
+
+ if ( !apr )
+ continue;
+
+ printk(XENLOG_ERR "GICv2: suspend aborted: GICC_APR%u=%#08x\n", i, apr);
+ writel_gicc(gic_ctx.cpu.ctlr, GICC_CTLR);
+ return -EBUSY;
+ }
+
+ /* Save GICD configuration */
+ gic_ctx.dist.ctlr = readl_gicd(GICD_CTLR);
+ writel_gicd(0, GICD_CTLR);
+
+ gic_ctx.cpu.pmr = readl_gicc(GICC_PMR);
+ gic_ctx.cpu.bpr = readl_gicc(GICC_BPR);
+
+ for ( i = 0; i < blocks; i++ )
+ {
+ struct irq_block *irqs = gic_ctx.dist.irqs + i;
+ size_t j, off = i * sizeof(irqs->isenabler);
+
+ irqs->isenabler = readl_gicd(GICD_ISENABLER + off);
+
+ /*
+ * Save distributor active state as part of the hypervisor-owned
+ * physical interrupt state. In GICv2 EOImode==1, EOIR only drops the
+ * priority; final deactivation is separate. For guest-routed
+ * interrupts, Xen may have EOIed the physical IRQ while the guest/vGIC
+ * side still owns the deactivate step. Therefore GICD_ISACTIVER can
+ * legitimately remain set even though transient SGI pending state and
+ * CPU-interface active-priority state are expected to be quiesced here.
+ */
+ irqs->isactiver = readl_gicd(GICD_ISACTIVER + off);
+
+ off = i * sizeof(irqs->ipriorityr);
+ for ( j = 0; j < ARRAY_SIZE(irqs->ipriorityr); j++ )
+ irqs->ipriorityr[j] = readl_gicd(GICD_IPRIORITYR + off + j * 4);
+
+ /*
+ * GICD_ITARGETSR0..7 cover SGIs/PPIs and hold no state to save:
+ * they are read-only on multiprocessor implementations and RAZ/WI
+ * on uniprocessor implementations.
+ */
+ if ( i )
+ {
+ off = i * sizeof(irqs->itargetsr);
+ for ( j = 0; j < ARRAY_SIZE(irqs->itargetsr); j++ )
+ irqs->itargetsr[j] = readl_gicd(GICD_ITARGETSR + off + j * 4);
+ }
+
+ off = i * sizeof(irqs->icfgr);
+ for ( j = 0; j < ARRAY_SIZE(irqs->icfgr); j++ )
+ irqs->icfgr[j] = readl_gicd(GICD_ICFGR + off + j * 4);
+ }
+
+ return 0;
+}
+
+static void gicv2_resume(void)
+{
+ unsigned int i, blocks = DIV_ROUND_UP(gicv2_info.nr_lines, 32);
+
+ gicv2_cpu_disable();
+ /* Disable distributor */
+ writel_gicd(0, GICD_CTLR);
+
+ for ( i = 0; i < blocks; i++ )
+ {
+ struct irq_block *irqs = gic_ctx.dist.irqs + i;
+ size_t j, off = i * sizeof(irqs->isenabler);
+
+ writel_gicd(GENMASK(31, 0), GICD_ICENABLER + off);
+ writel_gicd(irqs->isenabler, GICD_ISENABLER + off);
+
+ writel_gicd(GENMASK(31, 0), GICD_ICACTIVER + off);
+ writel_gicd(irqs->isactiver, GICD_ISACTIVER + off);
+
+ off = i * sizeof(irqs->ipriorityr);
+ for ( j = 0; j < ARRAY_SIZE(irqs->ipriorityr); j++ )
+ writel_gicd(irqs->ipriorityr[j], GICD_IPRIORITYR + off + j * 4);
+
+ /*
+ * GICD_ITARGETSR0..7 cover SGIs/PPIs and hold no state to save:
+ * they are read-only on multiprocessor implementations and RAZ/WI
+ * on uniprocessor implementations.
+ */
+ if ( i )
+ {
+ off = i * sizeof(irqs->itargetsr);
+ for ( j = 0; j < ARRAY_SIZE(irqs->itargetsr); j++ )
+ writel_gicd(irqs->itargetsr[j], GICD_ITARGETSR + off + j * 4);
+ }
+
+ off = i * sizeof(irqs->icfgr);
+ for ( j = 0; j < ARRAY_SIZE(irqs->icfgr); j++ )
+ writel_gicd(irqs->icfgr[j], GICD_ICFGR + off + j * 4);
+ }
+
+ /* Make sure all registers are restored and enable distributor */
+ writel_gicd(gic_ctx.dist.ctlr, GICD_CTLR);
+
+ /* Restore GIC CPU interface configuration */
+ writel_gicc(gic_ctx.cpu.pmr, GICC_PMR);
+ writel_gicc(gic_ctx.cpu.bpr, GICC_BPR);
+
+ /* Enable GIC CPU interface */
+ writel_gicc(gic_ctx.cpu.ctlr, GICC_CTLR);
+}
+
+static void __init gicv2_alloc_context(void)
+{
+ uint32_t blocks = DIV_ROUND_UP(gicv2_info.nr_lines, 32);
+
+ gic_ctx.dist.irqs = xzalloc_array(struct irq_block, blocks);
+ if ( !gic_ctx.dist.irqs )
+ panic("Failed to allocate memory for GICv2 suspend context\n");
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
#ifdef CONFIG_ACPI
static unsigned long gicv2_get_hwdom_extra_madt_size(const struct domain *d)
{
@@ -1312,6 +1484,11 @@ static int __init gicv2_init(void)
spin_unlock(&gicv2.lock);
+#ifdef CONFIG_SYSTEM_SUSPEND
+ /* Allocate memory to be used for saving GIC context during the suspend */
+ gicv2_alloc_context();
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
return 0;
}
@@ -1355,6 +1532,10 @@ static const struct gic_hw_operations gicv2_ops = {
.map_hwdom_extra_mappings = gicv2_map_hwdom_extra_mappings,
.iomem_deny_access = gicv2_iomem_deny_access,
.do_LPI = gicv2_do_LPI,
+#ifdef CONFIG_SYSTEM_SUSPEND
+ .suspend = gicv2_suspend,
+ .resume = gicv2_resume,
+#endif /* CONFIG_SYSTEM_SUSPEND */
};
/* Set up the GIC */
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index ee75258fc3..7727ffed5a 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -432,6 +432,35 @@ int gic_iomem_deny_access(struct domain *d)
return gic_hw_ops->iomem_deny_access(d);
}
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+int gic_suspend(void)
+{
+ /* Must be called by boot CPU#0 with interrupts disabled */
+ ASSERT(!local_irq_is_enabled());
+ ASSERT(!smp_processor_id());
+
+ if ( !gic_hw_ops->suspend || !gic_hw_ops->resume )
+ return -ENOSYS;
+
+ return gic_hw_ops->suspend();
+}
+
+void gic_resume(void)
+{
+ /*
+ * Must be called by boot CPU#0 with interrupts disabled after gic_suspend
+ * has returned successfully.
+ */
+ ASSERT(!local_irq_is_enabled());
+ ASSERT(!smp_processor_id());
+ ASSERT(gic_hw_ops->resume);
+
+ gic_hw_ops->resume();
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
static int cpu_gic_callback(struct notifier_block *nfb,
unsigned long action,
void *hcpu)
diff --git a/xen/arch/arm/include/asm/gic.h b/xen/arch/arm/include/asm/gic.h
index ff22dea40d..fbf0d69edd 100644
--- a/xen/arch/arm/include/asm/gic.h
+++ b/xen/arch/arm/include/asm/gic.h
@@ -301,6 +301,12 @@ extern int gicv_setup(struct domain *d);
extern void gic_save_state(struct vcpu *v);
extern void gic_restore_state(struct vcpu *v);
+#ifdef CONFIG_SYSTEM_SUSPEND
+/* Suspend/resume */
+extern int gic_suspend(void);
+extern void gic_resume(void);
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
/* SGI (AKA IPIs) */
enum gic_sgi {
GIC_SGI_EVENT_CHECK,
@@ -444,6 +450,12 @@ struct gic_hw_operations {
int (*iomem_deny_access)(struct domain *d);
/* Handle LPIs, which require special handling */
void (*do_LPI)(unsigned int lpi);
+#ifdef CONFIG_SYSTEM_SUSPEND
+ /* Save GIC configuration due to the system suspend */
+ int (*suspend)(void);
+ /* Restore GIC configuration due to the system resume */
+ void (*resume)(void);
+#endif /* CONFIG_SYSTEM_SUSPEND */
};
extern const struct gic_hw_operations *gic_hw_ops;
--
2.43.0
* Re: [PATCH v9 02/13] xen/arm: gic-v2: Implement GIC suspend/resume functions
From: Luca Fancellu @ 2026-05-13 14:08 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
Hi Mykola,
> +
> +static void gicv2_resume(void)
> +{
> + unsigned int i, blocks = DIV_ROUND_UP(gicv2_info.nr_lines, 32);
> +
> + gicv2_cpu_disable();
> + /* Disable distributor */
> + writel_gicd(0, GICD_CTLR);
> +
> + for ( i = 0; i < blocks; i++ )
> + {
> + struct irq_block *irqs = gic_ctx.dist.irqs + i;
> + size_t j, off = i * sizeof(irqs->isenabler);
> +
> + writel_gicd(GENMASK(31, 0), GICD_ICENABLER + off);
> + writel_gicd(irqs->isenabler, GICD_ISENABLER + off);
> +
> + writel_gicd(GENMASK(31, 0), GICD_ICACTIVER + off);
> + writel_gicd(irqs->isactiver, GICD_ISACTIVER + off);
> +
> + off = i * sizeof(irqs->ipriorityr);
> + for ( j = 0; j < ARRAY_SIZE(irqs->ipriorityr); j++ )
> + writel_gicd(irqs->ipriorityr[j], GICD_IPRIORITYR + off + j * 4);
Apologies for spotting this only now: in case gicv2_info.nr_lines is 1020,
here and below for GICD_ITARGETSR we would also write IDs 1020-1023,
which are reserved.
Could we assume irqs->ipriorityr and irqs->itargetsr have the same size
and implement some cap logic for the last loop, e.g.:
for ( i = 0; i < blocks; i++ )
{
struct irq_block *irqs = gic_ctx.dist.irqs + i;
size_t j, off = i * sizeof(irqs->isenabler);
size_t nr_regs = ARRAY_SIZE(irqs->ipriorityr);
if ( i == blocks - 1 )
nr_regs = DIV_ROUND_UP(gicv2_info.nr_lines - i * 32, 4);
[…]
off = i * sizeof(irqs->ipriorityr);
for ( j = 0; j < nr_regs; j++ )
writel_gicd(irqs->ipriorityr[j], GICD_IPRIORITYR + off + j * 4);
/*
* GICD_ITARGETSR0..7 cover SGIs/PPIs and hold no state to save:
* they are read-only on multiprocessor implementations and RAZ/WI
* on uniprocessor implementations.
*/
if ( i )
{
off = i * sizeof(irqs->itargetsr);
for ( j = 0; j < nr_regs; j++ )
writel_gicd(irqs->itargetsr[j], GICD_ITARGETSR + off + j * 4);
}
[…]
}
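As a sanity check of the cap arithmetic (a host-side sketch, not driver code; DIV_ROUND_UP as in Xen, and the 0x400 GICD_IPRIORITYR base is the architectural offset): with 1020 lines the last 32-IRQ block covers IDs 992-1023, but only 992-1019 exist, so the byte-per-IRQ loops must stop after 7 of the 8 words.

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Number of 4-IRQ registers to touch in the last 32-IRQ block. */
static unsigned int last_block_regs(unsigned int nr_lines)
{
    unsigned int blocks = DIV_ROUND_UP(nr_lines, 32);

    return DIV_ROUND_UP(nr_lines - (blocks - 1) * 32, 4);
}
```

For 1020 lines this yields 7, so the last word written is at GICD_IPRIORITYR offset 0x7F8, stopping short of the reserved word at 0x7FC.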
> +
> + /*
> + * GICD_ITARGETSR0..7 cover SGIs/PPIs and hold no state to save:
> + * they are read-only on multiprocessor implementations and RAZ/WI
> + * on uniprocessor implementations.
> + */
> + if ( i )
> + {
> + off = i * sizeof(irqs->itargetsr);
> + for ( j = 0; j < ARRAY_SIZE(irqs->itargetsr); j++ )
> + writel_gicd(irqs->itargetsr[j], GICD_ITARGETSR + off + j * 4);
> + }
> +
> + off = i * sizeof(irqs->icfgr);
> + for ( j = 0; j < ARRAY_SIZE(irqs->icfgr); j++ )
> + writel_gicd(irqs->icfgr[j], GICD_ICFGR + off + j * 4);
In the GICv2 spec, the usage constraints of GICD_ICFGR say: “Before
changing the value of a programmable Int_config field, software must
disable the corresponding interrupt, otherwise GIC behavior is
UNPREDICTABLE”
ARM IHI 0048B.b, 4.3.13.
I think we should move this restore just after GICD_ICENABLER write, before writing
GICD_ISENABLER.
The same section also says that GICD_ICFGR0 is read-only.
Let me know your thoughts on that.
> + }
> +
> + /* Make sure all registers are restored and enable distributor */
> + writel_gicd(gic_ctx.dist.ctlr, GICD_CTLR);
> +
> + /* Restore GIC CPU interface configuration */
> + writel_gicc(gic_ctx.cpu.pmr, GICC_PMR);
> + writel_gicc(gic_ctx.cpu.bpr, GICC_BPR);
> +
> + /* Enable GIC CPU interface */
> + writel_gicc(gic_ctx.cpu.ctlr, GICC_CTLR);
> +}
> +
> +static void __init gicv2_alloc_context(void)
> +{
> + uint32_t blocks = DIV_ROUND_UP(gicv2_info.nr_lines, 32);
> +
> + gic_ctx.dist.irqs = xzalloc_array(struct irq_block, blocks);
> + if ( !gic_ctx.dist.irqs )
> + panic("Failed to allocate memory for GICv2 suspend context\n");
> +}
> +
> +#endif /* CONFIG_SYSTEM_SUSPEND */
> +
> #ifdef CONFIG_ACPI
> static unsigned long gicv2_get_hwdom_extra_madt_size(const struct domain *d)
> {
> @@ -1312,6 +1484,11 @@ static int __init gicv2_init(void)
>
> spin_unlock(&gicv2.lock);
>
> +#ifdef CONFIG_SYSTEM_SUSPEND
> + /* Allocate memory to be used for saving GIC context during the suspend */
> + gicv2_alloc_context();
> +#endif /* CONFIG_SYSTEM_SUSPEND */
> +
> return 0;
> }
>
> @@ -1355,6 +1532,10 @@ static const struct gic_hw_operations gicv2_ops = {
> .map_hwdom_extra_mappings = gicv2_map_hwdom_extra_mappings,
> .iomem_deny_access = gicv2_iomem_deny_access,
> .do_LPI = gicv2_do_LPI,
> +#ifdef CONFIG_SYSTEM_SUSPEND
> + .suspend = gicv2_suspend,
> + .resume = gicv2_resume,
> +#endif /* CONFIG_SYSTEM_SUSPEND */
> };
>
I don’t have any other findings for this patch.
Cheers,
Luca
* Re: [PATCH v9 02/13] xen/arm: gic-v2: Implement GIC suspend/resume functions
From: Mykola Kvach @ 2026-05-15 7:59 UTC (permalink / raw)
To: Luca Fancellu
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
Hi Luca,
Thank you for the detailed review.
On Wed, May 13, 2026 at 5:09 PM Luca Fancellu <Luca.Fancellu@arm.com> wrote:
>
> Hi Mykola,
>
> > +
> > +static void gicv2_resume(void)
> > +{
> > + unsigned int i, blocks = DIV_ROUND_UP(gicv2_info.nr_lines, 32);
> > +
> > + gicv2_cpu_disable();
> > + /* Disable distributor */
> > + writel_gicd(0, GICD_CTLR);
> > +
> > + for ( i = 0; i < blocks; i++ )
> > + {
> > + struct irq_block *irqs = gic_ctx.dist.irqs + i;
> > + size_t j, off = i * sizeof(irqs->isenabler);
> > +
> > + writel_gicd(GENMASK(31, 0), GICD_ICENABLER + off);
> > + writel_gicd(irqs->isenabler, GICD_ISENABLER + off);
> > +
> > + writel_gicd(GENMASK(31, 0), GICD_ICACTIVER + off);
> > + writel_gicd(irqs->isactiver, GICD_ISACTIVER + off);
> > +
> > + off = i * sizeof(irqs->ipriorityr);
> > + for ( j = 0; j < ARRAY_SIZE(irqs->ipriorityr); j++ )
> > + writel_gicd(irqs->ipriorityr[j], GICD_IPRIORITYR + off + j * 4);
>
> apologies for spotting these only now, in case gicv2_info.nr_lines is 1020,
> here and below for GICD_ITARGETSR we are going to save also IDs 1020-1023
> which are reserved.
>
> Could we assume irqs->ipriorityr and irqs->itargetsr have the same size and implement
> some cap logic which might cap the last loop (eventually):
>
> for ( i = 0; i < blocks; i++ )
> {
> struct irq_block *irqs = gic_ctx.dist.irqs + i;
> size_t j, off = i * sizeof(irqs->isenabler);
> size_t nr_regs = ARRAY_SIZE(irqs->ipriorityr);
>
> if ( i == blocks - 1 )
> nr_regs = DIV_ROUND_UP(gicv2_info.nr_lines - i * 32, 4);
>
> […]
>
> off = i * sizeof(irqs->ipriorityr);
> for ( j = 0; j < nr_regs; j++ )
> writel_gicd(irqs->ipriorityr[j], GICD_IPRIORITYR + off + j * 4);
>
> /*
> * GICD_ITARGETSR0..7 cover SGIs/PPIs and hold no state to save:
> * they are read-only on multiprocessor implementations and RAZ/WI
> * on uniprocessor implementations.
> */
> if ( i )
> {
> off = i * sizeof(irqs->itargetsr);
> for ( j = 0; j < nr_regs; j++ )
> writel_gicd(irqs->itargetsr[j], GICD_ITARGETSR + off + j * 4);
> }
>
> […]
> }
This was intentional to keep the logic simpler.
For the 1020-interrupt case, the extra word would correspond to
interrupt IDs 1020-1023. My reading of ARM IHI 0048B.b is that this
is architecturally harmless: section 4.1.2 says that reserved
Distributor register addresses are RAZ/WI, and Table 4-1 marks
GICD_IPRIORITYR offset 0x7fc and GICD_ITARGETSR offset 0xbfc as
Reserved.
Would you be OK with keeping this as-is, or would you prefer me to add
the cap logic anyway?
>
> > +
> > + /*
> > + * GICD_ITARGETSR0..7 cover SGIs/PPIs and hold no state to save:
> > + * they are read-only on multiprocessor implementations and RAZ/WI
> > + * on uniprocessor implementations.
> > + */
> > + if ( i )
> > + {
> > + off = i * sizeof(irqs->itargetsr);
> > + for ( j = 0; j < ARRAY_SIZE(irqs->itargetsr); j++ )
> > + writel_gicd(irqs->itargetsr[j], GICD_ITARGETSR + off + j * 4);
> > + }
> > +
> > + off = i * sizeof(irqs->icfgr);
> > + for ( j = 0; j < ARRAY_SIZE(irqs->icfgr); j++ )
> > + writel_gicd(irqs->icfgr[j], GICD_ICFGR + off + j * 4);
>
> in the GICv2 specs the usage constraints
> of GICD_ICFGR says: “Before changing the value of a programmable Int_config field,
> software must disable the corresponding interrupt, otherwise GIC behavior is
> UNPREDICTABLE"
>
> ARM IHI 0048B.b, 4.3.13.
>
> I think we should move this restore just after GICD_ICENABLER write, before writing
> GICD_ISENABLER.
Good catch, I agree.
I will move the GICD_ICFGR restore after the GICD_ICENABLER writes
and before restoring GICD_ISENABLER, so that programmable Int_config
fields are restored while the corresponding interrupts are disabled.
I will also check whether it makes sense to move the other
configuration restores before GICD_ISENABLER as well. The spec does
not seem to impose the same strict requirement there, but keeping all
configuration restores before re-enabling the interrupts might make
the ordering clearer.
>
> And also the section says that the GICD_ICFGR0 is read-only.
For GICD_ICFGR0, my intention was to keep the restore loop uniform.
There should be no useful SGI state to restore here: section 4.3.13
says that SGI Int_config[1] is not programmable and RAO/WI, while
Int_config[0] is reserved. Also, the value written is the value
previously read from the same register.
So I do not expect this write to affect the architected SGI
configuration. However, if you prefer avoiding the write to
GICD_ICFGR0 explicitly, I will skip it in the next version of
this series.
Best regards,
Mykola
* Re: [PATCH v9 02/13] xen/arm: gic-v2: Implement GIC suspend/resume functions
From: Luca Fancellu @ 2026-05-15 9:52 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
Hi Mykola,
> On 15 May 2026, at 08:59, Mykola Kvach <xakep.amatop@gmail.com> wrote:
>
> Hi Luca,
>
> Thank you for the detailed review.
>
> On Wed, May 13, 2026 at 5:09 PM Luca Fancellu <Luca.Fancellu@arm.com> wrote:
>>
>> Hi Mykola,
>>
>>> +
>>> +static void gicv2_resume(void)
>>> +{
>>> + unsigned int i, blocks = DIV_ROUND_UP(gicv2_info.nr_lines, 32);
>>> +
>>> + gicv2_cpu_disable();
>>> + /* Disable distributor */
>>> + writel_gicd(0, GICD_CTLR);
>>> +
>>> + for ( i = 0; i < blocks; i++ )
>>> + {
>>> + struct irq_block *irqs = gic_ctx.dist.irqs + i;
>>> + size_t j, off = i * sizeof(irqs->isenabler);
>>> +
>>> + writel_gicd(GENMASK(31, 0), GICD_ICENABLER + off);
>>> + writel_gicd(irqs->isenabler, GICD_ISENABLER + off);
>>> +
>>> + writel_gicd(GENMASK(31, 0), GICD_ICACTIVER + off);
>>> + writel_gicd(irqs->isactiver, GICD_ISACTIVER + off);
>>> +
>>> + off = i * sizeof(irqs->ipriorityr);
>>> + for ( j = 0; j < ARRAY_SIZE(irqs->ipriorityr); j++ )
>>> + writel_gicd(irqs->ipriorityr[j], GICD_IPRIORITYR + off + j * 4);
>>
>> apologies for spotting these only now, in case gicv2_info.nr_lines is 1020,
>> here and below for GICD_ITARGETSR we are going to save also IDs 1020-1023
>> which are reserved.
>>
>> Could we assume irqs->ipriorityr and irqs->itargetsr have the same size and implement
>> some cap logic which might cap the last loop (eventually):
>>
>> for ( i = 0; i < blocks; i++ )
>> {
>> struct irq_block *irqs = gic_ctx.dist.irqs + i;
>> size_t j, off = i * sizeof(irqs->isenabler);
>> size_t nr_regs = ARRAY_SIZE(irqs->ipriorityr);
>>
>> if ( i == blocks - 1 )
>> nr_regs = DIV_ROUND_UP(gicv2_info.nr_lines - i * 32, 4);
>>
>> […]
>>
>> off = i * sizeof(irqs->ipriorityr);
>> for ( j = 0; j < nr_regs; j++ )
>> writel_gicd(irqs->ipriorityr[j], GICD_IPRIORITYR + off + j * 4);
>>
>> /*
>> * GICD_ITARGETSR0..7 cover SGIs/PPIs and hold no state to save:
>> * they are read-only on multiprocessor implementations and RAZ/WI
>> * on uniprocessor implementations.
>> */
>> if ( i )
>> {
>> off = i * sizeof(irqs->itargetsr);
>> for ( j = 0; j < nr_regs; j++ )
>> writel_gicd(irqs->itargetsr[j], GICD_ITARGETSR + off + j * 4);
>> }
>>
>> […]
>> }
>
> This was intentional to keep the logic simpler.
>
> For the 1020-interrupt case, the extra word would correspond to
> interrupt IDs 1020-1023. My reading of ARM IHI 0048B.b is that this
> is architecturally harmless: section 4.1.2 says that reserved
> Distributor register addresses are RAZ/WI, and Table 4-1 marks
> GICD_IPRIORITYR offset 0x7fc and GICD_ITARGETSR offset 0xbfc as
> Reserved.
>
> Would you be OK with keeping this as-is, or would you prefer me to add
> the cap logic anyway?
I think this would be the only part in the driver that does that; Linux
also avoids touching these reserved parts
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/irqchip/irq-gic.c?h=v7.1-rc3#n582).
My preference would be to stay consistent.
>
>>
>>> +
>>> + /*
>>> + * GICD_ITARGETSR0..7 cover SGIs/PPIs and hold no state to save:
>>> + * they are read-only on multiprocessor implementations and RAZ/WI
>>> + * on uniprocessor implementations.
>>> + */
>>> + if ( i )
>>> + {
>>> + off = i * sizeof(irqs->itargetsr);
>>> + for ( j = 0; j < ARRAY_SIZE(irqs->itargetsr); j++ )
>>> + writel_gicd(irqs->itargetsr[j], GICD_ITARGETSR + off + j * 4);
>>> + }
>>> +
>>> + off = i * sizeof(irqs->icfgr);
>>> + for ( j = 0; j < ARRAY_SIZE(irqs->icfgr); j++ )
>>> + writel_gicd(irqs->icfgr[j], GICD_ICFGR + off + j * 4);
>>
>> in the GICv2 specs the usage constraints
>> of GICD_ICFGR says: “Before changing the value of a programmable Int_config field,
>> software must disable the corresponding interrupt, otherwise GIC behavior is
>> UNPREDICTABLE"
>>
>> ARM IHI 0048B.b, 4.3.13.
>>
>> I think we should move this restore just after GICD_ICENABLER write, before writing
>> GICD_ISENABLER.
>
> Good catch, I agree.
>
> I will move the GICD_ICFGR restore after the GICD_ICENABLER writes
> and before restoring GICD_ISENABLER, so that programmable Int_config
> fields are restored while the corresponding interrupts are disabled.
>
> I will also check whether it makes sense to move the other
> configuration restores before GICD_ISENABLER as well. The spec does
> not seem to impose the same strict requirement there, but keeping all
> configuration restores before re-enabling the interrupts might make
> the ordering clearer.
>
>>
>> And also the section says that the GICD_ICFGR0 is read-only.
>
> For GICD_ICFGR0, my intention was to keep the restore loop uniform.
> There should be no useful SGI state to restore here: section 4.3.13
> says that SGI Int_config[1] is not programmable and RAO/WI, while
> Int_config[0] is reserved. Also, the value written is the value
> previously read from the same register.
>
> So I do not expect this write to affect the architected SGI
> configuration. However, if you prefer avoiding the write to
> GICD_ICFGR0 explicitly, I will skip it in the next version of
> this series.
I think Linux also “restores” it, so I’m OK with keeping the code as it
is; in fact it’s read-only and not marked as reserved, my bad!
Cheers,
Luca
* [PATCH v9 03/13] xen/arm: gic-v3: tolerate retained redistributor LPI state across CPU_OFF
From: Mykola Kvach @ 2026-05-12 17:07 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk
From: Mykola Kvach <mykola_kvach@epam.com>
PSCI does not guarantee that a GICv3 redistributor is powered down across
CPU_OFF -> CPU_ON.
DEN0022F.b says CPU_OFF powers down the calling core (5.5) and CPU_ON
brings the core back with a defined initial CPU state (5.6, 6.4).
However, PSCI leaves interrupt migration and GIC re-initialization to the
supervisory software/firmware stack: the caller must migrate interrupts
away before CPU_OFF (5.5.2), and the execution context that is lost in a
powerdown state must be saved and restored by software (6.8). PSCI also
calls out GIC management explicitly in 6.8, including retargeting SPIs,
preventing PPIs/SGIs from targeting a powered down CPU, and reinitializing
the CPU interface after CPU_ON.
This matches the GIC architecture. IHI0069H.b Chapter 11.1 requires the PE
and CPU interface to share a power domain, but explicitly allows the
associated redistributor, distributor, and ITS to remain powered while the
PE and CPU interface are off. All other GIC power-management behavior is
IMPLEMENTATION DEFINED. DEN0050D Chapter 4.2, "Generic Interrupt
Controller (GIC)", says the GICv3 redistributor may live either in the AP
core power domain or in a relatively always-on parent domain. So after
CPU_OFF -> CPU_ON a secondary CPU can legitimately come back to a live
redistributor with GICR_CTLR.EnableLPIs still set.
Handle that case in the LPI setup path instead of assuming a fully reset
redistributor.
The LPI path needs special care because the GIC spec makes redistributor
LPI state sticky and partially implementation defined. IHI0069H.b 5.1.1
and 5.1.2 say that changing GICR_PROPBASER or GICR_PENDBASER while
GICR_CTLR.EnableLPIs == 1 is UNPREDICTABLE. After clearing EnableLPIs,
software must wait for GICR_CTLR.RWP == 0 before touching the pending
table. The architecture also permits implementations where, once
EnableLPIs has been set, clearing it again is not guaranteed to work.
Where an ITS is present, the spec strongly recommends moving LPIs to
another redistributor before clearing EnableLPIs.
Because of that, treat a retained EnableLPIs state as valid when the
redistributor still points at Xen's expected PROPBASER/PENDBASER tables.
Only try to clear EnableLPIs when the retained configuration does not
match Xen's state, and wait for RWP before reprogramming the tables.
This is also consistent with platform firmware reality: PSCI and the GIC
architecture allow platform-specific redistributor power handling, and not
all platform firmware implementations force a full redistributor power-off
through implementation-defined controls during CPU_OFF. Xen therefore needs
to tolerate retained redistributor state on secondary CPU bring-up.
Keep gicv3_populate_rdist() resident as well, because gicv3_cpu_init()
reuses it on secondary CPU bring-up after init.
Tested using Xen's non-boot CPU disable/enable path on Arm
FVP_Base_RevC-2xAEMvA, both with and without:
-C gic_distributor.allow-LPIEN-clear=1
-C gic_distributor.GICR-clear-enable-supported=1
and on Orange Pi 5.
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in v9:
- move gicv3_do_wait_for_rwp prototype from its related header to gic.h
- drop __init from gicv3_populate_rdist(), which is reused on secondary
CPU bring-up after boot
- changed print format for smp_processor_id in gicv3_populate_rdist func
- cosmetic changes
---
xen/arch/arm/gic-v3-lpi.c | 77 +++++++++++++++++++++++++++++++++-
xen/arch/arm/gic-v3.c | 23 ++++++----
xen/arch/arm/include/asm/gic.h | 4 ++
3 files changed, 94 insertions(+), 10 deletions(-)
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index 9ee338edc2..36bb45738e 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -81,6 +81,13 @@ static DEFINE_PER_CPU(struct lpi_redist_data, lpi_redist);
#define MAX_NR_HOST_LPIS (lpi_data.max_host_lpi_ids - LPI_OFFSET)
#define HOST_LPIS_PER_PAGE (PAGE_SIZE / sizeof(union host_lpi))
+#define GICR_PROPBASER_XEN_MASK GENMASK_ULL(51, 12)
+/*
+ * For retained redistributor state, match the pending table by address only.
+ * Attribute bits such as PTZ may not read back with the programmed value.
+ */
+#define GICR_PENDBASER_XEN_MASK GENMASK_ULL(51, 16)
+
static union host_lpi *gic_get_host_lpi(uint32_t plpi)
{
union host_lpi *block;
@@ -296,6 +303,60 @@ static int gicv3_lpi_set_pendtable(void __iomem *rdist_base)
return 0;
}
+static uint64_t gicv3_lpi_expected_proptable(void)
+{
+ return virt_to_maddr(lpi_data.lpi_property);
+}
+
+static uint64_t gicv3_lpi_expected_pendtable(void)
+{
+ return virt_to_maddr(this_cpu(lpi_redist).pending_table);
+}
+
+static bool gicv3_lpi_tables_match(void __iomem *rdist_base)
+{
+ uint64_t propbase, pendbase;
+
+ if ( !lpi_data.lpi_property || !this_cpu(lpi_redist).pending_table )
+ return false;
+
+ propbase = readq_relaxed(rdist_base + GICR_PROPBASER);
+ pendbase = readq_relaxed(rdist_base + GICR_PENDBASER);
+
+ return ((propbase & GICR_PROPBASER_XEN_MASK) ==
+ (gicv3_lpi_expected_proptable() & GICR_PROPBASER_XEN_MASK)) &&
+ ((pendbase & GICR_PENDBASER_XEN_MASK) ==
+ (gicv3_lpi_expected_pendtable() & GICR_PENDBASER_XEN_MASK));
+}
+
+static int gicv3_lpi_disable_lpis(void __iomem *rdist_base)
+{
+ uint32_t reg = readl_relaxed(rdist_base + GICR_CTLR);
+ int ret;
+
+ if ( !(reg & GICR_CTLR_ENABLE_LPIS) )
+ return 0;
+
+ writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, rdist_base + GICR_CTLR);
+
+ /*
+ * The spec only guarantees programmability when we have observed the bit
+ * cleared. Where clearing is supported, RWP must reach 0 before touching
+ * PROPBASER/PENDBASER again.
+ */
+ wmb();
+
+ ret = gicv3_do_wait_for_rwp(rdist_base);
+ if ( ret )
+ return ret;
+
+ reg = readl_relaxed(rdist_base + GICR_CTLR);
+ if ( reg & GICR_CTLR_ENABLE_LPIS )
+ return -EBUSY;
+
+ return 0;
+}
+
/*
* Tell a redistributor about the (shared) property table, allocating one
* if not already done.
@@ -374,7 +435,21 @@ int gicv3_lpi_init_rdist(void __iomem * rdist_base)
/* Make sure LPIs are disabled before setting up the tables. */
reg = readl_relaxed(rdist_base + GICR_CTLR);
if ( reg & GICR_CTLR_ENABLE_LPIS )
- return -EBUSY;
+ {
+ if ( gicv3_lpi_tables_match(rdist_base) )
+ return -EBUSY;
+
+ ret = gicv3_lpi_disable_lpis(rdist_base);
+ if ( ret == -EBUSY )
+ {
+ printk(XENLOG_ERR
+ "GICv3: CPU%u: LPIs still enabled with unexpected redistributor tables\n",
+ smp_processor_id());
+ return -EINVAL;
+ }
+ if ( ret )
+ return ret;
+ }
ret = gicv3_lpi_set_pendtable(rdist_base);
if ( ret )
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 7f365cdbe9..cc0569a157 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -274,8 +274,8 @@ static void gicv3_enable_sre(void)
isb();
}
-/* Wait for completion of a distributor change */
-static void gicv3_do_wait_for_rwp(void __iomem *base)
+/* Wait for completion of a distributor/redistributor write-pending change. */
+int gicv3_do_wait_for_rwp(void __iomem *base)
{
uint32_t val;
bool timeout = false;
@@ -295,17 +295,22 @@ static void gicv3_do_wait_for_rwp(void __iomem *base)
} while ( 1 );
if ( timeout )
+ {
dprintk(XENLOG_ERR, "RWP timeout\n");
+ return -ETIMEDOUT;
+ }
+
+ return 0;
}
static void gicv3_dist_wait_for_rwp(void)
{
- gicv3_do_wait_for_rwp(GICD);
+ (void)gicv3_do_wait_for_rwp(GICD);
}
static void gicv3_redist_wait_for_rwp(void)
{
- gicv3_do_wait_for_rwp(GICD_RDIST_BASE);
+ (void)gicv3_do_wait_for_rwp(GICD_RDIST_BASE);
}
static void gicv3_wait_for_rwp(int irq)
@@ -857,7 +862,7 @@ static bool gicv3_enable_lpis(void)
return true;
}
-static int __init gicv3_populate_rdist(void)
+static int gicv3_populate_rdist(void)
{
int i;
uint32_t aff;
@@ -925,15 +930,15 @@ static int __init gicv3_populate_rdist(void)
gicv3_set_redist_address(rdist_addr, procnum);
ret = gicv3_lpi_init_rdist(ptr);
- if ( ret && ret != -ENODEV )
+ if ( ret && ret != -ENODEV && ret != -EBUSY )
{
- printk("GICv3: CPU%d: Cannot initialize LPIs: %u\n",
+ printk("GICv3: CPU%u: Cannot initialize LPIs: %d\n",
smp_processor_id(), ret);
break;
}
}
- printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
+ printk("GICv3: CPU%u: Found redistributor in region %d @%p\n",
smp_processor_id(), i, ptr);
return 0;
}
@@ -953,7 +958,7 @@ static int __init gicv3_populate_rdist(void)
} while ( !(typer & GICR_TYPER_LAST) );
}
- dprintk(XENLOG_ERR, "GICv3: CPU%d: mpidr 0x%"PRIregister" has no re-distributor!\n",
+ dprintk(XENLOG_ERR, "GICv3: CPU%u: mpidr 0x%"PRIregister" has no re-distributor!\n",
smp_processor_id(), cpu_logical_map(smp_processor_id()));
return -ENODEV;
diff --git a/xen/arch/arm/include/asm/gic.h b/xen/arch/arm/include/asm/gic.h
index fbf0d69edd..9293aa56e2 100644
--- a/xen/arch/arm/include/asm/gic.h
+++ b/xen/arch/arm/include/asm/gic.h
@@ -301,6 +301,10 @@ extern int gicv_setup(struct domain *d);
extern void gic_save_state(struct vcpu *v);
extern void gic_restore_state(struct vcpu *v);
+#ifdef CONFIG_GICV3
+int gicv3_do_wait_for_rwp(void __iomem *base);
+#endif
+
#ifdef CONFIG_SYSTEM_SUSPEND
/* Suspend/resume */
extern int gic_suspend(void);
--
2.43.0
* Re: [PATCH v9 03/13] xen/arm: gic-v3: tolerate retained redistributor LPI state across CPU_OFF
2026-05-12 17:07 ` [PATCH v9 03/13] xen/arm: gic-v3: tolerate retained redistributor LPI state across CPU_OFF Mykola Kvach
@ 2026-05-13 14:51 ` Luca Fancellu
2026-05-14 8:41 ` Mykola Kvach
0 siblings, 1 reply; 29+ messages in thread
From: Luca Fancellu @ 2026-05-13 14:51 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
Hi Mykola,
> On 12 May 2026, at 18:07, Mykola Kvach <xakep.amatop@gmail.com> wrote:
>
> From: Mykola Kvach <mykola_kvach@epam.com>
>
> PSCI does not guarantee that a GICv3 redistributor is powered down across
> CPU_OFF -> CPU_ON.
>
> DEN0022F.b says CPU_OFF powers down the calling core (5.5) and CPU_ON
> brings the core back with a defined initial CPU state (5.6, 6.4).
> However, PSCI leaves interrupt migration and GIC re-initialization to the
> supervisory software/firmware stack: the caller must migrate interrupts
> away before CPU_OFF (5.5.2), and the execution context that is lost in a
> powerdown state must be saved and restored by software (6.8). PSCI also
> calls out GIC management explicitly in 6.8, including retargeting SPIs,
> preventing PPIs/SGIs from targeting a powered down CPU, and reinitializing
> the CPU interface after CPU_ON.
>
> This matches the GIC architecture. IHI0069H.b Chapter 11.1 requires the PE
> and CPU interface to share a power domain, but explicitly allows the
> associated redistributor, distributor, and ITS to remain powered while the
> PE and CPU interface are off. All other GIC power-management behavior is
> IMPLEMENTATION DEFINED. DEN0050D Chapter 4.2, "Generic Interrupt
> Controller (GIC)", says the GICv3 redistributor may live either in the AP
> core power domain or in a relatively always-on parent domain. So after
> CPU_OFF -> CPU_ON a secondary CPU can legitimately come back to a live
> redistributor with GICR_CTLR.EnableLPIs still set.
>
> Handle that case in the LPI setup path instead of assuming a fully reset
> redistributor.
>
> The LPI path needs special care because the GIC spec makes redistributor
> LPI state sticky and partially implementation defined. IHI0069H.b 5.1.1
> and 5.1.2 say that changing GICR_PROPBASER or GICR_PENDBASER while
> GICR_CTLR.EnableLPIs == 1 is UNPREDICTABLE. After clearing EnableLPIs,
> software must wait for GICR_CTLR.RWP == 0 before touching the pending
> table. The architecture also permits implementations where, once
> EnableLPIs has been set, clearing it again is not guaranteed to work.
> Where an ITS is present, the spec strongly recommends moving LPIs to
> another redistributor before clearing EnableLPIs.
>
> Because of that, treat a retained EnableLPIs state as valid when the
> redistributor still points at Xen's expected PROPBASER/PENDBASER tables.
> Only try to clear EnableLPIs when the retained configuration does not
> match Xen's state, and wait for RWP before reprogramming the tables.
>
> This is also consistent with platform firmware reality: PSCI and the GIC
> architecture allow platform-specific redistributor power handling, and not
> all platform firmware implementations force a full redistributor power-off
> through implementation-defined controls during CPU_OFF. Xen therefore needs
> to tolerate retained redistributor state on secondary CPU bring-up.
>
> Keep gicv3_populate_rdist() resident as well, because gicv3_cpu_init()
> reuses it on secondary CPU bring-up after init.
>
> Tested using Xen's non-boot CPU disable/enable path on Arm
> FVP_Base_RevC-2xAEMvA, both with and without:
> -C gic_distributor.allow-LPIEN-clear=1
> -C gic_distributor.GICR-clear-enable-supported=1
> and on Orange Pi 5.
>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
I understand you will send a separate patch to fix RWP.
Looks ok to me, I see you’ve touched some printk in gicv3_populate_rdist() which
were wrongly having %d for smp_processor_id(), sometimes maintainers are not
really ok with that when the changes are not related to the commit, but apart from that:
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Cheers,
Luca
* Re: [PATCH v9 03/13] xen/arm: gic-v3: tolerate retained redistributor LPI state across CPU_OFF
2026-05-13 14:51 ` Luca Fancellu
@ 2026-05-14 8:41 ` Mykola Kvach
0 siblings, 0 replies; 29+ messages in thread
From: Mykola Kvach @ 2026-05-14 8:41 UTC (permalink / raw)
To: Luca Fancellu
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
Hi Luca,
Thank you for the review.
On Wed, May 13, 2026 at 5:53 PM Luca Fancellu <Luca.Fancellu@arm.com> wrote:
>
> Hi Mykola,
>
> > On 12 May 2026, at 18:07, Mykola Kvach <xakep.amatop@gmail.com> wrote:
> >
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > PSCI does not guarantee that a GICv3 redistributor is powered down across
> > CPU_OFF -> CPU_ON.
> >
> > DEN0022F.b says CPU_OFF powers down the calling core (5.5) and CPU_ON
> > brings the core back with a defined initial CPU state (5.6, 6.4).
> > However, PSCI leaves interrupt migration and GIC re-initialization to the
> > supervisory software/firmware stack: the caller must migrate interrupts
> > away before CPU_OFF (5.5.2), and the execution context that is lost in a
> > powerdown state must be saved and restored by software (6.8). PSCI also
> > calls out GIC management explicitly in 6.8, including retargeting SPIs,
> > preventing PPIs/SGIs from targeting a powered down CPU, and reinitializing
> > the CPU interface after CPU_ON.
> >
> > This matches the GIC architecture. IHI0069H.b Chapter 11.1 requires the PE
> > and CPU interface to share a power domain, but explicitly allows the
> > associated redistributor, distributor, and ITS to remain powered while the
> > PE and CPU interface are off. All other GIC power-management behavior is
> > IMPLEMENTATION DEFINED. DEN0050D Chapter 4.2, "Generic Interrupt
> > Controller (GIC)", says the GICv3 redistributor may live either in the AP
> > core power domain or in a relatively always-on parent domain. So after
> > CPU_OFF -> CPU_ON a secondary CPU can legitimately come back to a live
> > redistributor with GICR_CTLR.EnableLPIs still set.
> >
> > Handle that case in the LPI setup path instead of assuming a fully reset
> > redistributor.
> >
> > The LPI path needs special care because the GIC spec makes redistributor
> > LPI state sticky and partially implementation defined. IHI0069H.b 5.1.1
> > and 5.1.2 say that changing GICR_PROPBASER or GICR_PENDBASER while
> > GICR_CTLR.EnableLPIs == 1 is UNPREDICTABLE. After clearing EnableLPIs,
> > software must wait for GICR_CTLR.RWP == 0 before touching the pending
> > table. The architecture also permits implementations where, once
> > EnableLPIs has been set, clearing it again is not guaranteed to work.
> > Where an ITS is present, the spec strongly recommends moving LPIs to
> > another redistributor before clearing EnableLPIs.
> >
> > Because of that, treat a retained EnableLPIs state as valid when the
> > redistributor still points at Xen's expected PROPBASER/PENDBASER tables.
> > Only try to clear EnableLPIs when the retained configuration does not
> > match Xen's state, and wait for RWP before reprogramming the tables.
> >
> > This is also consistent with platform firmware reality: PSCI and the GIC
> > architecture allow platform-specific redistributor power handling, and not
> > all platform firmware implementations force a full redistributor power-off
> > through implementation-defined controls during CPU_OFF. Xen therefore needs
> > to tolerate retained redistributor state on secondary CPU bring-up.
> >
> > Keep gicv3_populate_rdist() resident as well, because gicv3_cpu_init()
> > reuses it on secondary CPU bring-up after init.
> >
> > Tested using Xen's non-boot CPU disable/enable path on Arm
> > FVP_Base_RevC-2xAEMvA, both with and without:
> > -C gic_distributor.allow-LPIEN-clear=1
> > -C gic_distributor.GICR-clear-enable-supported=1
> > and on Orange Pi 5.
> >
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
>
> I understand you will send a separate patch to fix RWP.
>
> Looks ok to me, I see you’ve touched some printk in gicv3_populate_rdist() which
> were wrongly having %d for smp_processor_id(), sometimes maintainers are not
> really ok with that when the changes are not related to the commit, but apart from that:
That change was intentional.
In the previous version of this series, you pointed out the
wrong format specifier for smp_processor_id() in one of these
printk()s.
Although it was not directly related to the functional change in
this patch, I decided to fix the other similar occurrences in
gicv3_populate_rdist() as well, for consistency.
>
> Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
>
> Cheers,
> Luca
>
>
Best regards,
Mykola
* [PATCH v9 04/13] xen/arm: gic-v3: Implement GICv3 suspend/resume functions
2026-05-12 17:07 [PATCH v9 00/13] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
` (2 preceding siblings ...)
2026-05-12 17:07 ` [PATCH v9 03/13] xen/arm: gic-v3: tolerate retained redistributor LPI state across CPU_OFF Mykola Kvach
@ 2026-05-12 17:07 ` Mykola Kvach
2026-05-13 16:11 ` Luca Fancellu
2026-05-12 17:07 ` [PATCH v9 05/13] xen/arm: gic-v3: add ITS suspend/resume support Mykola Kvach
` (8 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Mykola Kvach @ 2026-05-12 17:07 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk
From: Mykola Kvach <mykola_kvach@epam.com>
System suspend may lead to a state where GIC would be powered down.
Therefore, Xen should save/restore the context of GIC on suspend/resume.
Note that the context consists of states of registers which are
controlled by the hypervisor. Other GIC registers which are accessible
by guests are saved/restored on context switch.
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in V9:
- fix the suspend-context comment typo and split dist_ctx declarations;
- restore ICC_IGRPEN1_EL1 on the suspend error path;
- re-initialize GICD_IGROUPRnE during resume;
- restore GICD_IROUTER only after re-enabling ARE_NS during resume.
Changes in V8:
- use right rdist base for prop/pend baser and ctrl
Changes in V7:
- restore LPI regs on resume
- add timeout during redist disabling
- squash with suspend/resume handling for GICv3 eSPI registers
- drop ITS guard paths so suspend/resume always runs; switch missing ctx
allocation to panic
- trim TODO comments; narrow redistributor storage to PPI icfgr
- keep distributor context allocation even without ITS; adjust resume
to use GENMASK(31, 0) for clearing enables
- drop storage of the SGI configuration register, as SGIs are always
edge-triggered
---
xen/arch/arm/gic-v3-lpi.c | 3 +
xen/arch/arm/gic-v3.c | 362 ++++++++++++++++++++++++-
xen/arch/arm/include/asm/gic_v3_defs.h | 1 +
3 files changed, 363 insertions(+), 3 deletions(-)
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index 36bb45738e..b0289ee4dc 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -467,6 +467,9 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
switch ( action )
{
case CPU_UP_PREPARE:
+ if ( system_state == SYS_STATE_resume )
+ break;
+
rc = gicv3_lpi_allocate_pendtable(cpu);
if ( rc )
printk(XENLOG_ERR "Unable to allocate the pendtable for CPU%lu\n",
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index cc0569a157..6fc4354e2e 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1072,12 +1072,12 @@ out:
return res;
}
-static void gicv3_hyp_disable(void)
+static void gicv3_hyp_enable(bool enable)
{
register_t hcr;
hcr = READ_SYSREG(ICH_HCR_EL2);
- hcr &= ~GICH_HCR_EN;
+ hcr = enable ? (hcr | GICH_HCR_EN) : (hcr & ~GICH_HCR_EN);
WRITE_SYSREG(hcr, ICH_HCR_EL2);
isb();
}
@@ -1184,7 +1184,7 @@ static void gicv3_disable_interface(void)
spin_lock(&gicv3.lock);
gicv3_cpu_disable();
- gicv3_hyp_disable();
+ gicv3_hyp_enable(false);
spin_unlock(&gicv3.lock);
}
@@ -1920,6 +1920,354 @@ static bool gic_dist_supports_lpis(void)
return (readl_relaxed(GICD + GICD_TYPER) & GICD_TYPE_LPIS);
}
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+/* This struct represents a block of 32 IRQs */
+struct dist_irq_block {
+ uint32_t icfgr[2];
+ uint32_t ipriorityr[8];
+ uint64_t irouter[32];
+ uint32_t isactiver;
+ uint32_t isenabler;
+};
+
+struct redist_ctx {
+ uint32_t ctlr;
+ uint32_t icfgr; /* only PPIs stored */
+ uint32_t igroupr;
+ uint32_t ipriorityr[8];
+ uint32_t isactiver;
+ uint32_t isenabler;
+
+ uint64_t pendbase;
+ uint64_t propbase;
+};
+
+/* GICv3 registers to be saved/restored on system suspend/resume */
+struct gicv3_ctx {
+ struct dist_ctx {
+ uint32_t ctlr;
+ struct dist_irq_block *irqs;
+ struct dist_irq_block *espi_irqs;
+ } dist;
+
+ /* have only one rdist structure for last running CPU during suspend */
+ struct redist_ctx rdist;
+
+ struct cpu_ctx {
+ uint32_t ctlr;
+ uint32_t pmr;
+ uint32_t bpr;
+ uint32_t sre_el2;
+ uint32_t grpen;
+ } cpu;
+};
+
+static struct gicv3_ctx gicv3_ctx;
+
+static void __init gicv3_alloc_context(void)
+{
+ uint32_t blocks = DIV_ROUND_UP(gicv3_info.nr_lines, 32);
+
+ /* The spec allows for systems without any SPIs */
+ if ( blocks > 1 )
+ {
+ gicv3_ctx.dist.irqs = xzalloc_array(struct dist_irq_block, blocks - 1);
+ if ( !gicv3_ctx.dist.irqs )
+ panic("Failed to allocate memory for GICv3 suspend context\n");
+ }
+
+#ifdef CONFIG_GICV3_ESPI
+ if ( !gic_number_espis() )
+ return;
+
+ blocks = gic_number_espis() / 32;
+ gicv3_ctx.dist.espi_irqs = xzalloc_array(struct dist_irq_block, blocks);
+ if ( !gicv3_ctx.dist.espi_irqs )
+ panic("Failed to allocate memory for GICv3 eSPI suspend context\n");
+#endif
+}
+
+static int gicv3_disable_redist(void)
+{
+ void __iomem *waker = GICD_RDIST_BASE + GICR_WAKER;
+ s_time_t deadline;
+
+ /*
+ * Avoid infinite loop if Non-secure does not have access to GICR_WAKER.
+ * See Arm IHI 0069H.b, 12.11.42 GICR_WAKER:
+ * When GICD_CTLR.DS == 0 and an access is Non-secure accesses to this
+ * register are RAZ/WI.
+ */
+ if ( !(readl_relaxed(GICD + GICD_CTLR) & GICD_CTLR_DS) )
+ return 0;
+
+ deadline = NOW() + MILLISECS(1000);
+
+ writel_relaxed(readl_relaxed(waker) | GICR_WAKER_ProcessorSleep, waker);
+ while ( (readl_relaxed(waker) & GICR_WAKER_ChildrenAsleep) == 0 )
+ {
+ if ( NOW() > deadline )
+ {
+ printk("GICv3: Timeout waiting for redistributor to sleep\n");
+ return -ETIMEDOUT;
+ }
+ cpu_relax();
+ udelay(10);
+ }
+
+ return 0;
+}
+
+#define GET_SPI_REG_OFFSET(name, is_espi) \
+ ((is_espi) ? GICD_##name##nE : GICD_##name)
+
+static void gicv3_store_spi_irq_block(struct dist_irq_block *irqs,
+ unsigned int i, bool is_espi)
+{
+ void __iomem *base;
+ unsigned int irq;
+
+ base = GICD + GET_SPI_REG_OFFSET(ICFGR, is_espi) + i * sizeof(irqs->icfgr);
+ irqs->icfgr[0] = readl_relaxed(base);
+ irqs->icfgr[1] = readl_relaxed(base + 4);
+
+ base = GICD + GET_SPI_REG_OFFSET(IPRIORITYR, is_espi);
+ base += i * sizeof(irqs->ipriorityr);
+ for ( irq = 0; irq < ARRAY_SIZE(irqs->ipriorityr); irq++ )
+ irqs->ipriorityr[irq] = readl_relaxed(base + 4 * irq);
+
+ base = GICD + GET_SPI_REG_OFFSET(IROUTER, is_espi);
+ base += i * sizeof(irqs->irouter);
+ for ( irq = 0; irq < ARRAY_SIZE(irqs->irouter); irq++ )
+ irqs->irouter[irq] = readq_relaxed_non_atomic(base + 8 * irq);
+
+ base = GICD + GET_SPI_REG_OFFSET(ISACTIVER, is_espi);
+ base += i * sizeof(irqs->isactiver);
+ irqs->isactiver = readl_relaxed(base);
+
+ base = GICD + GET_SPI_REG_OFFSET(ISENABLER, is_espi);
+ base += i * sizeof(irqs->isenabler);
+ irqs->isenabler = readl_relaxed(base);
+}
+
+static void gicv3_restore_spi_irq_config(struct dist_irq_block *irqs,
+ unsigned int i, bool is_espi)
+{
+ void __iomem *base;
+ unsigned int irq;
+
+ base = GICD + GET_SPI_REG_OFFSET(ICFGR, is_espi) + i * sizeof(irqs->icfgr);
+ writel_relaxed(irqs->icfgr[0], base);
+ writel_relaxed(irqs->icfgr[1], base + 4);
+
+ base = GICD + GET_SPI_REG_OFFSET(IPRIORITYR, is_espi);
+ base += i * sizeof(irqs->ipriorityr);
+ for ( irq = 0; irq < ARRAY_SIZE(irqs->ipriorityr); irq++ )
+ writel_relaxed(irqs->ipriorityr[irq], base + 4 * irq);
+}
+
+static void gicv3_restore_spi_irq_routing(struct dist_irq_block *irqs,
+ unsigned int i, bool is_espi)
+{
+ void __iomem *base;
+ unsigned int irq;
+
+ base = GICD + GET_SPI_REG_OFFSET(IROUTER, is_espi);
+ base += i * sizeof(irqs->irouter);
+ for ( irq = 0; irq < ARRAY_SIZE(irqs->irouter); irq++ )
+ writeq_relaxed_non_atomic(irqs->irouter[irq], base + 8 * irq);
+}
+
+static void gicv3_restore_spi_irq_state(struct dist_irq_block *irqs,
+ unsigned int i, bool is_espi)
+{
+ void __iomem *base;
+
+ base = GICD + GET_SPI_REG_OFFSET(ICENABLER, is_espi) + i * 4;
+ writel_relaxed(GENMASK(31, 0), base);
+
+ base = GICD + GET_SPI_REG_OFFSET(ISENABLER, is_espi);
+ base += i * sizeof(irqs->isenabler);
+ writel_relaxed(irqs->isenabler, base);
+
+ base = GICD + GET_SPI_REG_OFFSET(ICACTIVER, is_espi) + i * 4;
+ writel_relaxed(GENMASK(31, 0), base);
+
+ base = GICD + GET_SPI_REG_OFFSET(ISACTIVER, is_espi);
+ base += i * sizeof(irqs->isactiver);
+ writel_relaxed(irqs->isactiver, base);
+}
+
+static int gicv3_suspend(void)
+{
+ unsigned int i;
+ void __iomem *base;
+ int ret;
+ struct redist_ctx *rdist = &gicv3_ctx.rdist;
+
+ /* Save GICC configuration */
+ gicv3_ctx.cpu.ctlr = READ_SYSREG(ICC_CTLR_EL1);
+ gicv3_ctx.cpu.pmr = READ_SYSREG(ICC_PMR_EL1);
+ gicv3_ctx.cpu.bpr = READ_SYSREG(ICC_BPR1_EL1);
+ gicv3_ctx.cpu.sre_el2 = READ_SYSREG(ICC_SRE_EL2);
+ gicv3_ctx.cpu.grpen = READ_SYSREG(ICC_IGRPEN1_EL1);
+
+ gicv3_disable_interface();
+
+ ret = gicv3_disable_redist();
+ if ( ret )
+ goto out_enable_iface;
+
+ /* Save GICR configuration */
+ gicv3_redist_wait_for_rwp();
+
+ base = GICD_RDIST_BASE;
+
+ rdist->ctlr = readl_relaxed(base + GICR_CTLR);
+
+ rdist->propbase = readq_relaxed(base + GICR_PROPBASER);
+ rdist->pendbase = readq_relaxed(base + GICR_PENDBASER);
+
+ base = GICD_RDIST_SGI_BASE;
+
+ /* Save priority on PPI and SGI interrupts */
+ for ( i = 0; i < NR_GIC_LOCAL_IRQS / 4; i++ )
+ rdist->ipriorityr[i] = readl_relaxed(base + GICR_IPRIORITYR0 + 4 * i);
+
+ rdist->isactiver = readl_relaxed(base + GICR_ISACTIVER0);
+ rdist->isenabler = readl_relaxed(base + GICR_ISENABLER0);
+ rdist->igroupr = readl_relaxed(base + GICR_IGROUPR0);
+ rdist->icfgr = readl_relaxed(base + GICR_ICFGR1);
+
+ /* Save GICD configuration */
+ gicv3_dist_wait_for_rwp();
+ gicv3_ctx.dist.ctlr = readl_relaxed(GICD + GICD_CTLR);
+
+ for ( i = 1; i < DIV_ROUND_UP(gicv3_info.nr_lines, 32); i++ )
+ gicv3_store_spi_irq_block(gicv3_ctx.dist.irqs + i - 1, i, false);
+
+#ifdef CONFIG_GICV3_ESPI
+ for ( i = 0; i < gic_number_espis() / 32; i++ )
+ gicv3_store_spi_irq_block(gicv3_ctx.dist.espi_irqs + i, i, true);
+#endif
+
+ return 0;
+
+ out_enable_iface:
+ gicv3_hyp_enable(true);
+ WRITE_SYSREG(gicv3_ctx.cpu.grpen, ICC_IGRPEN1_EL1);
+ isb();
+
+ return ret;
+}
+
+static void gicv3_resume(void)
+{
+ int ret;
+ unsigned int i;
+ uint32_t dist_ctlr;
+ void __iomem *base;
+ struct redist_ctx *rdist = &gicv3_ctx.rdist;
+
+ writel_relaxed(0, GICD + GICD_CTLR);
+
+ for ( i = NR_GIC_LOCAL_IRQS; i < gicv3_info.nr_lines; i += 32 )
+ writel_relaxed(GENMASK(31, 0), GICD + GICD_IGROUPR + (i / 32) * 4);
+
+ for ( i = 1; i < DIV_ROUND_UP(gicv3_info.nr_lines, 32); i++ )
+ gicv3_restore_spi_irq_config(gicv3_ctx.dist.irqs + i - 1, i, false);
+
+#ifdef CONFIG_GICV3_ESPI
+ for ( i = 0; i < gic_number_espis() / 32; i++ )
+ {
+ writel_relaxed(GENMASK(31, 0), GICD + GICD_IGROUPRnE + i * 4);
+ gicv3_restore_spi_irq_config(gicv3_ctx.dist.espi_irqs + i, i, true);
+ }
+#endif
+
+ dist_ctlr = gicv3_ctx.dist.ctlr & GICD_CTLR_ARE_NS;
+ if ( dist_ctlr )
+ {
+ writel_relaxed(dist_ctlr, GICD + GICD_CTLR);
+ gicv3_dist_wait_for_rwp();
+
+ for ( i = 1; i < DIV_ROUND_UP(gicv3_info.nr_lines, 32); i++ )
+ gicv3_restore_spi_irq_routing(gicv3_ctx.dist.irqs + i - 1, i,
+ false);
+
+#ifdef CONFIG_GICV3_ESPI
+ for ( i = 0; i < gic_number_espis() / 32; i++ )
+ gicv3_restore_spi_irq_routing(gicv3_ctx.dist.espi_irqs + i, i,
+ true);
+#endif
+ }
+
+ for ( i = 1; i < DIV_ROUND_UP(gicv3_info.nr_lines, 32); i++ )
+ gicv3_restore_spi_irq_state(gicv3_ctx.dist.irqs + i - 1, i, false);
+
+#ifdef CONFIG_GICV3_ESPI
+ for ( i = 0; i < gic_number_espis() / 32; i++ )
+ gicv3_restore_spi_irq_state(gicv3_ctx.dist.espi_irqs + i, i, true);
+#endif
+
+ writel_relaxed(gicv3_ctx.dist.ctlr, GICD + GICD_CTLR);
+ gicv3_dist_wait_for_rwp();
+
+ ret = gicv3_lpi_init_rdist(GICD_RDIST_BASE);
+ /*
+ * If LPIs are already enabled, assume firmware or the still-powered
+ * redistributor has valid PROPBASER/PENDBASER and skip reprogramming.
+ * Return -EBUSY so callers can ignore this case.
+ */
+ if ( ret && ret != -ENODEV && ret != -EBUSY )
+ panic("GICv3: Failed to re-initialize LPIs during resume\n");
+ else if ( ret == -EBUSY ) /* extra checks, just to be sure */
+ {
+ base = GICD_RDIST_BASE;
+ if ( readq_relaxed(base + GICR_PROPBASER) != rdist->propbase ||
+ readq_relaxed(base + GICR_PENDBASER) != rdist->pendbase )
+ panic("GICv3: LPIs already enabled with unexpected PROPBASER/PENDBASER during resume\n");
+ }
+
+ /* Restore GICR (Redistributor) configuration */
+ if ( gicv3_enable_redist() )
+ panic("GICv3: Failed to re-enable redistributor during resume\n");
+
+ base = GICD_RDIST_SGI_BASE;
+
+ writel_relaxed(GENMASK(31, 0), base + GICR_ICENABLER0);
+ gicv3_redist_wait_for_rwp();
+
+ for ( i = 0; i < NR_GIC_LOCAL_IRQS / 4; i++ )
+ writel_relaxed(rdist->ipriorityr[i], base + GICR_IPRIORITYR0 + i * 4);
+
+ writel_relaxed(rdist->isactiver, base + GICR_ISACTIVER0);
+ writel_relaxed(rdist->igroupr, base + GICR_IGROUPR0);
+ writel_relaxed(rdist->icfgr, base + GICR_ICFGR1);
+
+ gicv3_redist_wait_for_rwp();
+
+ writel_relaxed(rdist->isenabler, base + GICR_ISENABLER0);
+ writel_relaxed(rdist->ctlr, GICD_RDIST_BASE + GICR_CTLR);
+
+ gicv3_redist_wait_for_rwp();
+
+ WRITE_SYSREG(gicv3_ctx.cpu.sre_el2, ICC_SRE_EL2);
+ isb();
+
+ /* Restore CPU interface (System registers) */
+ WRITE_SYSREG(gicv3_ctx.cpu.pmr, ICC_PMR_EL1);
+ WRITE_SYSREG(gicv3_ctx.cpu.bpr, ICC_BPR1_EL1);
+ WRITE_SYSREG(gicv3_ctx.cpu.ctlr, ICC_CTLR_EL1);
+ WRITE_SYSREG(gicv3_ctx.cpu.grpen, ICC_IGRPEN1_EL1);
+ isb();
+
+ gicv3_hyp_init();
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
/* Set up the GIC */
static int __init gicv3_init(void)
{
@@ -1994,6 +2342,10 @@ static int __init gicv3_init(void)
gicv3_hyp_init();
+#ifdef CONFIG_SYSTEM_SUSPEND
+ gicv3_alloc_context();
+#endif
+
out:
spin_unlock(&gicv3.lock);
@@ -2033,6 +2385,10 @@ static const struct gic_hw_operations gicv3_ops = {
#endif
.iomem_deny_access = gicv3_iomem_deny_access,
.do_LPI = gicv3_do_LPI,
+#ifdef CONFIG_SYSTEM_SUSPEND
+ .suspend = gicv3_suspend,
+ .resume = gicv3_resume,
+#endif
};
static int __init gicv3_dt_preinit(struct dt_device_node *node, const void *data)
diff --git a/xen/arch/arm/include/asm/gic_v3_defs.h b/xen/arch/arm/include/asm/gic_v3_defs.h
index c373b94d19..992c8f9c2f 100644
--- a/xen/arch/arm/include/asm/gic_v3_defs.h
+++ b/xen/arch/arm/include/asm/gic_v3_defs.h
@@ -94,6 +94,7 @@
#define GICD_TYPE_LPIS (1U << 17)
#define GICD_CTLR_RWP (1UL << 31)
+#define GICD_CTLR_DS (1U << 6)
#define GICD_CTLR_ARE_NS (1U << 4)
#define GICD_CTLR_ENABLE_G1A (1U << 1)
#define GICD_CTLR_ENABLE_G1 (1U << 0)
--
2.43.0
^ permalink raw reply related [flat|nested] 29+ messages in thread

* Re: [PATCH v9 04/13] xen/arm: gic-v3: Implement GICv3 suspend/resume functions
2026-05-12 17:07 ` [PATCH v9 04/13] xen/arm: gic-v3: Implement GICv3 suspend/resume functions Mykola Kvach
@ 2026-05-13 16:11 ` Luca Fancellu
0 siblings, 0 replies; 29+ messages in thread
From: Luca Fancellu @ 2026-05-13 16:11 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
Hi Mykola,
> +}
> +
> +static int gicv3_disable_redist(void)
> +{
> + void __iomem *waker = GICD_RDIST_BASE + GICR_WAKER;
> + s_time_t deadline;
> +
> + /*
> + * Avoid infinite loop if Non-secure does not have access to GICR_WAKER.
> + * See Arm IHI 0069H.b, 12.11.42 GICR_WAKER:
> + * When GICD_CTLR.DS == 0, Non-secure accesses to this register are
> + * RAZ/WI.
> + */
> + if ( !(readl_relaxed(GICD + GICD_CTLR) & GICD_CTLR_DS) )
> + return 0;
> +
> + deadline = NOW() + MILLISECS(1000);
> +
> + writel_relaxed(readl_relaxed(waker) | GICR_WAKER_ProcessorSleep, waker);
> + while ( (readl_relaxed(waker) & GICR_WAKER_ChildrenAsleep) == 0 )
> + {
> + if ( NOW() > deadline )
> + {
> + printk("GICv3: Timeout waiting for redistributor to sleep\n");
I think here we should clear GICR_WAKER_ProcessorSleep. The Arm IHI 0069H.b, section
11.1, says:
"""
When GICR_WAKER.ProcessorSleep == 1 or GICR_WAKER.ChildrenAsleep == 1 then a write to any GICC_*,
GICV_*, GICH_*, ICC_*, ICV_*, or ICH_* registers, other than those in the following list, is unpredictable:
• ICC_SRE_EL1.
• ICC_SRE_EL2.
• ICC_SRE_EL3.
"""
But in the error path used in gicv3_suspend() we are writing ICH_HCR_EL2 and ICC_IGRPEN1_EL1.
> + return -ETIMEDOUT;
> + }
> + cpu_relax();
> + udelay(10);
> + }
> +
> + return 0;
> +}
> +
> +#define GET_SPI_REG_OFFSET(name, is_espi) \
> + ((is_espi) ? GICD_##name##nE : GICD_##name)
> +
> +static void gicv3_store_spi_irq_block(struct dist_irq_block *irqs,
> + unsigned int i, bool is_espi)
> +{
> + void __iomem *base;
> + unsigned int irq;
> +
> + base = GICD + GET_SPI_REG_OFFSET(ICFGR, is_espi) + i * sizeof(irqs->icfgr);
> + irqs->icfgr[0] = readl_relaxed(base);
> + irqs->icfgr[1] = readl_relaxed(base + 4);
> +
> + base = GICD + GET_SPI_REG_OFFSET(IPRIORITYR, is_espi);
> + base += i * sizeof(irqs->ipriorityr);
> + for ( irq = 0; irq < ARRAY_SIZE(irqs->ipriorityr); irq++ )
> + irqs->ipriorityr[irq] = readl_relaxed(base + 4 * irq);
> +
> + base = GICD + GET_SPI_REG_OFFSET(IROUTER, is_espi);
> + base += i * sizeof(irqs->irouter);
> + for ( irq = 0; irq < ARRAY_SIZE(irqs->irouter); irq++ )
> + irqs->irouter[irq] = readq_relaxed_non_atomic(base + 8 * irq);
> +
> + base = GICD + GET_SPI_REG_OFFSET(ISACTIVER, is_espi);
> + base += i * sizeof(irqs->isactiver);
> + irqs->isactiver = readl_relaxed(base);
> +
> + base = GICD + GET_SPI_REG_OFFSET(ISENABLER, is_espi);
> + base += i * sizeof(irqs->isenabler);
> + irqs->isenabler = readl_relaxed(base);
> +}
> +
> +static void gicv3_restore_spi_irq_config(struct dist_irq_block *irqs,
> + unsigned int i, bool is_espi)
> +{
> + void __iomem *base;
> + unsigned int irq;
> +
> + base = GICD + GET_SPI_REG_OFFSET(ICFGR, is_espi) + i * sizeof(irqs->icfgr);
> + writel_relaxed(irqs->icfgr[0], base);
> + writel_relaxed(irqs->icfgr[1], base + 4);
> +
> + base = GICD + GET_SPI_REG_OFFSET(IPRIORITYR, is_espi);
> + base += i * sizeof(irqs->ipriorityr);
> + for ( irq = 0; irq < ARRAY_SIZE(irqs->ipriorityr); irq++ )
> + writel_relaxed(irqs->ipriorityr[irq], base + 4 * irq);
> +}
> +
> +static void gicv3_restore_spi_irq_routing(struct dist_irq_block *irqs,
> + unsigned int i, bool is_espi)
> +{
> + void __iomem *base;
> + unsigned int irq;
> +
> + base = GICD + GET_SPI_REG_OFFSET(IROUTER, is_espi);
> + base += i * sizeof(irqs->irouter);
> + for ( irq = 0; irq < ARRAY_SIZE(irqs->irouter); irq++ )
> + writeq_relaxed_non_atomic(irqs->irouter[irq], base + 8 * irq);
> +}
> +
> +static void gicv3_restore_spi_irq_state(struct dist_irq_block *irqs,
> + unsigned int i, bool is_espi)
> +{
> + void __iomem *base;
> +
> + base = GICD + GET_SPI_REG_OFFSET(ICENABLER, is_espi) + i * 4;
> + writel_relaxed(GENMASK(31, 0), base);
> +
> + base = GICD + GET_SPI_REG_OFFSET(ISENABLER, is_espi);
> + base += i * sizeof(irqs->isenabler);
> + writel_relaxed(irqs->isenabler, base);
> +
> + base = GICD + GET_SPI_REG_OFFSET(ICACTIVER, is_espi) + i * 4;
> + writel_relaxed(GENMASK(31, 0), base);
> +
> + base = GICD + GET_SPI_REG_OFFSET(ISACTIVER, is_espi);
> + base += i * sizeof(irqs->isactiver);
> + writel_relaxed(irqs->isactiver, base);
> +}
> +
> +static int gicv3_suspend(void)
> +{
> + unsigned int i;
> + void __iomem *base;
> + int ret;
> + struct redist_ctx *rdist = &gicv3_ctx.rdist;
> +
> + /* Save GICC configuration */
> + gicv3_ctx.cpu.ctlr = READ_SYSREG(ICC_CTLR_EL1);
> + gicv3_ctx.cpu.pmr = READ_SYSREG(ICC_PMR_EL1);
> + gicv3_ctx.cpu.bpr = READ_SYSREG(ICC_BPR1_EL1);
> + gicv3_ctx.cpu.sre_el2 = READ_SYSREG(ICC_SRE_EL2);
> + gicv3_ctx.cpu.grpen = READ_SYSREG(ICC_IGRPEN1_EL1);
> +
> + gicv3_disable_interface();
Should we check that ICC_AP1R<n>_EL1 == 0 before continuing our
suspend, like we do for GICv2?
> +
> + ret = gicv3_disable_redist();
> + if ( ret )
> + goto out_enable_iface;
> +
> + /* Save GICR configuration */
> + gicv3_redist_wait_for_rwp();
> +
> + base = GICD_RDIST_BASE;
> +
> + rdist->ctlr = readl_relaxed(base + GICR_CTLR);
> +
> + rdist->propbase = readq_relaxed(base + GICR_PROPBASER);
> + rdist->pendbase = readq_relaxed(base + GICR_PENDBASER);
> +
> + base = GICD_RDIST_SGI_BASE;
> +
> + /* Save priority on PPI and SGI interrupts */
> + for ( i = 0; i < NR_GIC_LOCAL_IRQS / 4; i++ )
> + rdist->ipriorityr[i] = readl_relaxed(base + GICR_IPRIORITYR0 + 4 * i);
> +
> + rdist->isactiver = readl_relaxed(base + GICR_ISACTIVER0);
> + rdist->isenabler = readl_relaxed(base + GICR_ISENABLER0);
> + rdist->igroupr = readl_relaxed(base + GICR_IGROUPR0);
> + rdist->icfgr = readl_relaxed(base + GICR_ICFGR1);
> +
> + /* Save GICD configuration */
> + gicv3_dist_wait_for_rwp();
> + gicv3_ctx.dist.ctlr = readl_relaxed(GICD + GICD_CTLR);
> +
> + for ( i = 1; i < DIV_ROUND_UP(gicv3_info.nr_lines, 32); i++ )
> + gicv3_store_spi_irq_block(gicv3_ctx.dist.irqs + i - 1, i, false);
> +
> +#ifdef CONFIG_GICV3_ESPI
> + for ( i = 0; i < gic_number_espis() / 32; i++ )
> + gicv3_store_spi_irq_block(gicv3_ctx.dist.espi_irqs + i, i, true);
> +#endif
> +
> + return 0;
> +
> + out_enable_iface:
> + gicv3_hyp_enable(true);
> + WRITE_SYSREG(gicv3_ctx.cpu.grpen, ICC_IGRPEN1_EL1);
> + isb();
> +
> + return ret;
> +}
> +
> +static void gicv3_resume(void)
> +{
> + int ret;
> + unsigned int i;
> + uint32_t dist_ctlr;
> + void __iomem *base;
> + struct redist_ctx *rdist = &gicv3_ctx.rdist;
> +
> + writel_relaxed(0, GICD + GICD_CTLR);
> +
> + for ( i = NR_GIC_LOCAL_IRQS; i < gicv3_info.nr_lines; i += 32 )
> + writel_relaxed(GENMASK(31, 0), GICD + GICD_IGROUPR + (i / 32) * 4);
> +
> + for ( i = 1; i < DIV_ROUND_UP(gicv3_info.nr_lines, 32); i++ )
> + gicv3_restore_spi_irq_config(gicv3_ctx.dist.irqs + i - 1, i, false);
> +
> +#ifdef CONFIG_GICV3_ESPI
> + for ( i = 0; i < gic_number_espis() / 32; i++ )
> + {
> + writel_relaxed(GENMASK(31, 0), GICD + GICD_IGROUPRnE + i * 4);
> + gicv3_restore_spi_irq_config(gicv3_ctx.dist.espi_irqs + i, i, true);
> + }
> +#endif
> +
> + dist_ctlr = gicv3_ctx.dist.ctlr & GICD_CTLR_ARE_NS;
> + if ( dist_ctlr )
> + {
> + writel_relaxed(dist_ctlr, GICD + GICD_CTLR);
> + gicv3_dist_wait_for_rwp();
> +
> + for ( i = 1; i < DIV_ROUND_UP(gicv3_info.nr_lines, 32); i++ )
> + gicv3_restore_spi_irq_routing(gicv3_ctx.dist.irqs + i - 1, i,
> + false);
I think we have an issue in this loop, as we are accessing GICD_IROUTER<1020…1023>
and GICD_IPRIORITYR255, while the spec says IROUTER<1020…1023> are reserved and
GICD_IPRIORITYR<n> goes from 0 to 254. This applies both here and in gicv3_suspend().
Cheers,
Luca
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v9 05/13] xen/arm: gic-v3: add ITS suspend/resume support
2026-05-12 17:07 [PATCH v9 00/13] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
` (3 preceding siblings ...)
2026-05-12 17:07 ` [PATCH v9 04/13] xen/arm: gic-v3: Implement GICv3 suspend/resume functions Mykola Kvach
@ 2026-05-12 17:07 ` Mykola Kvach
2026-05-14 14:45 ` Luca Fancellu
2026-05-12 17:07 ` [PATCH v9 06/13] xen/arm: tee: keep init_tee_secondary() for hotplug and resume Mykola Kvach
` (7 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Mykola Kvach @ 2026-05-12 17:07 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Jan Beulich, Roger Pau Monné
From: Mykola Kvach <mykola_kvach@epam.com>
Handle system suspend/resume for GICv3 with an ITS present so LPIs keep
working after firmware powers the GIC down. Save and restore the ITS
CTLR, CBASER and BASER registers, and re-establish the collection mapping
on resume.
Add list_for_each_entry_continue_reverse() in list.h for the ITS suspend
error path that needs to roll back partially saved state.
Based on Linux commit dba0bc7b76dc:
"irqchip/gic-v3-its: Add ability to save/restore ITS state".
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in V9:
- fix the ITS suspend/resume coding-style nits;
- preserve the saved GITS_CTLR state while masking the read-only
QUIESCENT bit.
Changes in V8:
- Reword the CBASER/CWRITER comment to match Xen and drop the stale Linux
cmd_write reference.
- Clarify the list_for_each_entry_continue_reverse() comment.
- Factor out per-ITS helpers for collection setup and resume.
- Restore each ITS and re-establish its collection mapping in the same
loop, so a failed ITS resume is not followed by MAPC/SYNC on that
un-restored instance.
- panic when resume of an ITS fails
- clean up the BASER cache during suspend
---
xen/arch/arm/gic-v3-its.c | 133 ++++++++++++++++++++++++--
xen/arch/arm/gic-v3.c | 11 ++-
xen/arch/arm/include/asm/gic_v3_its.h | 23 +++++
xen/include/xen/list.h | 14 +++
4 files changed, 171 insertions(+), 10 deletions(-)
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 9005ce8ce5..582c26d964 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -335,6 +335,22 @@ static int its_send_cmd_inv(struct host_its *its,
return its_send_command(its, cmd);
}
+static int gicv3_its_setup_collection_single(struct host_its *its,
+ unsigned int cpu)
+{
+ int ret;
+
+ ret = its_send_cmd_mapc(its, cpu, cpu);
+ if ( ret )
+ return ret;
+
+ ret = its_send_cmd_sync(its, cpu);
+ if ( ret )
+ return ret;
+
+ return gicv3_its_wait_commands(its);
+}
+
/* Set up the (1:1) collection mapping for the given host CPU. */
int gicv3_its_setup_collection(unsigned int cpu)
{
@@ -343,15 +359,7 @@ int gicv3_its_setup_collection(unsigned int cpu)
list_for_each_entry(its, &host_its_list, entry)
{
- ret = its_send_cmd_mapc(its, cpu, cpu);
- if ( ret )
- return ret;
-
- ret = its_send_cmd_sync(its, cpu);
- if ( ret )
- return ret;
-
- ret = gicv3_its_wait_commands(its);
+ ret = gicv3_its_setup_collection_single(its, cpu);
if ( ret )
return ret;
}
@@ -1210,6 +1218,113 @@ int gicv3_its_init(void)
return 0;
}
+#ifdef CONFIG_SYSTEM_SUSPEND
+int gicv3_its_suspend(void)
+{
+ struct host_its *its;
+ int ret;
+
+ list_for_each_entry( its, &host_its_list, entry )
+ {
+ unsigned int i;
+ void __iomem *base = its->its_base;
+
+ /*
+ * By the time Xen reaches gic_suspend(), every domain is already in
+ * SHUTDOWN_suspend, so ITS-targeting interrupt sources are expected
+ * to have been quiesced by the owning OS before SYSTEM_SUSPEND.
+ */
+ /* Preserve saved GITS_CTLR state, excluding read-only QUIESCENT. */
+ its->suspend_ctx.ctlr = readl_relaxed(base + GITS_CTLR) &
+ ~GITS_CTLR_QUIESCENT;
+ ret = gicv3_disable_its(its);
+ if ( ret )
+ {
+ writel_relaxed(its->suspend_ctx.ctlr, base + GITS_CTLR);
+ goto err;
+ }
+
+ its->suspend_ctx.cbaser = readq_relaxed(base + GITS_CBASER);
+
+ for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
+ {
+ uint64_t baser = readq_relaxed(base + GITS_BASER0 + i * 8);
+
+ its->suspend_ctx.baser[i] = 0;
+
+ if ( !(baser & GITS_VALID_BIT) )
+ continue;
+
+ its->suspend_ctx.baser[i] = baser;
+ }
+ }
+
+ return 0;
+
+ err:
+ list_for_each_entry_continue_reverse( its, &host_its_list, entry )
+ writel_relaxed(its->suspend_ctx.ctlr, its->its_base + GITS_CTLR);
+
+ return ret;
+}
+
+static int gicv3_its_resume_single(struct host_its *its, unsigned int cpu)
+{
+ void __iomem *base = its->its_base;
+ unsigned int i;
+ int ret;
+
+ /*
+ * Make sure that the ITS is disabled. If it fails to quiesce,
+ * don't restore it since writing to CBASER or BASER<n>
+ * registers is undefined according to the GIC v3 ITS
+ * Specification.
+ */
+ WARN_ON(readl_relaxed(base + GITS_CTLR) & GITS_CTLR_ENABLE);
+ ret = gicv3_disable_its(its);
+ if ( ret )
+ return ret;
+
+ writeq_relaxed(its->suspend_ctx.cbaser, base + GITS_CBASER);
+
+ /*
+ * Writing CBASER resets CREADR to 0, so reset CWRITER to
+ * keep the command queue pointers aligned.
+ */
+ writeq_relaxed(0, base + GITS_CWRITER);
+
+ /* Restore GITS_BASER from the value cache. */
+ for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
+ {
+ uint64_t baser = its->suspend_ctx.baser[i];
+
+ if ( !(baser & GITS_VALID_BIT) )
+ continue;
+
+ writeq_relaxed(baser, base + GITS_BASER0 + i * 8);
+ }
+
+ writel_relaxed(its->suspend_ctx.ctlr, base + GITS_CTLR);
+
+ return gicv3_its_setup_collection_single(its, cpu);
+}
+
+void gicv3_its_resume(void)
+{
+ struct host_its *its;
+ unsigned int cpu = smp_processor_id();
+ int ret;
+
+ list_for_each_entry( its, &host_its_list, entry )
+ {
+ ret = gicv3_its_resume_single(its, cpu);
+ if ( ret )
+ panic("GICv3: ITS@%"PRIpaddr": failed to restore during resume: %d\n",
+ its->addr, ret);
+ }
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
/*
* Local variables:
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 6fc4354e2e..8ef9f416d0 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -2115,10 +2115,14 @@ static int gicv3_suspend(void)
gicv3_disable_interface();
- ret = gicv3_disable_redist();
+ ret = gicv3_its_suspend();
if ( ret )
goto out_enable_iface;
+ ret = gicv3_disable_redist();
+ if ( ret )
+ goto out_its_resume;
+
/* Save GICR configuration */
gicv3_redist_wait_for_rwp();
@@ -2154,6 +2158,9 @@ static int gicv3_suspend(void)
return 0;
+ out_its_resume:
+ gicv3_its_resume();
+
out_enable_iface:
gicv3_hyp_enable(true);
WRITE_SYSREG(gicv3_ctx.cpu.grpen, ICC_IGRPEN1_EL1);
@@ -2253,6 +2260,8 @@ static void gicv3_resume(void)
gicv3_redist_wait_for_rwp();
+ gicv3_its_resume();
+
WRITE_SYSREG(gicv3_ctx.cpu.sre_el2, ICC_SRE_EL2);
isb();
diff --git a/xen/arch/arm/include/asm/gic_v3_its.h b/xen/arch/arm/include/asm/gic_v3_its.h
index fc5a84892c..492c468ae0 100644
--- a/xen/arch/arm/include/asm/gic_v3_its.h
+++ b/xen/arch/arm/include/asm/gic_v3_its.h
@@ -129,6 +129,13 @@ struct host_its {
spinlock_t cmd_lock;
void *cmd_buf;
unsigned int flags;
+#ifdef CONFIG_SYSTEM_SUSPEND
+ struct suspend_ctx {
+ uint32_t ctlr;
+ uint64_t cbaser;
+ uint64_t baser[GITS_BASER_NR_REGS];
+ } suspend_ctx;
+#endif
};
/* Map a collection for this host CPU to each host ITS. */
@@ -204,6 +211,11 @@ uint64_t gicv3_its_get_cacheability(void);
uint64_t gicv3_its_get_shareability(void);
unsigned int gicv3_its_get_memflags(void);
+#ifdef CONFIG_SYSTEM_SUSPEND
+int gicv3_its_suspend(void);
+void gicv3_its_resume(void);
+#endif
+
#else
#ifdef CONFIG_ACPI
@@ -271,6 +283,17 @@ static inline int gicv3_its_make_hwdom_dt_nodes(const struct domain *d,
return 0;
}
+#ifdef CONFIG_SYSTEM_SUSPEND
+static inline int gicv3_its_suspend(void)
+{
+ return 0;
+}
+
+static inline void gicv3_its_resume(void)
+{
+}
+#endif
+
#endif /* CONFIG_HAS_ITS */
#endif
diff --git a/xen/include/xen/list.h b/xen/include/xen/list.h
index 98d8482dab..2aab274157 100644
--- a/xen/include/xen/list.h
+++ b/xen/include/xen/list.h
@@ -535,6 +535,20 @@ static inline void list_splice_init(struct list_head *list,
&(pos)->member != (head); \
(pos) = list_entry((pos)->member.next, typeof(*(pos)), member))
+/**
+ * list_for_each_entry_continue_reverse - iterate backwards from the given point
+ * @pos: the type * to use as a loop cursor.
+ * @head: the head for your list.
+ * @member: the name of the list_head within the struct.
+ *
+ * Iterate over list of given type backwards, starting from the element previous
+ * to the current one in list order.
+ */
+#define list_for_each_entry_continue_reverse(pos, head, member) \
+ for ((pos) = list_entry((pos)->member.prev, typeof(*(pos)), member); \
+ &(pos)->member != (head); \
+ (pos) = list_entry((pos)->member.prev, typeof(*(pos)), member))
+
/**
* list_for_each_entry_from - iterate over list of given type from the
* current point
--
2.43.0
^ permalink raw reply related [flat|nested] 29+ messages in thread

* Re: [PATCH v9 05/13] xen/arm: gic-v3: add ITS suspend/resume support
2026-05-12 17:07 ` [PATCH v9 05/13] xen/arm: gic-v3: add ITS suspend/resume support Mykola Kvach
@ 2026-05-14 14:45 ` Luca Fancellu
0 siblings, 0 replies; 29+ messages in thread
From: Luca Fancellu @ 2026-05-14 14:45 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
Andrew Cooper, Anthony PERARD, Jan Beulich, Roger Pau Monné
Hi Mykola,
> On 12 May 2026, at 18:07, Mykola Kvach <xakep.amatop@gmail.com> wrote:
>
> From: Mykola Kvach <mykola_kvach@epam.com>
>
> Handle system suspend/resume for GICv3 with an ITS present so LPIs keep
> working after firmware powers the GIC down. Save and restore the ITS
> CTLR, CBASER and BASER registers, and re-establish the collection mapping
> on resume.
>
> Add list_for_each_entry_continue_reverse() in list.h for the ITS suspend
> error path that needs to roll back partially saved state.
>
> Based on Linux commit dba0bc7b76dc:
> "irqchip/gic-v3-its: Add ability to save/restore ITS state".
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in V9:
> - fix the ITS suspend/resume coding-style nits;
> - preserve the saved GITS_CTLR state while masking the read-only
> QUIESCENT bit.
>
> Changes in V8:
> - Reword the CBASER/CWRITER comment to match Xen and drop the stale Linux
> cmd_write reference.
> - Clarify the list_for_each_entry_continue_reverse() comment.
> - Factor out per-ITS helpers for collection setup and resume.
> - Restore each ITS and re-establish its collection mapping in the same
> loop, so a failed ITS resume is not followed by MAPC/SYNC on that
> un-restored instance.
> - panic in case when resume of an ITS failed
> - cleanup baser cache during suspend
> ---
> xen/arch/arm/gic-v3-its.c | 133 ++++++++++++++++++++++++--
> xen/arch/arm/gic-v3.c | 11 ++-
> xen/arch/arm/include/asm/gic_v3_its.h | 23 +++++
> xen/include/xen/list.h | 14 +++
> 4 files changed, 171 insertions(+), 10 deletions(-)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 9005ce8ce5..582c26d964 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -335,6 +335,22 @@ static int its_send_cmd_inv(struct host_its *its,
> return its_send_command(its, cmd);
> }
>
> +static int gicv3_its_setup_collection_single(struct host_its *its,
> + unsigned int cpu)
> +{
> + int ret;
> +
> + ret = its_send_cmd_mapc(its, cpu, cpu);
> + if ( ret )
> + return ret;
> +
> + ret = its_send_cmd_sync(its, cpu);
> + if ( ret )
> + return ret;
> +
> + return gicv3_its_wait_commands(its);
> +}
> +
> /* Set up the (1:1) collection mapping for the given host CPU. */
> int gicv3_its_setup_collection(unsigned int cpu)
> {
> @@ -343,15 +359,7 @@ int gicv3_its_setup_collection(unsigned int cpu)
>
> list_for_each_entry(its, &host_its_list, entry)
> {
> - ret = its_send_cmd_mapc(its, cpu, cpu);
> - if ( ret )
> - return ret;
> -
> - ret = its_send_cmd_sync(its, cpu);
> - if ( ret )
> - return ret;
> -
> - ret = gicv3_its_wait_commands(its);
> + ret = gicv3_its_setup_collection_single(its, cpu);
> if ( ret )
> return ret;
> }
> @@ -1210,6 +1218,113 @@ int gicv3_its_init(void)
> return 0;
> }
>
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +int gicv3_its_suspend(void)
> +{
> + struct host_its *its;
> + int ret;
> +
> + list_for_each_entry( its, &host_its_list, entry )
> + {
> + unsigned int i;
> + void __iomem *base = its->its_base;
> +
> + /*
> + * By the time Xen reaches gic_suspend(), every domain is already in
> + * SHUTDOWN_suspend, so ITS-targeting interrupt sources are expected
> + * to have been quiesced by the owning OS before SYSTEM_SUSPEND.
> + */
> + /* Preserve saved GITS_CTLR state, excluding read-only QUIESCENT. */
> + its->suspend_ctx.ctlr = readl_relaxed(base + GITS_CTLR) &
> + ~GITS_CTLR_QUIESCENT;
> + ret = gicv3_disable_its(its);
> + if ( ret )
> + {
> + writel_relaxed(its->suspend_ctx.ctlr, base + GITS_CTLR);
> + goto err;
> + }
> +
> + its->suspend_ctx.cbaser = readq_relaxed(base + GITS_CBASER);
> +
> + for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
> + {
> + uint64_t baser = readq_relaxed(base + GITS_BASER0 + i * 8);
> +
> + its->suspend_ctx.baser[i] = 0;
> +
> + if ( !(baser & GITS_VALID_BIT) )
> + continue;
> +
> + its->suspend_ctx.baser[i] = baser;
> + }
> + }
> +
> + return 0;
> +
> + err:
> + list_for_each_entry_continue_reverse( its, &host_its_list, entry )
> + writel_relaxed(its->suspend_ctx.ctlr, its->its_base + GITS_CTLR);
> +
> + return ret;
> +}
> +
> +static int gicv3_its_resume_single(struct host_its *its, unsigned int cpu)
> +{
> + void __iomem *base = its->its_base;
> + unsigned int i;
> + int ret;
> +
> + /*
> + * Make sure that the ITS is disabled. If it fails to quiesce,
> + * don't restore it since writing to CBASER or BASER<n>
> + * registers is undefined according to the GIC v3 ITS
s/undefined/unpredictable/ ?
> + * Specification.
> + */
> + WARN_ON(readl_relaxed(base + GITS_CTLR) & GITS_CTLR_ENABLE);
> + ret = gicv3_disable_its(its);
> + if ( ret )
> + return ret;
> +
> + writeq_relaxed(its->suspend_ctx.cbaser, base + GITS_CBASER);
> +
> + /*
> + * Writing CBASER resets CREADR to 0, so reset CWRITER to
> + * keep the command queue pointers aligned.
> + */
> + writeq_relaxed(0, base + GITS_CWRITER);
> +
> + /* Restore GITS_BASER from the value cache. */
> + for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
> + {
> + uint64_t baser = its->suspend_ctx.baser[i];
> +
> + if ( !(baser & GITS_VALID_BIT) )
> + continue;
> +
> + writeq_relaxed(baser, base + GITS_BASER0 + i * 8);
> + }
> +
> + writel_relaxed(its->suspend_ctx.ctlr, base + GITS_CTLR);
> +
> + return gicv3_its_setup_collection_single(its, cpu);
This will always issue a MAPC with V=1. Section 5.3.9 says it is "unpredictable
if there are interrupts that are mapped to the specified collection and the
collection is currently mapped to a Redistributor, unless MAPC is followed by MOVALL".
In our case the Redistributor is the same, but the spec does not say anything about
that case explicitly, so remapping an already-live collection is still generally
unpredictable.
I see Linux replays the MAPC with V=1 only if the collection is stored in the ITS
(not memory backed). Our col_id is `cpu`, which I believe is always zero on the
suspend path (?), so by looking at HCC we could check whether we need to issue the
MAPC or not:
if ( cpu < GITS_TYPER_HCC(readq_relaxed(base + GITS_TYPER)) )
    return gicv3_its_setup_collection_single(its, cpu);

return 0;
> +}
> +
> +void gicv3_its_resume(void)
> +{
> + struct host_its *its;
> + unsigned int cpu = smp_processor_id();
> + int ret;
> +
> + list_for_each_entry( its, &host_its_list, entry )
> + {
> + ret = gicv3_its_resume_single(its, cpu);
> + if ( ret )
> + panic("GICv3: ITS@%"PRIpaddr": failed to restore during resume: %d\n",
> + its->addr, ret);
> + }
> +}
> +
> +#endif /* CONFIG_SYSTEM_SUSPEND */
>
Cheers,
Luca
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v9 06/13] xen/arm: tee: keep init_tee_secondary() for hotplug and resume
2026-05-12 17:07 [PATCH v9 00/13] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
` (4 preceding siblings ...)
2026-05-12 17:07 ` [PATCH v9 05/13] xen/arm: gic-v3: add ITS suspend/resume support Mykola Kvach
@ 2026-05-12 17:07 ` Mykola Kvach
2026-05-12 17:07 ` [PATCH v9 07/13] xen/arm: ffa: fix notification SRI across CPU hotplug/suspend Mykola Kvach
` (6 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Mykola Kvach @ 2026-05-12 17:07 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Volodymyr Babchuk, Bertrand Marquis, Jens Wiklander,
Stefano Stabellini, Julien Grall, Michal Orzel, Luca Fancellu
From: Mykola Kvach <mykola_kvach@epam.com>
init_tee_secondary() was marked __init and freed after boot. Calling it
from the CPU hotplug/resume path then executed discarded code, which
could crash Xen. Drop __init so the TEE mediator secondary init can run
safely on hotplugged and resumed CPUs.
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
---
xen/arch/arm/tee/tee.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/xen/arch/arm/tee/tee.c b/xen/arch/arm/tee/tee.c
index 8501443c8e..00e561fc78 100644
--- a/xen/arch/arm/tee/tee.c
+++ b/xen/arch/arm/tee/tee.c
@@ -128,7 +128,7 @@ static int __init tee_init(void)
presmp_initcall(tee_init);
-void __init init_tee_secondary(void)
+void init_tee_secondary(void)
{
if ( cur_mediator && cur_mediator->ops->init_secondary )
cur_mediator->ops->init_secondary();
--
2.43.0
^ permalink raw reply related [flat|nested] 29+ messages in thread

* [PATCH v9 07/13] xen/arm: ffa: fix notification SRI across CPU hotplug/suspend
2026-05-12 17:07 [PATCH v9 00/13] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
` (5 preceding siblings ...)
2026-05-12 17:07 ` [PATCH v9 06/13] xen/arm: tee: keep init_tee_secondary() for hotplug and resume Mykola Kvach
@ 2026-05-12 17:07 ` Mykola Kvach
2026-05-12 17:07 ` [PATCH v9 08/13] iommu/ipmmu-vmsa: Implement suspend/resume callbacks Mykola Kvach
` (5 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Mykola Kvach @ 2026-05-12 17:07 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Volodymyr Babchuk, Bertrand Marquis, Jens Wiklander,
Stefano Stabellini, Julien Grall, Michal Orzel, Luca Fancellu
From: Mykola Kvach <mykola_kvach@epam.com>
The FF-A notification SRI interrupt handler was not correctly tied to
CPU hotplug and suspend/resume. As a result, CPUs going offline and
back online could end up with stale or missing handlers, breaking
delivery of FF-A notifications.
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
---
xen/arch/arm/tee/ffa_notif.c | 63 ++++++++++++++++++++++++++++--------
1 file changed, 50 insertions(+), 13 deletions(-)
diff --git a/xen/arch/arm/tee/ffa_notif.c b/xen/arch/arm/tee/ffa_notif.c
index 186e726412..513c399594 100644
--- a/xen/arch/arm/tee/ffa_notif.c
+++ b/xen/arch/arm/tee/ffa_notif.c
@@ -360,10 +360,28 @@ static int32_t ffa_notification_bitmap_destroy(uint16_t vm_id)
return ffa_simple_call(FFA_NOTIFICATION_BITMAP_DESTROY, vm_id, 0, 0, 0);
}
-void ffa_notif_init_interrupt(void)
+static DEFINE_PER_CPU_READ_MOSTLY(struct irqaction, sri_irq);
+
+static int request_sri_irq(void)
{
int ret;
+ struct irqaction *sri_action = &this_cpu(sri_irq);
+
+ sri_action->name = "FF-A notif";
+ sri_action->handler = notif_irq_handler;
+ sri_action->dev_id = NULL;
+ sri_action->free_on_release = 0;
+
+ ret = setup_irq(notif_sri_irq, 0, sri_action);
+ if ( ret )
+ printk(XENLOG_ERR "ffa: setup_irq irq %u failed: error %d\n",
+ notif_sri_irq, ret);
+ return ret;
+}
+
+void ffa_notif_init_interrupt(void)
+{
if ( fw_notif_enabled && notif_sri_irq < NR_GIC_SGI )
{
/*
@@ -376,14 +394,36 @@ void ffa_notif_init_interrupt(void)
* pending, while the SPMC in the secure world will not notice that
* the interrupt was lost.
*/
- ret = request_irq(notif_sri_irq, 0, notif_irq_handler, "FF-A notif",
- NULL);
- if ( ret )
- printk(XENLOG_ERR "ffa: request_irq irq %u failed: error %d\n",
- notif_sri_irq, ret);
+ request_sri_irq();
}
}
+static void deinit_ffa_notif_interrupt(void)
+{
+ if ( fw_notif_enabled && notif_sri_irq < NR_GIC_SGI )
+ release_irq(notif_sri_irq, NULL);
+}
+
+static int cpu_ffa_notif_callback(struct notifier_block *nfb,
+ unsigned long action,
+ void *hcpu)
+{
+ switch ( action )
+ {
+ case CPU_DYING:
+ deinit_ffa_notif_interrupt();
+ break;
+ default:
+ break;
+ }
+
+ return NOTIFY_DONE;
+}
+
+static struct notifier_block cpu_ffa_notif_nfb = {
+ .notifier_call = cpu_ffa_notif_callback,
+};
+
void ffa_notif_init(void)
{
const struct arm_smccc_1_2_regs arg = {
@@ -392,7 +432,6 @@ void ffa_notif_init(void)
};
struct arm_smccc_1_2_regs resp;
unsigned int irq;
- int ret;
/* Only enable fw notification if all ABIs we need are supported */
if ( ffa_fw_supports_fid(FFA_NOTIFICATION_BITMAP_CREATE) &&
@@ -408,13 +447,11 @@ void ffa_notif_init(void)
notif_sri_irq = irq;
if ( irq >= NR_GIC_SGI )
irq_set_type(irq, IRQ_TYPE_EDGE_RISING);
- ret = request_irq(irq, 0, notif_irq_handler, "FF-A notif", NULL);
- if ( ret )
- {
- printk(XENLOG_ERR "ffa: request_irq irq %u failed: error %d\n",
- irq, ret);
+
+ if ( request_sri_irq() )
return;
- }
+
+ register_cpu_notifier(&cpu_ffa_notif_nfb);
fw_notif_enabled = true;
}
}
--
2.43.0
^ permalink raw reply related [flat|nested] 29+ messages in thread

* [PATCH v9 08/13] iommu/ipmmu-vmsa: Implement suspend/resume callbacks
2026-05-12 17:07 [PATCH v9 00/13] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
` (6 preceding siblings ...)
2026-05-12 17:07 ` [PATCH v9 07/13] xen/arm: ffa: fix notification SRI across CPU hotplug/suspend Mykola Kvach
@ 2026-05-12 17:07 ` Mykola Kvach
2026-05-14 15:57 ` Luca Fancellu
2026-05-12 17:07 ` [PATCH v9 09/13] xen/arm: smmu-v3: add suspend/resume handlers Mykola Kvach
` (4 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Mykola Kvach @ 2026-05-12 17:07 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk
From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Store and restore active context and micro-TLB registers.
Tested on R-Car H3 Starter Kit.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in V9:
- call dt_device_set_protected() only after ipmmu_alloc_ctx_suspend()
succeeds, so DT devices are not left marked as protected on allocation failure.
Changes in V7:
- moved the suspend context allocation before the PCI-specific handling
---
xen/drivers/passthrough/arm/ipmmu-vmsa.c | 319 +++++++++++++++++++++--
1 file changed, 304 insertions(+), 15 deletions(-)
diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
index fa9ab9cb13..e1b47a5824 100644
--- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
+++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
@@ -71,6 +71,8 @@
})
#endif
+#define dev_dbg(dev, fmt, ...) \
+ dev_print(dev, XENLOG_DEBUG, fmt, ## __VA_ARGS__)
#define dev_info(dev, fmt, ...) \
dev_print(dev, XENLOG_INFO, fmt, ## __VA_ARGS__)
#define dev_warn(dev, fmt, ...) \
@@ -130,6 +132,24 @@ struct ipmmu_features {
unsigned int imuctr_ttsel_mask;
};
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+struct ipmmu_reg_ctx {
+ unsigned int imttlbr0;
+ unsigned int imttubr0;
+ unsigned int imttbcr;
+ unsigned int imctr;
+};
+
+struct ipmmu_vmsa_backup {
+ struct device *dev;
+ unsigned int *utlbs_val;
+ unsigned int *asids_val;
+ struct list_head list;
+};
+
+#endif
+
/* Root/Cache IPMMU device's information */
struct ipmmu_vmsa_device {
struct device *dev;
@@ -142,6 +162,9 @@ struct ipmmu_vmsa_device {
struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
unsigned int utlb_refcount[IPMMU_UTLB_MAX];
const struct ipmmu_features *features;
+#ifdef CONFIG_SYSTEM_SUSPEND
+ struct ipmmu_reg_ctx *reg_backup[IPMMU_CTX_MAX];
+#endif
};
/*
@@ -547,6 +570,245 @@ static void ipmmu_domain_free_context(struct ipmmu_vmsa_device *mmu,
spin_unlock_irqrestore(&mmu->lock, flags);
}
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+static DEFINE_SPINLOCK(ipmmu_devices_backup_lock);
+static LIST_HEAD(ipmmu_devices_backup);
+
+static struct ipmmu_reg_ctx root_pgtable[IPMMU_CTX_MAX];
+
+static uint32_t ipmmu_imuasid_read(struct ipmmu_vmsa_device *mmu,
+ unsigned int utlb)
+{
+ return ipmmu_read(mmu, ipmmu_utlb_reg(mmu, IMUASID(utlb)));
+}
+
+static void ipmmu_utlbs_backup(struct ipmmu_vmsa_device *mmu)
+{
+ struct ipmmu_vmsa_backup *backup_data;
+
+ dev_dbg(mmu->dev, "Handle micro-TLBs backup\n");
+
+ spin_lock(&ipmmu_devices_backup_lock);
+
+ list_for_each_entry( backup_data, &ipmmu_devices_backup, list )
+ {
+ struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(backup_data->dev);
+ unsigned int i;
+
+ if ( to_ipmmu(backup_data->dev) != mmu )
+ continue;
+
+ for ( i = 0; i < fwspec->num_ids; i++ )
+ {
+ unsigned int utlb = fwspec->ids[i];
+
+ backup_data->asids_val[i] = ipmmu_imuasid_read(mmu, utlb);
+ backup_data->utlbs_val[i] = ipmmu_imuctr_read(mmu, utlb);
+ }
+ }
+
+ spin_unlock(&ipmmu_devices_backup_lock);
+}
+
+static void ipmmu_utlbs_restore(struct ipmmu_vmsa_device *mmu)
+{
+ struct ipmmu_vmsa_backup *backup_data;
+
+ dev_dbg(mmu->dev, "Handle micro-TLBs restore\n");
+
+ spin_lock(&ipmmu_devices_backup_lock);
+
+ list_for_each_entry( backup_data, &ipmmu_devices_backup, list )
+ {
+ struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(backup_data->dev);
+ unsigned int i;
+
+ if ( to_ipmmu(backup_data->dev) != mmu )
+ continue;
+
+ for ( i = 0; i < fwspec->num_ids; i++ )
+ {
+ unsigned int utlb = fwspec->ids[i];
+
+ ipmmu_imuasid_write(mmu, utlb, backup_data->asids_val[i]);
+ ipmmu_imuctr_write(mmu, utlb, backup_data->utlbs_val[i]);
+ }
+ }
+
+ spin_unlock(&ipmmu_devices_backup_lock);
+}
+
+static void ipmmu_domain_backup_context(struct ipmmu_vmsa_domain *domain)
+{
+ struct ipmmu_vmsa_device *mmu = domain->mmu->root;
+ struct ipmmu_reg_ctx *regs = mmu->reg_backup[domain->context_id];
+
+ dev_dbg(mmu->dev, "Handle domain context %u backup\n", domain->context_id);
+
+ regs->imttlbr0 = ipmmu_ctx_read_root(domain, IMTTLBR0);
+ regs->imttubr0 = ipmmu_ctx_read_root(domain, IMTTUBR0);
+ regs->imttbcr = ipmmu_ctx_read_root(domain, IMTTBCR);
+ regs->imctr = ipmmu_ctx_read_root(domain, IMCTR);
+}
+
+static void ipmmu_domain_restore_context(struct ipmmu_vmsa_domain *domain)
+{
+ struct ipmmu_vmsa_device *mmu = domain->mmu->root;
+ struct ipmmu_reg_ctx *regs = mmu->reg_backup[domain->context_id];
+
+ dev_dbg(mmu->dev, "Handle domain context %u restore\n", domain->context_id);
+
+ ipmmu_ctx_write_root(domain, IMTTLBR0, regs->imttlbr0);
+ ipmmu_ctx_write_root(domain, IMTTUBR0, regs->imttubr0);
+ ipmmu_ctx_write_root(domain, IMTTBCR, regs->imttbcr);
+ ipmmu_ctx_write_all(domain, IMCTR, regs->imctr | IMCTR_FLUSH);
+}
+
+/*
+ * Xen: Unlike Linux implementation, Xen uses a single driver instance
+ * for handling all IPMMUs. There is no framework for ipmmu_suspend/resume
+ * callbacks to be invoked for each IPMMU device. So, we need to iterate
+ * through all registered IPMMUs performing required actions.
+ *
+ * Also take care of restoring special settings, such as translation
+ * table format, etc.
+ */
+static int __must_check ipmmu_suspend(void)
+{
+ struct ipmmu_vmsa_device *mmu;
+
+ if ( !iommu_enabled )
+ return 0;
+
+ printk(XENLOG_DEBUG "ipmmu: Suspending...\n");
+
+ spin_lock(&ipmmu_devices_lock);
+
+ list_for_each_entry( mmu, &ipmmu_devices, list )
+ {
+ if ( ipmmu_is_root(mmu) )
+ {
+ unsigned int i;
+
+ for ( i = 0; i < mmu->num_ctx; i++ )
+ {
+ if ( !mmu->domains[i] )
+ continue;
+ ipmmu_domain_backup_context(mmu->domains[i]);
+ }
+ }
+ else
+ ipmmu_utlbs_backup(mmu);
+ }
+
+ spin_unlock(&ipmmu_devices_lock);
+
+ return 0;
+}
+
+static void ipmmu_resume(void)
+{
+ struct ipmmu_vmsa_device *mmu;
+
+ if ( !iommu_enabled )
+ return;
+
+ printk(XENLOG_DEBUG "ipmmu: Resuming...\n");
+
+ spin_lock(&ipmmu_devices_lock);
+
+ list_for_each_entry( mmu, &ipmmu_devices, list )
+ {
+ uint32_t reg;
+
+ /* Do not use security group function */
+ reg = IMSCTLR + mmu->features->control_offset_base;
+ ipmmu_write(mmu, reg, ipmmu_read(mmu, reg) & ~IMSCTLR_USE_SECGRP);
+
+ if ( ipmmu_is_root(mmu) )
+ {
+ unsigned int i;
+
+ /* Use stage 2 translation table format */
+ reg = IMSAUXCTLR + mmu->features->control_offset_base;
+ ipmmu_write(mmu, reg, ipmmu_read(mmu, reg) | IMSAUXCTLR_S2PTE);
+
+ for ( i = 0; i < mmu->num_ctx; i++ )
+ {
+ if ( !mmu->domains[i] )
+ continue;
+ ipmmu_domain_restore_context(mmu->domains[i]);
+ }
+ }
+ else
+ ipmmu_utlbs_restore(mmu);
+ }
+
+ spin_unlock(&ipmmu_devices_lock);
+}
+
+static int ipmmu_alloc_ctx_suspend(struct device *dev)
+{
+ struct ipmmu_vmsa_backup *backup_data;
+ unsigned int *utlbs_val, *asids_val;
+ struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+
+ utlbs_val = xzalloc_array(unsigned int, fwspec->num_ids);
+ if ( !utlbs_val )
+ return -ENOMEM;
+
+ asids_val = xzalloc_array(unsigned int, fwspec->num_ids);
+ if ( !asids_val )
+ {
+ xfree(utlbs_val);
+ return -ENOMEM;
+ }
+
+ backup_data = xzalloc(struct ipmmu_vmsa_backup);
+ if ( !backup_data )
+ {
+ xfree(utlbs_val);
+ xfree(asids_val);
+ return -ENOMEM;
+ }
+
+ backup_data->dev = dev;
+ backup_data->utlbs_val = utlbs_val;
+ backup_data->asids_val = asids_val;
+
+ spin_lock(&ipmmu_devices_backup_lock);
+ list_add(&backup_data->list, &ipmmu_devices_backup);
+ spin_unlock(&ipmmu_devices_backup_lock);
+
+ return 0;
+}
+
+#ifdef CONFIG_HAS_PCI
+static void ipmmu_free_ctx_suspend(struct device *dev)
+{
+ struct ipmmu_vmsa_backup *backup_data, *tmp;
+
+ spin_lock(&ipmmu_devices_backup_lock);
+
+ list_for_each_entry_safe( backup_data, tmp, &ipmmu_devices_backup, list )
+ {
+ if ( backup_data->dev == dev )
+ {
+ list_del(&backup_data->list);
+ xfree(backup_data->utlbs_val);
+ xfree(backup_data->asids_val);
+ xfree(backup_data);
+ break;
+ }
+ }
+
+ spin_unlock(&ipmmu_devices_backup_lock);
+}
+#endif /* CONFIG_HAS_PCI */
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
{
uint64_t ttbr;
@@ -559,6 +821,9 @@ static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
return ret;
domain->context_id = ret;
+#ifdef CONFIG_SYSTEM_SUSPEND
+ domain->mmu->root->reg_backup[ret] = &root_pgtable[ret];
+#endif
/*
* TTBR0
@@ -615,6 +880,9 @@ static void ipmmu_domain_destroy_context(struct ipmmu_vmsa_domain *domain)
ipmmu_ctx_write_root(domain, IMCTR, IMCTR_FLUSH);
ipmmu_tlb_sync(domain);
+#ifdef CONFIG_SYSTEM_SUSPEND
+ domain->mmu->root->reg_backup[domain->context_id] = NULL;
+#endif
ipmmu_domain_free_context(domain->mmu->root, domain->context_id);
}
@@ -1338,10 +1606,11 @@ static int ipmmu_add_device(u8 devfn, struct device *dev)
struct iommu_fwspec *fwspec;
#ifdef CONFIG_HAS_PCI
+ int ret;
+
if ( dev_is_pci(dev) )
{
struct pci_dev *pdev = dev_to_pci(dev);
- int ret;
if ( devfn != pdev->devfn )
return 0;
@@ -1358,17 +1627,24 @@ static int ipmmu_add_device(u8 devfn, struct device *dev)
if ( !to_ipmmu(dev) )
return -ENODEV;
- if ( !dev_is_pci(dev) )
+ if ( !dev_is_pci(dev) && dt_device_is_protected(dev_to_dt(dev)) )
{
- if ( dt_device_is_protected(dev_to_dt(dev)) )
- {
- dev_err(dev, "Already added to IPMMU\n");
- return -EEXIST;
- }
+ dev_err(dev, "Already added to IPMMU\n");
+ return -EEXIST;
+ }
- /* Let Xen know that the master device is protected by an IOMMU. */
- dt_device_set_protected(dev_to_dt(dev));
+#ifdef CONFIG_SYSTEM_SUSPEND
+ if ( ipmmu_alloc_ctx_suspend(dev) )
+ {
+ dev_err(dev, "Failed to allocate context for suspend\n");
+ return -ENOMEM;
}
+#endif
+
+ /* Let Xen know that the master device is protected by an IOMMU. */
+ if ( !dev_is_pci(dev) )
+ dt_device_set_protected(dev_to_dt(dev));
+
#ifdef CONFIG_HAS_PCI
if ( dev_is_pci(dev) )
{
@@ -1377,26 +1653,28 @@ static int ipmmu_add_device(u8 devfn, struct device *dev)
struct pci_host_bridge *bridge;
struct iommu_fwspec *fwspec_bridge;
unsigned int utlb_osid0 = 0;
- int ret;
bridge = pci_find_host_bridge(pdev->seg, pdev->bus);
if ( !bridge )
{
dev_err(dev, "Failed to find host bridge\n");
- return -ENODEV;
+ ret = -ENODEV;
+ goto free_suspend_ctx;
}
fwspec_bridge = dev_iommu_fwspec_get(dt_to_dev(bridge->dt_node));
if ( fwspec_bridge->num_ids < 1 )
{
dev_err(dev, "Failed to find host bridge uTLB\n");
- return -ENXIO;
+ ret = -ENXIO;
+ goto free_suspend_ctx;
}
if ( fwspec->num_ids < 1 )
{
dev_err(dev, "Failed to find uTLB");
- return -ENXIO;
+ ret = -ENXIO;
+ goto free_suspend_ctx;
}
rcar4_pcie_osid_regs_init(bridge);
@@ -1405,7 +1683,7 @@ static int ipmmu_add_device(u8 devfn, struct device *dev)
if ( ret < 0 )
{
dev_err(dev, "No unused OSID regs\n");
- return ret;
+ goto free_suspend_ctx;
}
reg_id = ret;
@@ -1420,7 +1698,7 @@ static int ipmmu_add_device(u8 devfn, struct device *dev)
{
rcar4_pcie_osid_bdf_clear(bridge, reg_id);
rcar4_pcie_osid_reg_free(bridge, reg_id);
- return ret;
+ goto free_suspend_ctx;
}
}
#endif
@@ -1429,6 +1707,13 @@ static int ipmmu_add_device(u8 devfn, struct device *dev)
dev_name(fwspec->iommu_dev), fwspec->num_ids);
return 0;
+#ifdef CONFIG_HAS_PCI
+ free_suspend_ctx:
+#ifdef CONFIG_SYSTEM_SUSPEND
+ ipmmu_free_ctx_suspend(dev);
+#endif
+ return ret;
+#endif
}
static int ipmmu_iommu_domain_init(struct domain *d)
@@ -1490,6 +1775,10 @@ static const struct iommu_ops ipmmu_iommu_ops =
.unmap_page = arm_iommu_unmap_page,
.dt_xlate = ipmmu_dt_xlate,
.add_device = ipmmu_add_device,
+#ifdef CONFIG_SYSTEM_SUSPEND
+ .suspend = ipmmu_suspend,
+ .resume = ipmmu_resume,
+#endif
};
static __init int ipmmu_init(struct dt_device_node *node, const void *data)
--
2.43.0
* Re: [PATCH v9 08/13] iommu/ipmmu-vmsa: Implement suspend/resume callbacks
2026-05-12 17:07 ` [PATCH v9 08/13] iommu/ipmmu-vmsa: Implement suspend/resume callbacks Mykola Kvach
@ 2026-05-14 15:57 ` Luca Fancellu
0 siblings, 0 replies; 29+ messages in thread
From: Luca Fancellu @ 2026-05-14 15:57 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
Hi Mykola,
>
> diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> index fa9ab9cb13..e1b47a5824 100644
> --- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> +++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> @@ -71,6 +71,8 @@
> })
> #endif
>
> +#define dev_dbg(dev, fmt, ...) \
> + dev_print(dev, XENLOG_DEBUG, fmt, ## __VA_ARGS__)
> #define dev_info(dev, fmt, ...) \
> dev_print(dev, XENLOG_INFO, fmt, ## __VA_ARGS__)
> #define dev_warn(dev, fmt, ...) \
> @@ -130,6 +132,24 @@ struct ipmmu_features {
> unsigned int imuctr_ttsel_mask;
> };
>
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +
> +struct ipmmu_reg_ctx {
> + unsigned int imttlbr0;
> + unsigned int imttubr0;
> + unsigned int imttbcr;
> + unsigned int imctr;
> +};
> +
> +struct ipmmu_vmsa_backup {
> + struct device *dev;
> + unsigned int *utlbs_val;
> + unsigned int *asids_val;
> + struct list_head list;
> +};
> +
> +#endif
> +
> /* Root/Cache IPMMU device's information */
> struct ipmmu_vmsa_device {
> struct device *dev;
> @@ -142,6 +162,9 @@ struct ipmmu_vmsa_device {
> struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
> unsigned int utlb_refcount[IPMMU_UTLB_MAX];
> const struct ipmmu_features *features;
> +#ifdef CONFIG_SYSTEM_SUSPEND
> + struct ipmmu_reg_ctx *reg_backup[IPMMU_CTX_MAX];
> +#endif
> };
>
> /*
> @@ -547,6 +570,245 @@ static void ipmmu_domain_free_context(struct ipmmu_vmsa_device *mmu,
> spin_unlock_irqrestore(&mmu->lock, flags);
> }
>
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +
> +static DEFINE_SPINLOCK(ipmmu_devices_backup_lock);
> +static LIST_HEAD(ipmmu_devices_backup);
> +
> +static struct ipmmu_reg_ctx root_pgtable[IPMMU_CTX_MAX];
> +
> +static uint32_t ipmmu_imuasid_read(struct ipmmu_vmsa_device *mmu,
> + unsigned int utlb)
> +{
> + return ipmmu_read(mmu, ipmmu_utlb_reg(mmu, IMUASID(utlb)));
> +}
> +
> +static void ipmmu_utlbs_backup(struct ipmmu_vmsa_device *mmu)
> +{
> + struct ipmmu_vmsa_backup *backup_data;
> +
> + dev_dbg(mmu->dev, "Handle micro-TLBs backup\n");
> +
> + spin_lock(&ipmmu_devices_backup_lock);
> +
> + list_for_each_entry( backup_data, &ipmmu_devices_backup, list )
> + {
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(backup_data->dev);
> + unsigned int i;
> +
> + if ( to_ipmmu(backup_data->dev) != mmu )
> + continue;
> +
> + for ( i = 0; i < fwspec->num_ids; i++ )
> + {
> + unsigned int utlb = fwspec->ids[i];
> +
> + backup_data->asids_val[i] = ipmmu_imuasid_read(mmu, utlb);
> + backup_data->utlbs_val[i] = ipmmu_imuctr_read(mmu, utlb);
> + }
> + }
> +
> + spin_unlock(&ipmmu_devices_backup_lock);
> +}
> +
> +static void ipmmu_utlbs_restore(struct ipmmu_vmsa_device *mmu)
> +{
> + struct ipmmu_vmsa_backup *backup_data;
> +
> + dev_dbg(mmu->dev, "Handle micro-TLBs restore\n");
> +
> + spin_lock(&ipmmu_devices_backup_lock);
> +
> + list_for_each_entry( backup_data, &ipmmu_devices_backup, list )
> + {
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(backup_data->dev);
> + unsigned int i;
> +
> + if ( to_ipmmu(backup_data->dev) != mmu )
> + continue;
> +
> + for ( i = 0; i < fwspec->num_ids; i++ )
> + {
> + unsigned int utlb = fwspec->ids[i];
> +
> + ipmmu_imuasid_write(mmu, utlb, backup_data->asids_val[i]);
> + ipmmu_imuctr_write(mmu, utlb, backup_data->utlbs_val[i]);
> + }
> + }
> +
> + spin_unlock(&ipmmu_devices_backup_lock);
> +}
> +
> +static void ipmmu_domain_backup_context(struct ipmmu_vmsa_domain *domain)
> +{
> + struct ipmmu_vmsa_device *mmu = domain->mmu->root;
> + struct ipmmu_reg_ctx *regs = mmu->reg_backup[domain->context_id];
> +
> + dev_dbg(mmu->dev, "Handle domain context %u backup\n", domain->context_id);
> +
> + regs->imttlbr0 = ipmmu_ctx_read_root(domain, IMTTLBR0);
> + regs->imttubr0 = ipmmu_ctx_read_root(domain, IMTTUBR0);
> + regs->imttbcr = ipmmu_ctx_read_root(domain, IMTTBCR);
> + regs->imctr = ipmmu_ctx_read_root(domain, IMCTR);
> +}
> +
> +static void ipmmu_domain_restore_context(struct ipmmu_vmsa_domain *domain)
> +{
> + struct ipmmu_vmsa_device *mmu = domain->mmu->root;
> + struct ipmmu_reg_ctx *regs = mmu->reg_backup[domain->context_id];
NIT: There is a double space before the `=`
> +
> + dev_dbg(mmu->dev, "Handle domain context %u restore\n", domain->context_id);
> +
> + ipmmu_ctx_write_root(domain, IMTTLBR0, regs->imttlbr0);
> + ipmmu_ctx_write_root(domain, IMTTUBR0, regs->imttubr0);
> + ipmmu_ctx_write_root(domain, IMTTBCR, regs->imttbcr);
> + ipmmu_ctx_write_all(domain, IMCTR, regs->imctr | IMCTR_FLUSH);
I see in ipmmu_tlb_invalidate() we do:
dsb(sy);
ipmmu_tlb_sync(domain);
Is it safe to omit them here?
> +}
> +
> +/*
> + * Xen: Unlike Linux implementation, Xen uses a single driver instance
> + * for handling all IPMMUs. There is no framework for ipmmu_suspend/resume
> + * callbacks to be invoked for each IPMMU device. So, we need to iterate
> + * through all registered IPMMUs performing required actions.
> + *
> + * Also take care of restoring special settings, such as translation
> + * table format, etc.
> + */
> +static int __must_check ipmmu_suspend(void)
> +{
> + struct ipmmu_vmsa_device *mmu;
> +
> + if ( !iommu_enabled )
> + return 0;
> +
> + printk(XENLOG_DEBUG "ipmmu: Suspending...\n");
> +
> + spin_lock(&ipmmu_devices_lock);
> +
> + list_for_each_entry( mmu, &ipmmu_devices, list )
> + {
> + if ( ipmmu_is_root(mmu) )
> + {
> + unsigned int i;
> +
> + for ( i = 0; i < mmu->num_ctx; i++ )
> + {
> + if ( !mmu->domains[i] )
> + continue;
> + ipmmu_domain_backup_context(mmu->domains[i]);
> + }
> + }
> + else
> + ipmmu_utlbs_backup(mmu);
> + }
> +
> + spin_unlock(&ipmmu_devices_lock);
> +
> + return 0;
> +}
> +
> +static void ipmmu_resume(void)
> +{
> + struct ipmmu_vmsa_device *mmu;
> +
> + if ( !iommu_enabled )
> + return;
> +
> + printk(XENLOG_DEBUG "ipmmu: Resuming...\n");
> +
> + spin_lock(&ipmmu_devices_lock);
> +
> + list_for_each_entry( mmu, &ipmmu_devices, list )
This loop has an ordering problem: ipmmu_utlbs_restore() can run before the
root IPMMU is restored, since ipmmu_probe() uses `list_add()` and the list
order therefore does not guarantee the root comes first.
Maybe walking the list twice, restoring the root IPMMU in the first pass and
the cache IPMMUs in the second, would work.
Cheers,
Luca
* [PATCH v9 09/13] xen/arm: smmu-v3: add suspend/resume handlers
2026-05-12 17:07 [PATCH v9 00/13] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
` (7 preceding siblings ...)
2026-05-12 17:07 ` [PATCH v9 08/13] iommu/ipmmu-vmsa: Implement suspend/resume callbacks Mykola Kvach
@ 2026-05-12 17:07 ` Mykola Kvach
2026-05-14 16:41 ` Luca Fancellu
2026-05-12 17:07 ` [PATCH v9 10/13] xen/arm64: Save/restore CPU context across SYSTEM_SUSPEND Mykola Kvach
` (3 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Mykola Kvach @ 2026-05-12 17:07 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Bertrand Marquis, Rahul Singh, Stefano Stabellini,
Julien Grall, Michal Orzel, Volodymyr Babchuk,
Pranjal Shrivastava
Add system suspend/resume callbacks for the Arm SMMUv3 driver.
During suspend, configure GBPA to abort incoming transactions, disable the
translation interface while keeping CMDQ enabled, issue CMD_SYNC to ensure
all previously issued commands have completed, then disable the SMMU.
Resume uses arm_smmu_device_reset() to reprogram the SMMU and re-enable
translation and interrupt generation.
The IRQ setup split follows the approach from Pranjal Shrivastava's Linux
arm-smmu-v3 runtime/system sleep series: IRQ handlers are requested once
during probe, while reset/resume only restores SMMU hardware state and
re-enables IRQ_CTRL.
Only the pieces relevant to Xen's currently supported SMMUv3 path are
ported here. Xen documents SMMUv3 MSI and PCI ATS as unsupported and not
compiled/tested, so this patch does not restore SMMU MSI IRQ_CFGn registers
nor reinitialize ATS/PRI endpoints. If those paths become usable,
suspend/resume will need corresponding MSI restore and ATS/PRI
quiesce/reinit steps.
Link: https://lore.kernel.org/r/20260414194702.1229094-1-praan@google.com/
Based-on-patch-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in V9:
- Use CMD_SYNC in suspend instead of polling CMDQ_CONS, so the suspend
path waits for command completion rather than only command consumption.
- Document that arm_smmu_setup_irqs() is probe-only and that future Xen
SMMUv3 MSI support will need to restore SMMU IRQ_CFGn registers on
resume.
- Restore the reference to Pranjal's Linux runtime/system sleep series and
clarify that MSI/ATS/PRI resume handling is outside the supported Xen
path.
- Prefix the subject with xen/arm for consistency with the rest of the
Arm suspend/resume series.
Changes in V8:
- Honor ARM_SMMU_FEAT_SEV when draining the CMDQ during suspend, matching
the existing runtime CMD_SYNC path.
- Fold the suspend rollback reset path into a helper and rename the error
reporting to describe suspend rollback rather than resume.
- Treat SMMU reset failure during resume as fatal instead of logging and
continuing with a potentially unusable IOMMU.
- cosmetic changes
---
xen/drivers/passthrough/arm/smmu-v3.c | 178 ++++++++++++++++++++------
1 file changed, 142 insertions(+), 36 deletions(-)
diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c
index bf153227db..82c8ead979 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -1814,8 +1814,7 @@ static int arm_smmu_write_reg_sync(struct arm_smmu_device *smmu, u32 val,
}
/* GBPA is "special" */
-static int __init arm_smmu_update_gbpa(struct arm_smmu_device *smmu,
- u32 set, u32 clr)
+static int arm_smmu_update_gbpa(struct arm_smmu_device *smmu, u32 set, u32 clr)
{
int ret;
u32 reg, __iomem *gbpa = smmu->base + ARM_SMMU_GBPA;
@@ -1995,10 +1994,35 @@ err_free_evtq_irq:
return ret;
}
+static int arm_smmu_enable_irqs(struct arm_smmu_device *smmu)
+{
+ int ret;
+ u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
+
+ if ( smmu->features & ARM_SMMU_FEAT_PRI )
+ irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
+
+ /* Enable interrupt generation on the SMMU */
+ ret = arm_smmu_write_reg_sync(smmu, irqen_flags,
+ ARM_SMMU_IRQ_CTRL, ARM_SMMU_IRQ_CTRLACK);
+ if ( ret )
+ {
+ dev_warn(smmu->dev, "failed to enable irqs\n");
+ return ret;
+ }
+
+ return 0;
+}
+
+/*
+ * Probe-time only: request host IRQs and, when available, program the SMMU's
+ * MSI doorbells. Resume does not restore the SMMU *_IRQ_CFGn MSI registers,
+ * so any host suspend support must treat the active MSI IRQ path as
+ * unsupported until that restore path exists.
+ */
static int __init arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
{
int ret, irq;
- u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
/* Disable IRQs first */
ret = arm_smmu_write_reg_sync(smmu, 0, ARM_SMMU_IRQ_CTRL,
@@ -2028,22 +2052,7 @@ static int __init arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
}
}
- if (smmu->features & ARM_SMMU_FEAT_PRI)
- irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
-
- /* Enable interrupt generation on the SMMU */
- ret = arm_smmu_write_reg_sync(smmu, irqen_flags,
- ARM_SMMU_IRQ_CTRL, ARM_SMMU_IRQ_CTRLACK);
- if (ret) {
- dev_warn(smmu->dev, "failed to enable irqs\n");
- goto err_free_irqs;
- }
-
return 0;
-
-err_free_irqs:
- arm_smmu_free_irqs(smmu);
- return ret;
}
static int arm_smmu_device_disable(struct arm_smmu_device *smmu)
@@ -2057,7 +2066,7 @@ static int arm_smmu_device_disable(struct arm_smmu_device *smmu)
return ret;
}
-static int __init arm_smmu_device_reset(struct arm_smmu_device *smmu)
+static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
{
int ret;
u32 reg, enables;
@@ -2163,17 +2172,9 @@ static int __init arm_smmu_device_reset(struct arm_smmu_device *smmu)
}
}
- ret = arm_smmu_setup_irqs(smmu);
- if (ret) {
- dev_err(smmu->dev, "failed to setup irqs\n");
+ ret = arm_smmu_enable_irqs(smmu);
+ if ( ret )
return ret;
- }
-
- /* Initialize tasklets for threaded IRQs*/
- tasklet_init(&smmu->evtq_irq_tasklet, arm_smmu_evtq_tasklet, smmu);
- tasklet_init(&smmu->priq_irq_tasklet, arm_smmu_priq_tasklet, smmu);
- tasklet_init(&smmu->combined_irq_tasklet, arm_smmu_combined_irq_tasklet,
- smmu);
/* Enable the SMMU interface, or ensure bypass */
if (disable_bypass) {
@@ -2181,20 +2182,16 @@ static int __init arm_smmu_device_reset(struct arm_smmu_device *smmu)
} else {
ret = arm_smmu_update_gbpa(smmu, 0, GBPA_ABORT);
if (ret)
- goto err_free_irqs;
+ return ret;
}
ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
ARM_SMMU_CR0ACK);
if (ret) {
dev_err(smmu->dev, "failed to enable SMMU interface\n");
- goto err_free_irqs;
+ return ret;
}
return 0;
-
-err_free_irqs:
- arm_smmu_free_irqs(smmu);
- return ret;
}
static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
@@ -2558,10 +2555,23 @@ static int __init arm_smmu_device_probe(struct platform_device *pdev)
if (ret)
goto out_free;
+ ret = arm_smmu_setup_irqs(smmu);
+ if ( ret )
+ {
+ dev_err(smmu->dev, "failed to setup irqs\n");
+ goto out_free;
+ }
+
+ /* Initialize tasklets for threaded IRQs*/
+ tasklet_init(&smmu->evtq_irq_tasklet, arm_smmu_evtq_tasklet, smmu);
+ tasklet_init(&smmu->priq_irq_tasklet, arm_smmu_priq_tasklet, smmu);
+ tasklet_init(&smmu->combined_irq_tasklet, arm_smmu_combined_irq_tasklet,
+ smmu);
+
/* Reset the device */
ret = arm_smmu_device_reset(smmu);
if (ret)
- goto out_free;
+ goto out_free_irqs;
/*
* Keep a list of all probed devices. This will be used to query
@@ -2575,6 +2585,8 @@ static int __init arm_smmu_device_probe(struct platform_device *pdev)
return 0;
+out_free_irqs:
+ arm_smmu_free_irqs(smmu);
out_free:
arm_smmu_free_structures(smmu);
@@ -2855,6 +2867,96 @@ static void arm_smmu_iommu_xen_domain_teardown(struct domain *d)
xfree(xen_domain);
}
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+static void arm_smmu_reset_for_suspend_rollback(struct arm_smmu_device *smmu)
+{
+ int ret = arm_smmu_device_reset(smmu);
+
+ if ( ret )
+ dev_err(smmu->dev, "Failed to reset during suspend rollback: %d\n",
+ ret);
+}
+
+static int arm_smmu_suspend(void)
+{
+ struct arm_smmu_device *smmu;
+ int ret = 0;
+
+ list_for_each_entry(smmu, &arm_smmu_devices, devices)
+ {
+ /* Abort all transactions before disable to avoid spurious bypass */
+ ret = arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
+ if ( ret )
+ goto fail;
+
+ /* Disable the SMMU via CR0.EN and all queues except CMDQ */
+ ret = arm_smmu_write_reg_sync(smmu, CR0_CMDQEN, ARM_SMMU_CR0,
+ ARM_SMMU_CR0ACK);
+ if ( ret )
+ {
+ dev_err(smmu->dev, "Timed-out while disabling smmu\n");
+ goto fail;
+ }
+
+ /*
+ * At this point the translation interface is disabled and the
+ * SMMU won't access translation/config structures, even
+ * speculatively, as per the IHI0070 spec (section 6.3.9.6).
+ * CMDQ is still enabled so that a CMD_SYNC can complete any
+ * previously issued commands.
+ */
+
+ /* Ensure all previously issued commands have completed. */
+ ret = arm_smmu_cmdq_issue_sync(smmu);
+ if ( ret )
+ {
+ dev_err(smmu->dev, "Timed-out waiting for pending commands\n");
+ goto fail;
+ }
+
+ /* Disable everything */
+ ret = arm_smmu_device_disable(smmu);
+ if ( ret )
+ goto fail;
+
+ dev_dbg(smmu->dev, "Suspended smmu\n");
+ }
+
+ return 0;
+
+ fail:
+ /* Reset the device that failed as well as any already-suspended ones. */
+ arm_smmu_reset_for_suspend_rollback(smmu);
+
+ list_for_each_entry_continue_reverse(smmu, &arm_smmu_devices, devices)
+ arm_smmu_reset_for_suspend_rollback(smmu);
+
+ return ret;
+}
+
+static void arm_smmu_resume(void)
+{
+ int ret;
+ struct arm_smmu_device *smmu;
+
+ list_for_each_entry(smmu, &arm_smmu_devices, devices)
+ {
+ dev_dbg(smmu->dev, "Resuming device\n");
+
+ /*
+ * The reset will re-initialize all the base addresses, queues,
+ * prod and cons maintained within struct arm_smmu_device as well as
+ * re-enable the interrupts.
+ */
+ ret = arm_smmu_device_reset(smmu);
+ if ( ret )
+ panic("SMMUv3: %s: Failed to reset during resume: %d\n",
+ dev_name(smmu->dev), ret);
+ }
+}
+#endif
+
static const struct iommu_ops arm_smmu_iommu_ops = {
.page_sizes = PAGE_SIZE_4K,
.init = arm_smmu_iommu_xen_domain_init,
@@ -2867,6 +2969,10 @@ static const struct iommu_ops arm_smmu_iommu_ops = {
.unmap_page = arm_iommu_unmap_page,
.dt_xlate = arm_smmu_dt_xlate,
.add_device = arm_smmu_add_device,
+#ifdef CONFIG_SYSTEM_SUSPEND
+ .suspend = arm_smmu_suspend,
+ .resume = arm_smmu_resume,
+#endif
};
static __init int arm_smmu_dt_init(struct dt_device_node *dev,
--
2.43.0
* Re: [PATCH v9 09/13] xen/arm: smmu-v3: add suspend/resume handlers
2026-05-12 17:07 ` [PATCH v9 09/13] xen/arm: smmu-v3: add suspend/resume handlers Mykola Kvach
@ 2026-05-14 16:41 ` Luca Fancellu
0 siblings, 0 replies; 29+ messages in thread
From: Luca Fancellu @ 2026-05-14 16:41 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Bertrand Marquis,
Rahul Singh, Stefano Stabellini, Julien Grall, Michal Orzel,
Volodymyr Babchuk, Pranjal Shrivastava
Hi Mykola,
>
> +static int arm_smmu_enable_irqs(struct arm_smmu_device *smmu)
> +{
> + int ret;
> + u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
> +
> + if ( smmu->features & ARM_SMMU_FEAT_PRI )
> + irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
> +
> + /* Enable interrupt generation on the SMMU */
> + ret = arm_smmu_write_reg_sync(smmu, irqen_flags,
> + ARM_SMMU_IRQ_CTRL, ARM_SMMU_IRQ_CTRLACK);
> + if ( ret )
> + {
> + dev_warn(smmu->dev, "failed to enable irqs\n");
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +/*
> + * Probe-time only: request host IRQs and, when available, program the SMMU's
> + * MSI doorbells. Resume does not restore the SMMU *_IRQ_CFGn MSI registers,
> + * so any host suspend support must treat the active MSI IRQ path as
> + * unsupported until that restore path exists.
> + */
> static int __init arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
> {
> int ret, irq;
> - u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
>
> /* Disable IRQs first */
> ret = arm_smmu_write_reg_sync(smmu, 0, ARM_SMMU_IRQ_CTRL,
> @@ -2028,22 +2052,7 @@ static int __init arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
> }
> }
>
> - if (smmu->features & ARM_SMMU_FEAT_PRI)
> - irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
> -
> - /* Enable interrupt generation on the SMMU */
> - ret = arm_smmu_write_reg_sync(smmu, irqen_flags,
> - ARM_SMMU_IRQ_CTRL, ARM_SMMU_IRQ_CTRLACK);
> - if (ret) {
> - dev_warn(smmu->dev, "failed to enable irqs\n");
> - goto err_free_irqs;
> - }
> -
> return 0;
> -
> -err_free_irqs:
> - arm_smmu_free_irqs(smmu);
> - return ret;
> }
>
> static int arm_smmu_device_disable(struct arm_smmu_device *smmu)
> @@ -2057,7 +2066,7 @@ static int arm_smmu_device_disable(struct arm_smmu_device *smmu)
> return ret;
> }
>
> -static int __init arm_smmu_device_reset(struct arm_smmu_device *smmu)
> +static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
> {
> int ret;
> u32 reg, enables;
> @@ -2163,17 +2172,9 @@ static int __init arm_smmu_device_reset(struct arm_smmu_device *smmu)
> }
> }
>
> - ret = arm_smmu_setup_irqs(smmu);
> - if (ret) {
> - dev_err(smmu->dev, "failed to setup irqs\n");
> + ret = arm_smmu_enable_irqs(smmu);
> + if ( ret )
> return ret;
> - }
> -
> - /* Initialize tasklets for threaded IRQs*/
> - tasklet_init(&smmu->evtq_irq_tasklet, arm_smmu_evtq_tasklet, smmu);
> - tasklet_init(&smmu->priq_irq_tasklet, arm_smmu_priq_tasklet, smmu);
> - tasklet_init(&smmu->combined_irq_tasklet, arm_smmu_combined_irq_tasklet,
> - smmu);
>
> /* Enable the SMMU interface, or ensure bypass */
> if (disable_bypass) {
> @@ -2181,20 +2182,16 @@ static int __init arm_smmu_device_reset(struct arm_smmu_device *smmu)
> } else {
> ret = arm_smmu_update_gbpa(smmu, 0, GBPA_ABORT);
> if (ret)
> - goto err_free_irqs;
> + return ret;
> }
> ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
> ARM_SMMU_CR0ACK);
> if (ret) {
> dev_err(smmu->dev, "failed to enable SMMU interface\n");
> - goto err_free_irqs;
> + return ret;
> }
>
> return 0;
> -
> -err_free_irqs:
> - arm_smmu_free_irqs(smmu);
> - return ret;
> }
>
> static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
> @@ -2558,10 +2555,23 @@ static int __init arm_smmu_device_probe(struct platform_device *pdev)
> if (ret)
> goto out_free;
>
> + ret = arm_smmu_setup_irqs(smmu);
> + if ( ret )
> + {
> + dev_err(smmu->dev, "failed to setup irqs\n");
> + goto out_free;
> + }
> +
> + /* Initialize tasklets for threaded IRQs */
> + tasklet_init(&smmu->evtq_irq_tasklet, arm_smmu_evtq_tasklet, smmu);
> + tasklet_init(&smmu->priq_irq_tasklet, arm_smmu_priq_tasklet, smmu);
> + tasklet_init(&smmu->combined_irq_tasklet, arm_smmu_combined_irq_tasklet,
> + smmu);
> +
> /* Reset the device */
> ret = arm_smmu_device_reset(smmu);
> if (ret)
> - goto out_free;
> + goto out_free_irqs;
>
> /*
> * Keep a list of all probed devices. This will be used to query
> @@ -2575,6 +2585,8 @@ static int __init arm_smmu_device_probe(struct platform_device *pdev)
>
> return 0;
>
> +out_free_irqs:
> + arm_smmu_free_irqs(smmu);
>
> out_free:
> arm_smmu_free_structures(smmu);
> @@ -2855,6 +2867,96 @@ static void arm_smmu_iommu_xen_domain_teardown(struct domain *d)
> xfree(xen_domain);
> }
>
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +
> +static void arm_smmu_reset_for_suspend_rollback(struct arm_smmu_device *smmu)
> +{
> + int ret = arm_smmu_device_reset(smmu);
> +
> + if ( ret )
> + dev_err(smmu->dev, "Failed to reset during suspend rollback: %d\n",
> + ret);
> +}
> +
> +static int arm_smmu_suspend(void)
> +{
> + struct arm_smmu_device *smmu;
> + int ret = 0;
> +
> + list_for_each_entry(smmu, &arm_smmu_devices, devices)
> + {
> + /* Abort all transactions before disable to avoid spurious bypass */
> + ret = arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> + if ( ret )
> + goto fail;
Should we also disable the IRQs here, given that the restore path calls arm_smmu_enable_irqs()? Something like:
ret = arm_smmu_write_reg_sync(smmu, 0,
ARM_SMMU_IRQ_CTRL,
ARM_SMMU_IRQ_CTRLACK);
if ( ret )
goto fail;
Cheers,
Luca
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v9 10/13] xen/arm64: Save/restore CPU context across SYSTEM_SUSPEND
2026-05-12 17:07 [PATCH v9 00/13] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
` (8 preceding siblings ...)
2026-05-12 17:07 ` [PATCH v9 09/13] xen/arm: smmu-v3: add suspend/resume handlers Mykola Kvach
@ 2026-05-12 17:07 ` Mykola Kvach
2026-05-14 17:20 ` Luca Fancellu
2026-05-12 17:07 ` [PATCH v9 11/13] xen/arm: Implement PSCI SYSTEM_SUSPEND call (host interface) Mykola Kvach
` (2 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Mykola Kvach @ 2026-05-12 17:07 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk
From: Mirela Simonovic <mirela.simonovic@aggios.com>
On wakeup from PSCI SYSTEM_SUSPEND, Xen re-enters EL2 with the MMU and
data cache disabled. The resume path must first switch back to Xen's
runtime page tables before it can access the saved CPU context using
virtual addresses.
Add an arm64 hyp_resume trampoline that reuses enable_secondary_cpu_mm()
to enable the data cache and MMU, switch to init_ttbr, and resume in the
runtime virtual mapping. The trampoline then restores the saved CPU
general-purpose and system-control register context.
prepare_resume_ctx() must be invoked just before the PSCI system suspend
call is issued to the platform firmware. It saves the current CPU context
and returns a non-zero value so that the caller proceeds to issue the
firmware SYSTEM_SUSPEND call.
On resume, hyp_resume restores the saved context, including the saved link
register. Control therefore returns to the place where prepare_resume_ctx()
was called. To avoid re-entering the suspend path, the restored path sees
prepare_resume_ctx() return zero.
The assembly save/restore code uses offsets generated by asm-offsets.c
from struct resume_cpu_context, keeping the assembly memory accesses in
sync with the C structure layout.
Support for ARM32 is not implemented. Instead, compilation fails with a
build-time error if suspend is enabled for ARM32.
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in v9:
- Drop the misleading prepare_resume_ctx() pointer argument and make both
save/restore paths use the global resume_cpu_context.
- Squash the arm64 resume trampoline into the context save/restore patch.
- Document in code that hyp_resume relies on PSCI initial-state rules.
- Use generic platform firmware wording instead of ATF-specific wording.
- Rename the saved context type/storage to resume_cpu_context and rely on
implicit zero-initialization for the file-scope object.
- Use asm-offsets.c-generated RESUME_CTX_* offsets to keep the assembly
save/restore code in sync with struct resume_cpu_context.
Changes in v8:
- Fix alignments in code.
Changes in v7:
- No functional changes, just moved commit.
---
xen/arch/arm/Makefile | 1 +
xen/arch/arm/arm64/asm-offsets.c | 20 +++++
xen/arch/arm/arm64/head.S | 118 +++++++++++++++++++++++++++++
xen/arch/arm/include/asm/suspend.h | 26 +++++++
xen/arch/arm/suspend.c | 14 ++++
5 files changed, 179 insertions(+)
create mode 100644 xen/arch/arm/suspend.c
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 982c6c396a..c97df7f3a0 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -51,6 +51,7 @@ obj-y += setup.o
obj-y += shutdown.o
obj-y += smp.o
obj-y += smpboot.o
+obj-$(CONFIG_SYSTEM_SUSPEND) += suspend.o
obj-$(CONFIG_SYSCTL) += sysctl.o
obj-y += time.o
obj-y += traps.o
diff --git a/xen/arch/arm/arm64/asm-offsets.c b/xen/arch/arm/arm64/asm-offsets.c
index 38a3894a3b..4da5fff584 100644
--- a/xen/arch/arm/arm64/asm-offsets.c
+++ b/xen/arch/arm/arm64/asm-offsets.c
@@ -13,6 +13,7 @@
#include <asm/mm.h>
#include <asm/setup.h>
#include <asm/smccc.h>
+#include <asm/suspend.h>
#define DEFINE(_sym, _val) \
asm volatile ( "\n.ascii\"==>#define " #_sym " %0 /* " #_val " */<==\""\
@@ -57,6 +58,25 @@ void __dummy__(void)
OFFSET(INITINFO_stack, struct init_info, stack);
BLANK();
+#ifdef CONFIG_SYSTEM_SUSPEND
+ OFFSET(RESUME_CTX_X19, struct resume_cpu_context, callee_regs[0]);
+ OFFSET(RESUME_CTX_X21, struct resume_cpu_context, callee_regs[2]);
+ OFFSET(RESUME_CTX_X23, struct resume_cpu_context, callee_regs[4]);
+ OFFSET(RESUME_CTX_X25, struct resume_cpu_context, callee_regs[6]);
+ OFFSET(RESUME_CTX_X27, struct resume_cpu_context, callee_regs[8]);
+ OFFSET(RESUME_CTX_X29, struct resume_cpu_context, callee_regs[10]);
+ OFFSET(RESUME_CTX_SP, struct resume_cpu_context, sp);
+ OFFSET(RESUME_CTX_VBAR_EL2, struct resume_cpu_context, vbar_el2);
+ OFFSET(RESUME_CTX_VTCR_EL2, struct resume_cpu_context, vtcr_el2);
+ OFFSET(RESUME_CTX_VTTBR_EL2, struct resume_cpu_context, vttbr_el2);
+ OFFSET(RESUME_CTX_TPIDR_EL2, struct resume_cpu_context, tpidr_el2);
+ OFFSET(RESUME_CTX_MDCR_EL2, struct resume_cpu_context, mdcr_el2);
+ OFFSET(RESUME_CTX_HSTR_EL2, struct resume_cpu_context, hstr_el2);
+ OFFSET(RESUME_CTX_CPTR_EL2, struct resume_cpu_context, cptr_el2);
+ OFFSET(RESUME_CTX_HCR_EL2, struct resume_cpu_context, hcr_el2);
+ BLANK();
+#endif
+
OFFSET(SMCCC_RES_a0, struct arm_smccc_res, a0);
OFFSET(SMCCC_RES_a2, struct arm_smccc_res, a2);
OFFSET(ARM_SMCCC_1_2_REGS_X0_OFFS, struct arm_smccc_1_2_regs, a0);
diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 72c7b24498..512a3c35b2 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -561,6 +561,124 @@ END(efi_xen_start)
#endif /* CONFIG_ARM_EFI */
+#ifdef CONFIG_SYSTEM_SUSPEND
+/*
+ * int prepare_resume_ctx(void)
+ *
+ * The CPU context saved here is restored on resume by hyp_resume.
+ * prepare_resume_ctx returns a non-zero value; upon restoring the context,
+ * hyp_resume returns zero instead. C code that invokes prepare_resume_ctx
+ * interprets the return value to determine whether the context was just
+ * saved (prepare_resume_ctx) or has been restored (hyp_resume).
+ */
+FUNC(prepare_resume_ctx)
+ ldr x0, =resume_cpu_context
+
+ /* Store callee-saved registers */
+ stp x19, x20, [x0, #RESUME_CTX_X19]
+ stp x21, x22, [x0, #RESUME_CTX_X21]
+ stp x23, x24, [x0, #RESUME_CTX_X23]
+ stp x25, x26, [x0, #RESUME_CTX_X25]
+ stp x27, x28, [x0, #RESUME_CTX_X27]
+ stp x29, lr, [x0, #RESUME_CTX_X29]
+
+ /* Store stack-pointer */
+ mov x2, sp
+ str x2, [x0, #RESUME_CTX_SP]
+
+ /* Store system control registers */
+ mrs x2, VBAR_EL2
+ str x2, [x0, #RESUME_CTX_VBAR_EL2]
+ mrs x2, VTCR_EL2
+ str x2, [x0, #RESUME_CTX_VTCR_EL2]
+ mrs x2, VTTBR_EL2
+ str x2, [x0, #RESUME_CTX_VTTBR_EL2]
+ mrs x2, TPIDR_EL2
+ str x2, [x0, #RESUME_CTX_TPIDR_EL2]
+ mrs x2, MDCR_EL2
+ str x2, [x0, #RESUME_CTX_MDCR_EL2]
+ mrs x2, HSTR_EL2
+ str x2, [x0, #RESUME_CTX_HSTR_EL2]
+ mrs x2, CPTR_EL2
+ str x2, [x0, #RESUME_CTX_CPTR_EL2]
+ mrs x2, HCR_EL2
+ str x2, [x0, #RESUME_CTX_HCR_EL2]
+
+ /* prepare_resume_ctx must return a non-zero value */
+ mov x0, #1
+ ret
+END(prepare_resume_ctx)
+
+FUNC(hyp_resume)
+ /*
+ * PSCI states that SYSTEM_SUSPEND follows the CPU_SUSPEND initial
+ * state rules, so PSCI-compliant firmware must enter the return
+ * exception level with DAIF masked.
+ */
+
+ /* Initialize the UART if earlyprintk has been enabled. */
+#ifdef CONFIG_EARLY_PRINTK
+ bl init_uart
+#endif
+ PRINT_ID("- Xen resuming -\r\n")
+
+ bl check_cpu_mode
+ bl cpu_init
+
+ ldr x0, =start
+ adr x20, start /* x20 := paddr (start) */
+ sub x20, x20, x0 /* x20 := phys-offset */
+ ldr lr, =mmu_resumed
+ b enable_secondary_cpu_mm
+
+mmu_resumed:
+ /* Now we can access the saved context, so restore it here. */
+ ldr x0, =resume_cpu_context
+
+ /* Restore callee-saved registers */
+ ldp x19, x20, [x0, #RESUME_CTX_X19]
+ ldp x21, x22, [x0, #RESUME_CTX_X21]
+ ldp x23, x24, [x0, #RESUME_CTX_X23]
+ ldp x25, x26, [x0, #RESUME_CTX_X25]
+ ldp x27, x28, [x0, #RESUME_CTX_X27]
+ ldp x29, lr, [x0, #RESUME_CTX_X29]
+
+ /* Restore stack pointer */
+ ldr x2, [x0, #RESUME_CTX_SP]
+ mov sp, x2
+
+ /* Restore system control registers */
+ ldr x2, [x0, #RESUME_CTX_VBAR_EL2]
+ msr VBAR_EL2, x2
+ ldr x2, [x0, #RESUME_CTX_VTCR_EL2]
+ msr VTCR_EL2, x2
+ ldr x2, [x0, #RESUME_CTX_VTTBR_EL2]
+ msr VTTBR_EL2, x2
+ ldr x2, [x0, #RESUME_CTX_TPIDR_EL2]
+ msr TPIDR_EL2, x2
+ ldr x2, [x0, #RESUME_CTX_MDCR_EL2]
+ msr MDCR_EL2, x2
+ ldr x2, [x0, #RESUME_CTX_HSTR_EL2]
+ msr HSTR_EL2, x2
+ ldr x2, [x0, #RESUME_CTX_CPTR_EL2]
+ msr CPTR_EL2, x2
+ ldr x2, [x0, #RESUME_CTX_HCR_EL2]
+ msr HCR_EL2, x2
+ isb
+
+ /*
+ * Since the context has been restored, returning from this function
+ * will appear as a return from prepare_resume_ctx. To distinguish a
+ * return from prepare_resume_ctx itself, called when finalizing the
+ * suspend, from a return along this resume path, we return zero here.
+ */
+ mov x0, #0
+ ret
+END(hyp_resume)
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
/*
* Local variables:
* mode: ASM
diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
index 31a98a1f1b..2d9fc331fc 100644
--- a/xen/arch/arm/include/asm/suspend.h
+++ b/xen/arch/arm/include/asm/suspend.h
@@ -3,6 +3,8 @@
#ifndef ARM_SUSPEND_H
#define ARM_SUSPEND_H
+#include <xen/types.h>
+
struct domain;
struct vcpu;
struct vcpu_guest_context;
@@ -14,6 +16,30 @@ struct resume_info {
void arch_domain_resume(struct domain *d);
+#ifdef CONFIG_SYSTEM_SUSPEND
+#ifdef CONFIG_ARM_64
+struct resume_cpu_context {
+ register_t callee_regs[12];
+ register_t sp;
+ register_t vbar_el2;
+ register_t vtcr_el2;
+ register_t vttbr_el2;
+ register_t tpidr_el2;
+ register_t mdcr_el2;
+ register_t hstr_el2;
+ register_t cptr_el2;
+ register_t hcr_el2;
+} __aligned(16);
+#else
+#error "Define resume_cpu_context structure for arm32"
+#endif
+
+extern struct resume_cpu_context resume_cpu_context;
+
+int prepare_resume_ctx(void);
+void hyp_resume(void);
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
#endif /* ARM_SUSPEND_H */
/*
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
new file mode 100644
index 0000000000..6ea4a0f9cc
--- /dev/null
+++ b/xen/arch/arm/suspend.c
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <asm/suspend.h>
+
+struct resume_cpu_context resume_cpu_context;
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
--
2.43.0
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v9 10/13] xen/arm64: Save/restore CPU context across SYSTEM_SUSPEND
2026-05-12 17:07 ` [PATCH v9 10/13] xen/arm64: Save/restore CPU context across SYSTEM_SUSPEND Mykola Kvach
@ 2026-05-14 17:20 ` Luca Fancellu
0 siblings, 0 replies; 29+ messages in thread
From: Luca Fancellu @ 2026-05-14 17:20 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
Hi Mykola,
>
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +/*
> + * int prepare_resume_ctx(void)
> + *
> + * The CPU context saved here is restored on resume by hyp_resume.
> + * prepare_resume_ctx returns a non-zero value; upon restoring the context,
> + * hyp_resume returns zero instead. C code that invokes prepare_resume_ctx
> + * interprets the return value to determine whether the context was just
> + * saved (prepare_resume_ctx) or has been restored (hyp_resume).
> + */
> +FUNC(prepare_resume_ctx)
> + ldr x0, =resume_cpu_context
> +
> + /* Store callee-saved registers */
> + stp x19, x20, [x0, #RESUME_CTX_X19]
> + stp x21, x22, [x0, #RESUME_CTX_X21]
> + stp x23, x24, [x0, #RESUME_CTX_X23]
> + stp x25, x26, [x0, #RESUME_CTX_X25]
> + stp x27, x28, [x0, #RESUME_CTX_X27]
> + stp x29, lr, [x0, #RESUME_CTX_X29]
> +
> + /* Store stack-pointer */
> + mov x2, sp
> + str x2, [x0, #RESUME_CTX_SP]
> +
> + /* Store system control registers */
> + mrs x2, VBAR_EL2
> + str x2, [x0, #RESUME_CTX_VBAR_EL2]
> + mrs x2, VTCR_EL2
> + str x2, [x0, #RESUME_CTX_VTCR_EL2]
> + mrs x2, VTTBR_EL2
> + str x2, [x0, #RESUME_CTX_VTTBR_EL2]
> + mrs x2, TPIDR_EL2
> + str x2, [x0, #RESUME_CTX_TPIDR_EL2]
> + mrs x2, MDCR_EL2
> + str x2, [x0, #RESUME_CTX_MDCR_EL2]
> + mrs x2, HSTR_EL2
> + str x2, [x0, #RESUME_CTX_HSTR_EL2]
> + mrs x2, CPTR_EL2
> + str x2, [x0, #RESUME_CTX_CPTR_EL2]
> + mrs x2, HCR_EL2
> + str x2, [x0, #RESUME_CTX_HCR_EL2]
Do you think we should save also CNTHCTL_EL2? Apologies it escaped my first review,
but I see we program it in the boot cpu path + secondary cpu path: init_timer_interrupt().
The rest looks ok.
Cheers,
Luca
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v9 11/13] xen/arm: Implement PSCI SYSTEM_SUSPEND call (host interface)
2026-05-12 17:07 [PATCH v9 00/13] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
` (9 preceding siblings ...)
2026-05-12 17:07 ` [PATCH v9 10/13] xen/arm64: Save/restore CPU context across SYSTEM_SUSPEND Mykola Kvach
@ 2026-05-12 17:07 ` Mykola Kvach
2026-05-15 7:04 ` Luca Fancellu
2026-05-12 17:07 ` [PATCH v9 12/13] xen/arm: Add vPSCI SYSTEM_SUSPEND policy Mykola Kvach
2026-05-12 17:07 ` [PATCH v9 13/13] xen/arm: Add host system suspend backend Mykola Kvach
12 siblings, 1 reply; 29+ messages in thread
From: Mykola Kvach @ 2026-05-12 17:07 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk
From: Mirela Simonovic <mirela.simonovic@aggios.com>
Invoke PSCI SYSTEM_SUSPEND to finalize Xen's suspend sequence on ARM64
platforms. Pass the Xen resume entry point (hyp_resume) to EL3 together
with a zero context ID, matching Linux.
This patch wires up only the host-side PSCI SYSTEM_SUSPEND invocation.
The resume trampoline and context restore are provided by earlier patches
in the series.
Only enable this path when CONFIG_SYSTEM_SUSPEND is set and PSCI
advertises SYSTEM_SUSPEND via PSCI_FEATURES.
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in v9:
- cache SYSTEM_SUSPEND support using PSCI_FEATURES and gate the host call
on the cached capability
- keep the cached SYSTEM_SUSPEND capability read-only after init
- log whether firmware reports SYSTEM_SUSPEND support
- pass an explicit zero context ID in the SYSTEM_SUSPEND call
- drop the stale note claiming hyp_resume is still a stub
---
xen/arch/arm/include/asm/psci.h | 1 +
xen/arch/arm/psci.c | 31 ++++++++++++++++++++++++++++++-
2 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
index 48a93e6b79..bb3c73496e 100644
--- a/xen/arch/arm/include/asm/psci.h
+++ b/xen/arch/arm/include/asm/psci.h
@@ -23,6 +23,7 @@ int call_psci_cpu_on(int cpu);
void call_psci_cpu_off(void);
void call_psci_system_off(void);
void call_psci_system_reset(void);
+int call_psci_system_suspend(void);
/* Range of allocated PSCI function numbers */
#define PSCI_FNUM_MIN_VALUE _AC(0,U)
diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
index b6860a7760..e05dae1133 100644
--- a/xen/arch/arm/psci.c
+++ b/xen/arch/arm/psci.c
@@ -17,23 +17,27 @@
#include <asm/cpufeature.h>
#include <asm/psci.h>
#include <asm/acpi.h>
+#include <asm/suspend.h>
/*
* While a 64-bit OS can make calls with SMC32 calling conventions, for
* some calls it is necessary to use SMC64 to pass or return 64-bit values.
- * For such calls PSCI_0_2_FN_NATIVE(x) will choose the appropriate
+ * For such calls PSCI_*_FN_NATIVE(x) will choose the appropriate
* (native-width) function ID.
*/
#ifdef CONFIG_ARM_64
#define PSCI_0_2_FN_NATIVE(name) PSCI_0_2_FN64_##name
+#define PSCI_1_0_FN_NATIVE(name) PSCI_1_0_FN64_##name
#else
#define PSCI_0_2_FN_NATIVE(name) PSCI_0_2_FN32_##name
+#define PSCI_1_0_FN_NATIVE(name) PSCI_1_0_FN32_##name
#endif
uint32_t psci_ver;
uint32_t smccc_ver;
static uint32_t psci_cpu_on_nr;
+static bool __ro_after_init has_psci_system_suspend;
#define PSCI_RET(res) ((int32_t)(res).a0)
@@ -60,6 +64,25 @@ void call_psci_cpu_off(void)
}
}
+int call_psci_system_suspend(void)
+{
+#ifdef CONFIG_SYSTEM_SUSPEND
+ struct arm_smccc_res res;
+
+ if ( !has_psci_system_suspend )
+ return PSCI_NOT_SUPPORTED;
+
+ /* Context ID is unused for the Xen resume path. */
+ arm_smccc_smc(PSCI_1_0_FN_NATIVE(SYSTEM_SUSPEND), __pa(hyp_resume), 0,
+ &res);
+ return PSCI_RET(res);
+#else
+ dprintk(XENLOG_WARNING,
+ "SYSTEM_SUSPEND not supported (CONFIG_SYSTEM_SUSPEND disabled)\n");
+ return PSCI_NOT_SUPPORTED;
+#endif
+}
+
void call_psci_system_off(void)
{
if ( psci_ver > PSCI_VERSION(0, 1) )
@@ -223,9 +246,15 @@ int __init psci_init(void)
psci_init_smccc();
+ has_psci_system_suspend =
+ psci_features(PSCI_1_0_FN_NATIVE(SYSTEM_SUSPEND)) == 0;
+
printk(XENLOG_INFO "Using PSCI v%u.%u\n",
PSCI_VERSION_MAJOR(psci_ver), PSCI_VERSION_MINOR(psci_ver));
+ printk(XENLOG_DEBUG "PSCI SYSTEM_SUSPEND is %ssupported by firmware\n",
+ has_psci_system_suspend ? "" : "not ");
+
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v9 11/13] xen/arm: Implement PSCI SYSTEM_SUSPEND call (host interface)
2026-05-12 17:07 ` [PATCH v9 11/13] xen/arm: Implement PSCI SYSTEM_SUSPEND call (host interface) Mykola Kvach
@ 2026-05-15 7:04 ` Luca Fancellu
0 siblings, 0 replies; 29+ messages in thread
From: Luca Fancellu @ 2026-05-15 7:04 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
Hi Mykola,
> On 12 May 2026, at 18:07, Mykola Kvach <xakep.amatop@gmail.com> wrote:
>
> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>
> Invoke PSCI SYSTEM_SUSPEND to finalize Xen's suspend sequence on ARM64
> platforms. Pass the Xen resume entry point (hyp_resume) to EL3 together
> with a zero context ID, matching Linux.
>
> This patch wires up only the host-side PSCI SYSTEM_SUSPEND invocation.
> The resume trampoline and context restore are provided by earlier patches
> in the series.
>
> Only enable this path when CONFIG_SYSTEM_SUSPEND is set and PSCI
> advertises SYSTEM_SUSPEND via PSCI_FEATURES.
>
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Cheers,
Luca
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v9 12/13] xen/arm: Add vPSCI SYSTEM_SUSPEND policy
2026-05-12 17:07 [PATCH v9 00/13] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
` (10 preceding siblings ...)
2026-05-12 17:07 ` [PATCH v9 11/13] xen/arm: Implement PSCI SYSTEM_SUSPEND call (host interface) Mykola Kvach
@ 2026-05-12 17:07 ` Mykola Kvach
2026-05-13 6:53 ` Jan Beulich
2026-05-15 8:17 ` Luca Fancellu
2026-05-12 17:07 ` [PATCH v9 13/13] xen/arm: Add host system suspend backend Mykola Kvach
12 siblings, 2 replies; 29+ messages in thread
From: Mykola Kvach @ 2026-05-12 17:07 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Jan Beulich, Roger Pau Monné, Rahul Singh
From: Mykola Kvach <mykola_kvach@epam.com>
Introduce CONFIG_HAS_HWDOM_SYSTEM_SUSPEND as an architecture-selected
capability for platforms where the hardware domain can be parked with
SHUTDOWN_suspend without calling hwdom_shutdown().
Expose PSCI SYSTEM_SUSPEND as a vPSCI operation for all domains. For
non-control domains, including the hardware domain when it is not acting as a
control domain, the call is handled as a guest/domain suspend request and
parks the domain in SHUTDOWN_suspend.
Control domains need additional sequencing because their SYSTEM_SUSPEND
request is used to coordinate host-wide suspend. A non-last awake control
domain may be parked in SHUTDOWN_suspend without requiring the host suspend
path to be available. The last awake control domain is treated as the point
where the request becomes a host-suspend request, and it may only proceed
when all non-control domains are already in SHUTDOWN_suspend and the host
suspend path is available.
Keep the control-domain sequencing and domain-readiness checks out of
PSCI_FEATURES. They are per-attempt runtime conditions rather than stable PSCI
function availability. Advertise SYSTEM_SUSPEND as implemented by vPSCI and
enforce the sequencing policy in the call handler.
Select HAS_HWDOM_SYSTEM_SUSPEND independently of CONFIG_SYSTEM_SUSPEND so
that SHUTDOWN_suspend from the hardware domain can be treated as a domain
suspend state rather than as a hardware-domain-initiated host shutdown. This
does not by itself imply that host-wide suspend is available.
Add host_system_suspend_allowed() to combine the host PSCI SYSTEM_SUSPEND
capability with runtime blockers reported by Xen-owned subsystems. Add
runtime blockers for registered serial, IOMMU, GIC and SMMUv3 MSI IRQ paths
lacking suspend/resume support. These blockers are runtime based, so they
only apply to drivers or paths that Xen actually uses on the platform. For
SMMUv3, the blocker applies only when Xen actually uses the MSI IRQ path,
since resume does not restore the SMMU *_IRQ_CFGn MSI registers yet.
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
xen/arch/arm/Kconfig | 1 +
xen/arch/arm/gic.c | 6 ++
xen/arch/arm/include/asm/psci.h | 3 +
xen/arch/arm/include/asm/suspend.h | 10 ++-
xen/arch/arm/psci.c | 7 ++
xen/arch/arm/suspend.c | 40 +++++++++
xen/arch/arm/vpsci.c | 114 +++++++++++++++++++++++---
xen/common/Kconfig | 3 +
xen/common/domain.c | 7 +-
xen/drivers/char/serial.c | 12 +++
xen/drivers/passthrough/arm/iommu.c | 4 +
xen/drivers/passthrough/arm/smmu-v3.c | 4 +
xen/include/xen/serial.h | 1 +
xen/include/xen/suspend.h | 2 +
14 files changed, 201 insertions(+), 13 deletions(-)
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 79622b46a1..54a5bfb9ae 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -19,6 +19,7 @@ config ARM
select HAS_ALTERNATIVE if HAS_VMAP
select HAS_DEVICE_TREE_DISCOVERY
select HAS_DOM0LESS
+ select HAS_HWDOM_SYSTEM_SUSPEND if !MPU
select HAS_GRANT_CACHE_FLUSH if GRANT_TABLE
select HAS_STACK_PROTECTOR
select HAS_UBSAN
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 7727ffed5a..a5efcaeb4c 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -26,6 +26,7 @@
#include <asm/device.h>
#include <asm/io.h>
#include <asm/gic.h>
+#include <asm/suspend.h>
#include <asm/vgic.h>
#include <asm/acpi.h>
@@ -44,6 +45,11 @@ static void __init __maybe_unused build_assertions(void)
void register_gic_ops(const struct gic_hw_operations *ops)
{
gic_hw_ops = ops;
+
+#ifdef CONFIG_SYSTEM_SUSPEND
+ if ( !ops->suspend || !ops->resume )
+ host_system_suspend_disable("GIC driver lacks suspend/resume support");
+#endif
}
static void clear_cpu_lr_mask(void)
diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
index bb3c73496e..142fa1bfe5 100644
--- a/xen/arch/arm/include/asm/psci.h
+++ b/xen/arch/arm/include/asm/psci.h
@@ -24,6 +24,9 @@ void call_psci_cpu_off(void);
void call_psci_system_off(void);
void call_psci_system_reset(void);
int call_psci_system_suspend(void);
+#ifdef CONFIG_SYSTEM_SUSPEND
+bool psci_system_suspend_allowed(void);
+#endif
/* Range of allocated PSCI function numbers */
#define PSCI_FNUM_MIN_VALUE _AC(0,U)
diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
index 2d9fc331fc..87db12eac3 100644
--- a/xen/arch/arm/include/asm/suspend.h
+++ b/xen/arch/arm/include/asm/suspend.h
@@ -38,7 +38,15 @@ extern struct resume_cpu_context resume_cpu_context;
int prepare_resume_ctx(void);
void hyp_resume(void);
-#endif /* CONFIG_SYSTEM_SUSPEND */
+bool host_system_suspend_allowed(void);
+void host_system_suspend_disable(const char *reason);
+
+#else /* !CONFIG_SYSTEM_SUSPEND */
+
+static inline bool host_system_suspend_allowed(void) { return false; }
+static inline void host_system_suspend_disable(const char *reason) {}
+
+#endif
#endif /* ARM_SUSPEND_H */
diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
index e05dae1133..e9d78668fd 100644
--- a/xen/arch/arm/psci.c
+++ b/xen/arch/arm/psci.c
@@ -41,6 +41,13 @@ static bool __ro_after_init has_psci_system_suspend;
#define PSCI_RET(res) ((int32_t)(res).a0)
+#ifdef CONFIG_SYSTEM_SUSPEND
+bool psci_system_suspend_allowed(void)
+{
+ return has_psci_system_suspend;
+}
+#endif
+
int call_psci_cpu_on(int cpu)
{
struct arm_smccc_res res;
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
index 6ea4a0f9cc..a571035d2c 100644
--- a/xen/arch/arm/suspend.c
+++ b/xen/arch/arm/suspend.c
@@ -1,9 +1,49 @@
/* SPDX-License-Identifier: GPL-2.0-only */
+#include <asm/psci.h>
#include <asm/suspend.h>
+#include <xen/lib.h>
+#include <xen/serial.h>
+
struct resume_cpu_context resume_cpu_context;
+/*
+ * Non-PSCI infrastructure can make host suspend impossible even when the PSCI
+ * SYSTEM_SUSPEND conduit is present, e.g. when a Xen-owned driver has no valid
+ * suspend/resume path.
+ *
+ * This gate is checked only when the last awake control domain attempts to
+ * turn a guest SYSTEM_SUSPEND request into a host-suspend request.
+ */
+static bool host_system_suspend_runtime_allowed = true;
+
+static bool host_serial_suspend_allowed(void)
+{
+ if ( serial_suspend_supported() )
+ return true;
+
+ printk_once(XENLOG_INFO
+ "Host SYSTEM_SUSPEND blocked: serial driver lacks suspend/resume support\n");
+
+ return false;
+}
+
+bool host_system_suspend_allowed(void)
+{
+ return psci_system_suspend_allowed() &&
+ host_serial_suspend_allowed() &&
+ host_system_suspend_runtime_allowed;
+}
+
+void host_system_suspend_disable(const char *reason)
+{
+ host_system_suspend_runtime_allowed = false;
+
+ printk(XENLOG_INFO "Host SYSTEM_SUSPEND blocked: %s\n",
+ reason ? reason : "unsupported suspend/resume path");
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
index ac6af6118f..48a963ae3e 100644
--- a/xen/arch/arm/vpsci.c
+++ b/xen/arch/arm/vpsci.c
@@ -5,6 +5,7 @@
#include <asm/current.h>
#include <asm/domain.h>
+#include <asm/suspend.h>
#include <asm/vgic.h>
#include <asm/vpsci.h>
#include <asm/event.h>
@@ -219,6 +220,89 @@ static void do_psci_0_2_system_reset(void)
domain_shutdown(d,SHUTDOWN_reboot);
}
+/*
+ * Serialise SYSTEM_SUSPEND policy decisions with the domain suspend transition,
+ * so multiple control domains cannot all observe each other as still awake.
+ */
+static DEFINE_SPINLOCK(vpsci_system_suspend_lock);
+
+static bool domain_in_suspend_state(struct domain *d)
+{
+ bool suspended;
+
+ spin_lock(&d->shutdown_lock);
+ suspended = d->is_shut_down && d->shutdown_code == SHUTDOWN_suspend;
+ spin_unlock(&d->shutdown_lock);
+
+ return suspended;
+}
+
+static int32_t domain_psci_system_suspend_policy(struct domain *d)
+{
+ struct domain *other;
+ bool last_awake_control_domain = true;
+ bool awake_non_control_domain = false;
+
+ /* Only control domains participate in sequencing policy. */
+ if ( !is_control_domain(d) )
+ return 0;
+
+ rcu_read_lock(&domlist_read_lock);
+
+ for_each_domain ( other )
+ {
+ bool suspended;
+
+ if ( other == d )
+ continue;
+
+ suspended = domain_in_suspend_state(other);
+ if ( suspended )
+ continue;
+
+ if ( is_control_domain(other) )
+ {
+ last_awake_control_domain = false;
+ break;
+ }
+
+ awake_non_control_domain = true;
+ }
+
+ rcu_read_unlock(&domlist_read_lock);
+
+ /*
+ * Another control domain is still awake. This request is only the first
+ * phase of the sequencing: park this control domain and leave the host
+ * running. Host-wide suspend gates must not block this intermediate state.
+ */
+ if ( !last_awake_control_domain )
+ return 0;
+
+ /*
+ * This is the last awake control domain. It must not be parked unless the
+ * request can proceed as a host-suspend request; otherwise Xen would lose
+ * the last domain that can coordinate the system suspend.
+ */
+ if ( awake_non_control_domain )
+ {
+ printk(XENLOG_DEBUG
+ "SYSTEM_SUSPEND denied: last awake control domain dom%u requested host suspend while non-control domains are still awake\n",
+ d->domain_id);
+ return PSCI_DENIED;
+ }
+
+ /*
+ * Host-wide gates are relevant only for the last-control-domain case. They
+ * must not block parking of a non-last control domain, but they must reject
+ * the last control domain when host suspend is not available.
+ */
+ if ( !host_system_suspend_allowed() )
+ return PSCI_NOT_SUPPORTED;
+
+ return 0;
+}
+
static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
{
int32_t rc;
@@ -232,10 +316,6 @@ static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
if ( is_64bit_domain(d) && is_thumb )
return PSCI_INVALID_ADDRESS;
- /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
- if ( is_hardware_domain(d) )
- return PSCI_NOT_SUPPORTED;
-
/* Ensure that all CPUs other than the calling one are offline */
domain_lock(d);
for_each_vcpu ( d, v )
@@ -252,16 +332,29 @@ static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
if ( rc )
return PSCI_DENIED;
- rc = domain_shutdown(d, SHUTDOWN_suspend);
+ spin_lock(&vpsci_system_suspend_lock);
+
+ rc = domain_psci_system_suspend_policy(d);
+ if ( !rc )
+ {
+ rc = domain_shutdown(d, SHUTDOWN_suspend);
+ if ( rc )
+ rc = PSCI_DENIED;
+ else
+ {
+ rctx->ctxt = ctxt;
+ rctx->wake_cpu = current;
+ }
+ }
+
+ spin_unlock(&vpsci_system_suspend_lock);
+
if ( rc )
{
free_vcpu_guest_context(ctxt);
- return PSCI_DENIED;
+ return rc;
}
- rctx->ctxt = ctxt;
- rctx->wake_cpu = current;
-
gprintk(XENLOG_DEBUG,
"SYSTEM_SUSPEND requested, epoint=%#"PRIregister", cid=%#"PRIregister"\n",
epoint, cid);
@@ -287,10 +380,9 @@ static int32_t do_psci_1_0_features(uint32_t psci_func_id)
case PSCI_0_2_FN32_SYSTEM_RESET:
case PSCI_1_0_FN32_PSCI_FEATURES:
case ARM_SMCCC_VERSION_FID:
- return 0;
case PSCI_1_0_FN32_SYSTEM_SUSPEND:
case PSCI_1_0_FN64_SYSTEM_SUSPEND:
- return is_hardware_domain(current->domain) ? PSCI_NOT_SUPPORTED : 0;
+ return 0;
default:
return PSCI_NOT_SUPPORTED;
}
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 5ff71480ee..816a1a4ecb 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -140,6 +140,9 @@ config HAS_EX_TABLE
config HAS_FAST_MULTIPLY
bool
+config HAS_HWDOM_SYSTEM_SUSPEND
+ bool
+
config HAS_IOPORTS
bool
diff --git a/xen/common/domain.c b/xen/common/domain.c
index bb9e210c28..d3edfb2a13 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1375,6 +1375,11 @@ void __domain_crash(struct domain *d)
domain_shutdown(d, SHUTDOWN_crash);
}
+static inline bool want_hwdom_shutdown(uint8_t reason)
+{
+ return !IS_ENABLED(CONFIG_HAS_HWDOM_SYSTEM_SUSPEND) ||
+ reason != SHUTDOWN_suspend;
+}
int domain_shutdown(struct domain *d, u8 reason)
{
@@ -1391,7 +1396,7 @@ int domain_shutdown(struct domain *d, u8 reason)
d->shutdown_code = reason;
reason = d->shutdown_code;
- if ( is_hardware_domain(d) )
+ if ( is_hardware_domain(d) && want_hwdom_shutdown(reason) )
hwdom_shutdown(reason);
if ( d->is_shutting_down )
diff --git a/xen/drivers/char/serial.c b/xen/drivers/char/serial.c
index adb312d796..cc2b5b5dee 100644
--- a/xen/drivers/char/serial.c
+++ b/xen/drivers/char/serial.c
@@ -497,6 +497,8 @@ const struct vuart_info *serial_vuart_info(int idx)
#ifdef CONFIG_SYSTEM_SUSPEND
+static bool __read_mostly serial_suspend_available = true;
+
void serial_suspend(void)
{
int i;
@@ -513,6 +515,11 @@ void serial_resume(void)
com[i].driver->resume(&com[i]);
}
+bool serial_suspend_supported(void)
+{
+ return serial_suspend_available;
+}
+
#endif /* CONFIG_SYSTEM_SUSPEND */
void __init serial_register_uart(int idx, struct uart_driver *driver,
@@ -521,6 +528,11 @@ void __init serial_register_uart(int idx, struct uart_driver *driver,
/* Store UART-specific info. */
com[idx].driver = driver;
com[idx].uart = uart;
+
+#ifdef CONFIG_SYSTEM_SUSPEND
+ if ( !driver->suspend || !driver->resume )
+ serial_suspend_available = false;
+#endif
}
void __init serial_async_transmit(struct serial_port *port)
diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
index 100545e23f..420b9fbdf0 100644
--- a/xen/drivers/passthrough/arm/iommu.c
+++ b/xen/drivers/passthrough/arm/iommu.c
@@ -19,6 +19,7 @@
#include <xen/device_tree.h>
#include <xen/iommu.h>
#include <xen/lib.h>
+#include <xen/suspend.h>
#include <asm/device.h>
@@ -46,6 +47,9 @@ void __init iommu_set_ops(const struct iommu_ops *ops)
}
iommu_ops = ops;
+
+ if ( !ops->suspend || !ops->resume )
+ host_system_suspend_disable("IOMMU driver lacks suspend/resume support");
}
int __init iommu_hardware_setup(void)
diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c
index 82c8ead979..443dd0e893 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -91,6 +91,7 @@
#include <asm/io.h>
#include <asm/iommu_fwspec.h>
#include <asm/platform.h>
+#include <asm/suspend.h>
#include "smmu-v3.h"
@@ -1903,6 +1904,9 @@ static void arm_smmu_setup_msis(struct arm_smmu_device *smmu)
}
}
+ host_system_suspend_disable(
+ "SMMUv3 MSI IRQ path is unsupported for host suspend");
+
/* Add callback to free MSIs on teardown */
devm_add_action(dev, arm_smmu_free_msis, dev);
}
diff --git a/xen/include/xen/serial.h b/xen/include/xen/serial.h
index 8e18445552..418b00ead0 100644
--- a/xen/include/xen/serial.h
+++ b/xen/include/xen/serial.h
@@ -137,6 +137,7 @@ const struct vuart_info* serial_vuart_info(int idx);
/* Serial suspend/resume. */
void serial_suspend(void);
void serial_resume(void);
+bool serial_suspend_supported(void);
#endif
/*
diff --git a/xen/include/xen/suspend.h b/xen/include/xen/suspend.h
index 6f94fd53b0..a941331035 100644
--- a/xen/include/xen/suspend.h
+++ b/xen/include/xen/suspend.h
@@ -6,6 +6,8 @@
#if __has_include(<asm/suspend.h>)
#include <asm/suspend.h>
#else
+struct domain;
+
static inline void arch_domain_resume(struct domain *d) {}
#endif
--
2.43.0
* Re: [PATCH v9 12/13] xen/arm: Add vPSCI SYSTEM_SUSPEND policy
2026-05-12 17:07 ` [PATCH v9 12/13] xen/arm: Add vPSCI SYSTEM_SUSPEND policy Mykola Kvach
@ 2026-05-13 6:53 ` Jan Beulich
2026-05-14 14:51 ` Mykola Kvach
2026-05-15 8:17 ` Luca Fancellu
1 sibling, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2026-05-13 6:53 UTC (permalink / raw)
To: Mykola Kvach
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Roger Pau Monné, Rahul Singh, xen-devel
On 12.05.2026 19:07, Mykola Kvach wrote:
> From: Mykola Kvach <mykola_kvach@epam.com>
>
> Introduce CONFIG_HAS_HWDOM_SYSTEM_SUSPEND as an architecture-selected
> capability for platforms where the hardware domain can be parked with
> SHUTDOWN_suspend without calling hwdom_shutdown().
>
> Expose PSCI SYSTEM_SUSPEND as a vPSCI operation for all domains. For
> non-control domains, including the hardware domain when it is not acting as a
> control domain, the call is handled as a guest/domain suspend request and
> parks the domain in SHUTDOWN_suspend.
>
> Control domains need additional sequencing because their SYSTEM_SUSPEND
> request is used to coordinate host-wide suspend. A non-last awake control
> domain may be parked in SHUTDOWN_suspend without requiring the host suspend
> path to be available. The last awake control domain is treated as the point
> where the request becomes a host-suspend request, and it may only proceed
> when all non-control domains are already in SHUTDOWN_suspend and the host
> suspend path is available.
>
> Keep the control-domain sequencing and domain-readiness checks out of
> PSCI_FEATURES. They are per-attempt runtime conditions rather than stable PSCI
> function availability. Advertise SYSTEM_SUSPEND as implemented by vPSCI and
> enforce the sequencing policy in the call handler.
>
> Select HAS_HWDOM_SYSTEM_SUSPEND independently from CONFIG_SYSTEM_SUSPEND so
> that SHUTDOWN_suspend from the hardware domain can be treated as a domain
> suspend state rather than as a hardware-domain initiated host shutdown. This
> does not by itself imply that host-wide suspend is available.
>
> Add host_system_suspend_allowed() to combine the host PSCI SYSTEM_SUSPEND
> capability with runtime blockers reported by Xen-owned subsystems. Add
> runtime blockers for registered serial, IOMMU, GIC and SMMUv3 MSI IRQ paths
> lacking suspend/resume support. These blockers are runtime based, so they
> only apply to drivers or paths that Xen actually uses on the platform. For
> SMMUv3, the blocker applies only when Xen actually uses the MSI IRQ path,
> since resume does not restore the SMMU *_IRQ_CFGn MSI registers yet.
>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> xen/arch/arm/Kconfig | 1 +
> xen/arch/arm/gic.c | 6 ++
> xen/arch/arm/include/asm/psci.h | 3 +
> xen/arch/arm/include/asm/suspend.h | 10 ++-
> xen/arch/arm/psci.c | 7 ++
> xen/arch/arm/suspend.c | 40 +++++++++
> xen/arch/arm/vpsci.c | 114 +++++++++++++++++++++++---
> xen/common/Kconfig | 3 +
> xen/common/domain.c | 7 +-
> xen/drivers/char/serial.c | 12 +++
> xen/drivers/passthrough/arm/iommu.c | 4 +
> xen/drivers/passthrough/arm/smmu-v3.c | 4 +
> xen/include/xen/serial.h | 1 +
> xen/include/xen/suspend.h | 2 +
> 14 files changed, 201 insertions(+), 13 deletions(-)
>
Contrary to what the cover letter says, there's no revlog here.
> --- a/xen/arch/arm/suspend.c
> +++ b/xen/arch/arm/suspend.c
> @@ -1,9 +1,49 @@
> /* SPDX-License-Identifier: GPL-2.0-only */
>
> +#include <asm/psci.h>
> #include <asm/suspend.h>
>
> +#include <xen/lib.h>
> +#include <xen/serial.h>
> +
> struct resume_cpu_context resume_cpu_context;
>
> +/*
> + * Non-PSCI infrastructure can make host suspend impossible even when the PSCI
> + * SYSTEM_SUSPEND conduit is present, e.g. when a Xen-owned driver has no valid
> + * suspend/resume path.
> + *
> + * This gate is checked only when the last awake control domain attempts to
> + * turn a guest SYSTEM_SUSPEND request into a host-suspend request.
> + */
> +static bool host_system_suspend_runtime_allowed = true;
> +
> +static bool host_serial_suspend_allowed(void)
> +{
> + if ( serial_suspend_supported() )
> + return true;
> +
> + printk_once(XENLOG_INFO
> + "Host SYSTEM_SUSPEND blocked: serial driver lacks suspend/resume support\n");
Please try to keep log messages down to a reasonable size. In the case here,
what value does "suspend/resume" add?
> +static int32_t domain_psci_system_suspend_policy(struct domain *d)
> +{
> + struct domain *other;
> + bool last_awake_control_domain = true;
> + bool awake_non_control_domain = false;
> +
> + /* Only control domains participate in sequencing policy. */
> + if ( !is_control_domain(d) )
> + return 0;
> +
> + rcu_read_lock(&domlist_read_lock);
> +
> + for_each_domain ( other )
> + {
> + bool suspended;
> +
> + if ( other == d )
> + continue;
> +
> + suspended = domain_in_suspend_state(other);
> + if ( suspended )
> + continue;
> +
> + if ( is_control_domain(other) )
> + {
> + last_awake_control_domain = false;
> + break;
> + }
> +
> + awake_non_control_domain = true;
> + }
> +
> + rcu_read_unlock(&domlist_read_lock);
> +
> + /*
> + * Another control domain is still awake. This request is only the first
> + * phase of the sequencing: park this control domain and leave the host
> + * running. Host-wide suspend gates must not block this intermediate state.
> + */
> + if ( !last_awake_control_domain )
> + return 0;
> +
> + /*
> + * This is the last awake control domain. It must not be parked unless the
> + * request can proceed as a host-suspend request; otherwise Xen would lose
> + * the last domain that can coordinate the system suspend.
> + */
> + if ( awake_non_control_domain )
> + {
> + printk(XENLOG_DEBUG
> + "SYSTEM_SUSPEND denied: last awake control domain dom%u requested host suspend while non-control domains are still awake\n",
> + d->domain_id);
Same here, plus please use %pd.
> --- a/xen/drivers/char/serial.c
> +++ b/xen/drivers/char/serial.c
> @@ -497,6 +497,8 @@ const struct vuart_info *serial_vuart_info(int idx)
>
> #ifdef CONFIG_SYSTEM_SUSPEND
>
> +static bool __read_mostly serial_suspend_available = true;
__ro_after_init?
Jan
* Re: [PATCH v9 12/13] xen/arm: Add vPSCI SYSTEM_SUSPEND policy
2026-05-13 6:53 ` Jan Beulich
@ 2026-05-14 14:51 ` Mykola Kvach
0 siblings, 0 replies; 29+ messages in thread
From: Mykola Kvach @ 2026-05-14 14:51 UTC (permalink / raw)
To: Jan Beulich
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Roger Pau Monné, Rahul Singh, xen-devel
Hi Jan,
Thank you for the review.
On Wed, May 13, 2026 at 9:54 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 12.05.2026 19:07, Mykola Kvach wrote:
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > Introduce CONFIG_HAS_HWDOM_SYSTEM_SUSPEND as an architecture-selected
> > capability for platforms where the hardware domain can be parked with
> > SHUTDOWN_suspend without calling hwdom_shutdown().
> >
> > Expose PSCI SYSTEM_SUSPEND as a vPSCI operation for all domains. For
> > non-control domains, including the hardware domain when it is not acting as a
> > control domain, the call is handled as a guest/domain suspend request and
> > parks the domain in SHUTDOWN_suspend.
> >
> > Control domains need additional sequencing because their SYSTEM_SUSPEND
> > request is used to coordinate host-wide suspend. A non-last awake control
> > domain may be parked in SHUTDOWN_suspend without requiring the host suspend
> > path to be available. The last awake control domain is treated as the point
> > where the request becomes a host-suspend request, and it may only proceed
> > when all non-control domains are already in SHUTDOWN_suspend and the host
> > suspend path is available.
> >
> > Keep the control-domain sequencing and domain-readiness checks out of
> > PSCI_FEATURES. They are per-attempt runtime conditions rather than stable PSCI
> > function availability. Advertise SYSTEM_SUSPEND as implemented by vPSCI and
> > enforce the sequencing policy in the call handler.
> >
> > Select HAS_HWDOM_SYSTEM_SUSPEND independently from CONFIG_SYSTEM_SUSPEND so
> > that SHUTDOWN_suspend from the hardware domain can be treated as a domain
> > suspend state rather than as a hardware-domain initiated host shutdown. This
> > does not by itself imply that host-wide suspend is available.
> >
> > Add host_system_suspend_allowed() to combine the host PSCI SYSTEM_SUSPEND
> > capability with runtime blockers reported by Xen-owned subsystems. Add
> > runtime blockers for registered serial, IOMMU, GIC and SMMUv3 MSI IRQ paths
> > lacking suspend/resume support. These blockers are runtime based, so they
> > only apply to drivers or paths that Xen actually uses on the platform. For
> > SMMUv3, the blocker applies only when Xen actually uses the MSI IRQ path,
> > since resume does not restore the SMMU *_IRQ_CFGn MSI registers yet.
> >
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > xen/arch/arm/Kconfig | 1 +
> > xen/arch/arm/gic.c | 6 ++
> > xen/arch/arm/include/asm/psci.h | 3 +
> > xen/arch/arm/include/asm/suspend.h | 10 ++-
> > xen/arch/arm/psci.c | 7 ++
> > xen/arch/arm/suspend.c | 40 +++++++++
> > xen/arch/arm/vpsci.c | 114 +++++++++++++++++++++++---
> > xen/common/Kconfig | 3 +
> > xen/common/domain.c | 7 +-
> > xen/drivers/char/serial.c | 12 +++
> > xen/drivers/passthrough/arm/iommu.c | 4 +
> > xen/drivers/passthrough/arm/smmu-v3.c | 4 +
> > xen/include/xen/serial.h | 1 +
> > xen/include/xen/suspend.h | 2 +
> > 14 files changed, 201 insertions(+), 13 deletions(-)
> >
>
> Contrary to what the cover letter says, there's no revlog here.
Right, I should have added a short note for this patch as well.
This is a new patch in this version of the series, so there was no previous
version of this particular patch to compare against, but I agree that this
should still have been mentioned explicitly, e.g. as "New in v9". I will add
that in the next version.
>
> > --- a/xen/arch/arm/suspend.c
> > +++ b/xen/arch/arm/suspend.c
> > @@ -1,9 +1,49 @@
> > /* SPDX-License-Identifier: GPL-2.0-only */
> >
> > +#include <asm/psci.h>
> > #include <asm/suspend.h>
> >
> > +#include <xen/lib.h>
> > +#include <xen/serial.h>
> > +
> > struct resume_cpu_context resume_cpu_context;
> >
> > +/*
> > + * Non-PSCI infrastructure can make host suspend impossible even when the PSCI
> > + * SYSTEM_SUSPEND conduit is present, e.g. when a Xen-owned driver has no valid
> > + * suspend/resume path.
> > + *
> > + * This gate is checked only when the last awake control domain attempts to
> > + * turn a guest SYSTEM_SUSPEND request into a host-suspend request.
> > + */
> > +static bool host_system_suspend_runtime_allowed = true;
> > +
> > +static bool host_serial_suspend_allowed(void)
> > +{
> > + if ( serial_suspend_supported() )
> > + return true;
> > +
> > + printk_once(XENLOG_INFO
> > + "Host SYSTEM_SUSPEND blocked: serial driver lacks suspend/resume support\n");
>
> Please try to keep log messages down to a reasonable size. In the case here,
> what value does "suspend/resume" add?
Fair point. The important part is that the serial driver cannot support the
host suspend path. I will shorten the message somehow.
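Something like this, perhaps (untested, exact wording still to be decided for v10):

    printk_once(XENLOG_INFO
                "Host SYSTEM_SUSPEND blocked: no serial support\n");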
>
> > +static int32_t domain_psci_system_suspend_policy(struct domain *d)
> > +{
> > + struct domain *other;
> > + bool last_awake_control_domain = true;
> > + bool awake_non_control_domain = false;
> > +
> > + /* Only control domains participate in sequencing policy. */
> > + if ( !is_control_domain(d) )
> > + return 0;
> > +
> > + rcu_read_lock(&domlist_read_lock);
> > +
> > + for_each_domain ( other )
> > + {
> > + bool suspended;
> > +
> > + if ( other == d )
> > + continue;
> > +
> > + suspended = domain_in_suspend_state(other);
> > + if ( suspended )
> > + continue;
> > +
> > + if ( is_control_domain(other) )
> > + {
> > + last_awake_control_domain = false;
> > + break;
> > + }
> > +
> > + awake_non_control_domain = true;
> > + }
> > +
> > + rcu_read_unlock(&domlist_read_lock);
> > +
> > + /*
> > + * Another control domain is still awake. This request is only the first
> > + * phase of the sequencing: park this control domain and leave the host
> > + * running. Host-wide suspend gates must not block this intermediate state.
> > + */
> > + if ( !last_awake_control_domain )
> > + return 0;
> > +
> > + /*
> > + * This is the last awake control domain. It must not be parked unless the
> > + * request can proceed as a host-suspend request; otherwise Xen would lose
> > + * the last domain that can coordinate the system suspend.
> > + */
> > + if ( awake_non_control_domain )
> > + {
> > + printk(XENLOG_DEBUG
> > + "SYSTEM_SUSPEND denied: last awake control domain dom%u requested host suspend while non-control domains are still awake\n",
> > + d->domain_id);
>
> Same here, plus please use %pd.
Ack, I will shorten the message and use %pd.
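For example (untested sketch, final wording may differ):

    printk(XENLOG_DEBUG
           "SYSTEM_SUSPEND denied for %pd: non-control domains still awake\n",
           d);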
>
> > --- a/xen/drivers/char/serial.c
> > +++ b/xen/drivers/char/serial.c
> > @@ -497,6 +497,8 @@ const struct vuart_info *serial_vuart_info(int idx)
> >
> > #ifdef CONFIG_SYSTEM_SUSPEND
> >
> > +static bool __read_mostly serial_suspend_available = true;
>
> __ro_after_init?
Good catch, yes.
I will use __ro_after_init for this variable.
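i.e. (assuming the surrounding #ifdef CONFIG_SYSTEM_SUSPEND block stays as is):

    static bool __ro_after_init serial_suspend_available = true;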
Best regards,
Mykola
>
> Jan
* Re: [PATCH v9 12/13] xen/arm: Add vPSCI SYSTEM_SUSPEND policy
2026-05-12 17:07 ` [PATCH v9 12/13] xen/arm: Add vPSCI SYSTEM_SUSPEND policy Mykola Kvach
2026-05-13 6:53 ` Jan Beulich
@ 2026-05-15 8:17 ` Luca Fancellu
1 sibling, 0 replies; 29+ messages in thread
From: Luca Fancellu @ 2026-05-15 8:17 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
Andrew Cooper, Anthony PERARD, Jan Beulich, Roger Pau Monné,
Rahul Singh
Hi Mykola,
apart from Jan’s findings this looks ok to me
> diff --git a/xen/include/xen/suspend.h b/xen/include/xen/suspend.h
> index 6f94fd53b0..a941331035 100644
> --- a/xen/include/xen/suspend.h
> +++ b/xen/include/xen/suspend.h
> @@ -6,6 +6,8 @@
> #if __has_include(<asm/suspend.h>)
> #include <asm/suspend.h>
> #else
> +struct domain;
> +
Just this one isn’t mentioned in the commit message.
With Jan’s point and above fixed:
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
> static inline void arch_domain_resume(struct domain *d) {}
> #endif
>
>
Cheers,
Luca
* [PATCH v9 13/13] xen/arm: Add host system suspend backend
2026-05-12 17:07 [PATCH v9 00/13] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
` (11 preceding siblings ...)
2026-05-12 17:07 ` [PATCH v9 12/13] xen/arm: Add vPSCI SYSTEM_SUSPEND policy Mykola Kvach
@ 2026-05-12 17:07 ` Mykola Kvach
2026-05-15 8:44 ` Luca Fancellu
12 siblings, 1 reply; 29+ messages in thread
From: Mykola Kvach @ 2026-05-12 17:07 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk
From: Mirela Simonovic <mirela.simonovic@aggios.com>
Add the Xen-wide suspend/resume backend used after a control-domain
vPSCI SYSTEM_SUSPEND request has been accepted. The vPSCI policy,
runtime driver blockers and control-domain sequencing checks are handled
by the preceding commit; this change adds the code that actually drives
the host suspend attempt.
The backend runs from a tasklet scheduled on pCPU0, because non-boot CPUs
are disabled during suspend. It freezes domains, disables the scheduler
and then disables non-boot CPUs.
Host-side suspend participants are handled in phases. IOMMU and console
state are suspended first. Local IRQs are then disabled before suspending
timer and GIC state. On resume or failure, the completed suspend phases
are unwound in reverse: GIC and timer state are restored while IRQs are
still disabled, local IRQs are restored, and then console and IOMMU state
are restored.
On boot, init_ttbr is normally initialized during secondary CPU hotplug.
On uniprocessor systems this can leave init_ttbr uninitialized, so set it
from the boot CPU before entering suspend.
Note: the code is behind CONFIG_HAS_SYSTEM_SUSPEND, which is currently
only selected when UNSUPPORTED is set and MPU is not set.
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in V9:
- Split vPSCI availability policy, runtime host-suspend blockers and the
domain-readiness precheck into the preceding commit.
- Trigger the host suspend backend from the control-domain SYSTEM_SUSPEND
path.
- Reorder the host suspend/resume phases so the timer is suspended with
local IRQs disabled and local IRQs are restored after the GIC and timer
resume paths, before the console and IOMMU resume paths.
- Move HAS_HWDOM_SYSTEM_SUSPEND and related logic to policy patch.
Changes in V8:
- Add a pre-suspend check in system_suspend() after scheduler_disable() to
require all domains to be in the shut down state with SHUTDOWN_suspend
before proceeding with the global suspend flow.
- Drop the common-level depends on !ARM_64 || !SYSTEM_SUSPEND from
CONFIG_HAS_HWDOM_SHUTDOWN_ON_SUSPEND and model the ARM64 suspend case
with an arch-selected capability instead.
- Rename CONFIG_HAS_HWDOM_SHUTDOWN_ON_SUSPEND to
CONFIG_HAS_HWDOM_SYSTEM_SUSPEND.
- Rename need_hwdom_shutdown() to want_hwdom_shutdown().
Changes in V7:
- Control domain is responsible for host suspend.
- Add an empty inline host_system_suspend() function when SYSTEM_SUSPEND
config is disabled.
- Use IS_ENABLED() for config checking instead of #ifdef.
- Replace #ifdef checks in domain_shutdown() with IS_ENABLED() to simplify
control flow.
- Factor hardware domain shutdown condition into a helper
(need_hwdom_shutdown()) to avoid preprocessor directives inside the
function.
- Squash with iommu suspend/resume commit.
---
xen/arch/arm/Kconfig | 1 +
xen/arch/arm/include/asm/mm.h | 2 +
xen/arch/arm/include/asm/suspend.h | 2 +
xen/arch/arm/mmu/smpboot.c | 2 +-
xen/arch/arm/suspend.c | 140 +++++++++++++++++++++++++++++
xen/arch/arm/vpsci.c | 10 ++-
6 files changed, 154 insertions(+), 3 deletions(-)
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 54a5bfb9ae..119bc00674 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -9,6 +9,7 @@ config ARM_64
select 64BIT
select HAS_DOMAIN_TYPE
select HAS_FAST_MULTIPLY
+ select HAS_SYSTEM_SUSPEND if !MPU && UNSUPPORTED
select HAS_VPCI_GUEST_SUPPORT if PCI_PASSTHROUGH
config ARM
diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 2eb8465aa9..de119cad3a 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -360,6 +360,8 @@ static inline void page_set_xenheap_gfn(struct page_info *p, gfn_t gfn)
} while ( (y = cmpxchg(&p->u.inuse.type_info, x, nx)) != x );
}
+void set_init_ttbr(lpae_t *root);
+
#endif /* __ARCH_ARM_MM__ */
/*
* Local variables:
diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
index 87db12eac3..a194dbb21a 100644
--- a/xen/arch/arm/include/asm/suspend.h
+++ b/xen/arch/arm/include/asm/suspend.h
@@ -40,11 +40,13 @@ int prepare_resume_ctx(void);
void hyp_resume(void);
bool host_system_suspend_allowed(void);
void host_system_suspend_disable(const char *reason);
+void host_system_suspend(struct domain *d);
#else /* !CONFIG_SYSTEM_SUSPEND */
static inline bool host_system_suspend_allowed(void) { return false; }
static inline void host_system_suspend_disable(const char *reason) {}
+static inline void host_system_suspend(struct domain *d) {}
#endif
diff --git a/xen/arch/arm/mmu/smpboot.c b/xen/arch/arm/mmu/smpboot.c
index 37e91d72b7..ff508ecf40 100644
--- a/xen/arch/arm/mmu/smpboot.c
+++ b/xen/arch/arm/mmu/smpboot.c
@@ -72,7 +72,7 @@ static void clear_boot_pagetables(void)
clear_table(boot_third);
}
-static void set_init_ttbr(lpae_t *root)
+void set_init_ttbr(lpae_t *root)
{
/*
* init_ttbr is part of the identity mapping which is read-only. So
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
index a571035d2c..b1cc67fbdb 100644
--- a/xen/arch/arm/suspend.c
+++ b/xen/arch/arm/suspend.c
@@ -1,10 +1,16 @@
/* SPDX-License-Identifier: GPL-2.0-only */
+#include <asm/gic.h>
#include <asm/psci.h>
#include <asm/suspend.h>
+#include <xen/console.h>
+#include <xen/cpu.h>
+#include <xen/iommu.h>
#include <xen/lib.h>
+#include <xen/sched.h>
#include <xen/serial.h>
+#include <xen/tasklet.h>
struct resume_cpu_context resume_cpu_context;
@@ -44,6 +50,140 @@ void host_system_suspend_disable(const char *reason)
reason ? reason : "unsupported suspend/resume path");
}
+/* Xen suspend. data identifies the domain that initiated suspend. */
+static void system_suspend(void *data)
+{
+ int status;
+ unsigned long flags;
+ struct domain *d = (struct domain *)data;
+
+ BUG_ON(system_state != SYS_STATE_active);
+
+ system_state = SYS_STATE_suspend;
+
+ printk("Xen suspending...\n");
+
+ freeze_domains();
+ scheduler_disable();
+
+ /*
+ * Non-boot CPUs have to be disabled on suspend and enabled on resume
+ * (hotplug-based mechanism). Disabling non-boot CPUs will lead to PSCI
+ * CPU_OFF to be called by each non-boot CPU. Depending on the underlying
+ * platform capabilities, this may lead to the physical powering down of
+ * CPUs.
+ */
+ status = disable_nonboot_cpus();
+ if ( status )
+ {
+ system_state = SYS_STATE_resume;
+ goto resume_nonboot_cpus;
+ }
+
+ console_start_sync();
+ status = iommu_suspend();
+ if ( status )
+ {
+ system_state = SYS_STATE_resume;
+ goto resume_end_sync;
+ }
+
+ status = console_suspend();
+ if ( status )
+ {
+ dprintk(XENLOG_ERR, "Failed to suspend the console, err=%d\n", status);
+ system_state = SYS_STATE_resume;
+ goto resume_iommu;
+ }
+
+ local_irq_save(flags);
+
+ time_suspend();
+
+ status = gic_suspend();
+ if ( status )
+ {
+ system_state = SYS_STATE_resume;
+ goto resume_time;
+ }
+
+ set_init_ttbr(xen_pgtable);
+
+ /*
+ * Enable identity mapping before entering suspend to simplify
+ * the resume path
+ */
+ update_boot_mapping(true);
+
+ if ( prepare_resume_ctx() )
+ {
+ status = call_psci_system_suspend();
+ /*
+ * If suspend is finalized properly by above system suspend PSCI call,
+ * the code below in this 'if' branch will never execute. Execution
+ * will continue from hyp_resume which is the hypervisor's resume point.
+ * In hyp_resume CPU context will be restored and since link-register is
+ * restored as well, it will appear to return from prepare_resume_ctx.
+ * The difference in returning from prepare_resume_ctx on system suspend
+ * versus resume is in function's return value: on suspend, the return
+ * value is a non-zero value, on resume it is zero. That is why the
+ * control flow will not re-enter this 'if' branch on resume.
+ */
+ if ( status )
+ dprintk(XENLOG_WARNING, "PSCI system suspend failed, err=%d\n",
+ status);
+ }
+
+ system_state = SYS_STATE_resume;
+ update_boot_mapping(false);
+
+ gic_resume();
+
+ resume_time:
+ time_resume();
+
+ local_irq_restore(flags);
+
+ console_resume();
+
+ resume_iommu:
+ iommu_resume();
+
+ resume_end_sync:
+ console_end_sync();
+
+ resume_nonboot_cpus:
+ /*
+ * The rcu_barrier() has to be added to ensure that the per cpu area is
+ * freed before a non-boot CPU tries to initialize it (_free_percpu_area()
+ * has to be called before the init_percpu_area()). This scenario occurs
+ * when non-boot CPUs are hot-unplugged on suspend and hotplugged on resume.
+ */
+ rcu_barrier();
+ enable_nonboot_cpus();
+
+ scheduler_enable();
+ thaw_domains();
+
+ system_state = SYS_STATE_active;
+
+ printk("Resume (status %d)\n", status);
+
+ domain_resume(d);
+}
+
+static DECLARE_TASKLET(system_suspend_tasklet, system_suspend, NULL);
+
+void host_system_suspend(struct domain *d)
+{
+ system_suspend_tasklet.data = (void *)d;
+ /*
+ * The suspend procedure has to be finalized by the pCPU#0 (non-boot pCPUs
+ * will be disabled during the suspend).
+ */
+ tasklet_schedule_on_cpu(&system_suspend_tasklet, 0);
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
index 48a963ae3e..79100cbb1e 100644
--- a/xen/arch/arm/vpsci.c
+++ b/xen/arch/arm/vpsci.c
@@ -237,7 +237,8 @@ static bool domain_in_suspend_state(struct domain *d)
return suspended;
}
-static int32_t domain_psci_system_suspend_policy(struct domain *d)
+static int32_t domain_psci_system_suspend_policy(struct domain *d,
+ bool *host_suspend)
{
struct domain *other;
bool last_awake_control_domain = true;
@@ -300,6 +301,7 @@ static int32_t domain_psci_system_suspend_policy(struct domain *d)
if ( !host_system_suspend_allowed() )
return PSCI_NOT_SUPPORTED;
+ *host_suspend = true;
return 0;
}
@@ -310,6 +312,7 @@ static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
struct vcpu *v;
struct domain *d = current->domain;
bool is_thumb = epoint & 1;
+ bool host_suspend = false;
struct resume_info *rctx = &d->arch.resume_ctx;
/* THUMB set is not allowed with 64-bit domain */
@@ -334,7 +337,7 @@ static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
spin_lock(&vpsci_system_suspend_lock);
- rc = domain_psci_system_suspend_policy(d);
+ rc = domain_psci_system_suspend_policy(d, &host_suspend);
if ( !rc )
{
rc = domain_shutdown(d, SHUTDOWN_suspend);
@@ -359,6 +362,9 @@ static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
"SYSTEM_SUSPEND requested, epoint=%#"PRIregister", cid=%#"PRIregister"\n",
epoint, cid);
+ if ( host_suspend )
+ host_system_suspend(d);
+
return rc;
}
--
2.43.0
* Re: [PATCH v9 13/13] xen/arm: Add host system suspend backend
2026-05-12 17:07 ` [PATCH v9 13/13] xen/arm: Add host system suspend backend Mykola Kvach
@ 2026-05-15 8:44 ` Luca Fancellu
0 siblings, 0 replies; 29+ messages in thread
From: Luca Fancellu @ 2026-05-15 8:44 UTC (permalink / raw)
To: Mykola Kvach
Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
Hi Mykola,
> diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
> index 2eb8465aa9..de119cad3a 100644
> --- a/xen/arch/arm/include/asm/mm.h
> +++ b/xen/arch/arm/include/asm/mm.h
> @@ -360,6 +360,8 @@ static inline void page_set_xenheap_gfn(struct page_info *p, gfn_t gfn)
> } while ( (y = cmpxchg(&p->u.inuse.type_info, x, nx)) != x );
> }
>
> +void set_init_ttbr(lpae_t *root);
Since this is MMU only, shall we move it to asm/mmu/mm.h?
with that fixed:
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Cheers,
Luca