All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64
@ 2025-08-11 20:47 Mykola Kvach
  2025-08-11 20:47 ` [PATCH v5 01/12] xen/arm: Add suspend and resume timer helpers Mykola Kvach
                   ` (12 more replies)
  0 siblings, 13 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-11 20:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
	Jan Beulich, Roger Pau Monné

From: Mykola Kvach <mykola_kvach@epam.com>

This is part 2 of version 5 of the ARM Xen system suspend/resume patch
series, based on earlier work by Mirela Simonovic and Mykyta Poturai.
This version is ported to Xen master (4.21-unstable) and includes
extensive improvements based on reviewer feedback. The patch series
restructures code to improve robustness, maintainability, and implements
system Suspend-to-RAM support on ARM64 hardware domains.

At a high-level, this patch series provides:
 - Support for Host system suspend/resume via PSCI SYSTEM_SUSPEND (ARM64)
 - Suspend/resume infrastructure for CPU context, timers, GICv2/GICv3 and IPMMU-VMSA
 - Proper error propagation and recovery throughout the suspend/resume flow

Key updates in this series:
 - Introduced architecture-specific suspend/resume infrastructure (new `suspend.c`, `suspend.h`, low-level context save/restore in `head.S`)
 - Integrated GICv2/GICv3 suspend and resume, including memory-backed context save/restore with error handling
 - Added time and IRQ suspend/resume hooks, ensuring correct timer/interrupt state across suspend cycles
 - Implemented proper PSCI SYSTEM_SUSPEND invocation and version checks
 - Improved state management and recovery in error cases during suspend/resume
 - Added support for IPMMU-VMSA context save/restore

---
TODOs:
 - Test system suspend with llc_coloring_enabled set and verify functionality
 - Implement SMMUv3 suspend/resume handlers
 - Enable "xl suspend" support on ARM
 - Properly disable Xen timer watchdog from relevant services (only init.d left)
 - Add suspend/resume CI test for ARM (QEMU if feasible)
 - Investigate feasibility and need for implementing system suspend on ARM32
---
Changes introduced in V5:
 - Add support for IPMMU-VMSA context save/restore
 - Add support for GICv3 context save/restore
 - Select HAS_SYSTEM_SUSPEND in ARM_64 instead of ARM
 - Check llc_coloring_enabled instead of LLC_COLORING during the selection
   of HAS_SYSTEM_SUSPEND config
 - Call host_system_suspend from guest PSCI system suspend instead of
   arch_domain_shutdown, reducing the complexity of the new code

Changes introduced in V4:
  - Remove the prior tasklet-based workaround in favor of a more
    straightforward and safer solution.
  - Rework the approach by adding explicit system state checks around
    request_irq and release_irq calls; skip these calls during suspend
    and resume states to avoid unsafe memory operations when IRQs are
    disabled.
  - Prevent reinitialization of local IRQ descriptors on system resume.
  - Restore the state of local IRQs during system resume for secondary CPUs.
  - Drop code for saving and restoring VCPU context (see part 1 of the patch
    series for details).
  - Remove IOMMU suspend and resume calls until these features are implemented.
  - Move system suspend logic to arch_domain_shutdown, invoked from
    domain_shutdown.
  - Add console_end_sync to the resume path after system suspend.
  - Drop unnecessary DAIF masking; interrupts are already masked on resume.
  - Remove leftover TLB flush instructions; flushing is handled in enable_mmu.
  - Avoid setting x19 in hyp_resume as it is not required.
  - Replace prepare_secondary_mm with set_init_ttbr, and call it from system_suspend.
  - Produce a build-time error for ARM32 when CONFIG_SYSTEM_SUSPEND is enabled.
  - Use register_t instead of uint64_t in the cpu_context structure.
  - Apply minor fixes such as renaming functions, updating comments, and
    modifying commit messages to accurately reflect the changes introduced
    by this patch series.

For earlier changelogs, please refer to the previous cover letters.

Previous versions:
  V1: https://marc.info/?l=xen-devel&m=154202231501850&w=2
  V2: https://marc.info/?l=xen-devel&m=166514782207736&w=2
  V3: https://lists.xen.org/archives/html/xen-devel/2025-03/msg00168.html

Mirela Simonovic (6):
  xen/arm: Add suspend and resume timer helpers
  xen/arm: gic-v2: Implement GIC suspend/resume functions
  xen/arm: Add support for system suspend triggered by hardware domain
  xen/arm: Implement PSCI SYSTEM_SUSPEND call (host interface)
  xen/arm: Resume memory management on Xen resume
  xen/arm: Save/restore context on suspend/resume

Mykola Kvach (4):
  xen/arm: gic-v3: Implement GICv3 suspend/resume functions
  xen/arm: Prevent crash during disable_nonboot_cpus on suspend
  xen/arm: irq: avoid local IRQ descriptors reinit on system resume
  xen/arm: irq: Restore state of local IRQs during system resume

Oleksandr Tyshchenko (2):
  iommu/ipmmu-vmsa: Implement suspend/resume callbacks
  xen/arm: Suspend/resume IOMMU on Xen suspend/resume

 xen/arch/arm/Kconfig                     |   1 +
 xen/arch/arm/Makefile                    |   1 +
 xen/arch/arm/arm64/head.S                | 113 ++++++++++
 xen/arch/arm/gic-v2.c                    | 154 +++++++++++++
 xen/arch/arm/gic-v3.c                    | 233 ++++++++++++++++++++
 xen/arch/arm/gic.c                       |  35 +++
 xen/arch/arm/include/asm/gic.h           |  12 +
 xen/arch/arm/include/asm/mm.h            |   2 +
 xen/arch/arm/include/asm/psci.h          |   1 +
 xen/arch/arm/include/asm/suspend.h       |  44 ++++
 xen/arch/arm/include/asm/time.h          |   5 +
 xen/arch/arm/irq.c                       |  48 ++++
 xen/arch/arm/mmu/smpboot.c               |   2 +-
 xen/arch/arm/psci.c                      |  23 +-
 xen/arch/arm/suspend.c                   | 196 +++++++++++++++++
 xen/arch/arm/tee/ffa_notif.c             |   2 +-
 xen/arch/arm/time.c                      |  56 ++++-
 xen/arch/arm/vpsci.c                     |  12 +-
 xen/common/domain.c                      |   4 +
 xen/drivers/passthrough/arm/ipmmu-vmsa.c | 269 +++++++++++++++++++++++
 20 files changed, 1197 insertions(+), 16 deletions(-)
 create mode 100644 xen/arch/arm/include/asm/suspend.h
 create mode 100644 xen/arch/arm/suspend.c

-- 
2.48.1



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v5 01/12] xen/arm: Add suspend and resume timer helpers
  2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
@ 2025-08-11 20:47 ` Mykola Kvach
  2025-08-11 20:47 ` [PATCH v5 02/12] xen/arm: gic-v2: Implement GIC suspend/resume functions Mykola Kvach
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-11 20:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Mirela Simonovic, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi,
	Mykola Kvach

From: Mirela Simonovic <mirela.simonovic@aggios.com>

Timer interrupts must be disabled while the system is suspended to prevent
spurious wake-ups. Suspending the timers involves disabling both the EL1
physical timer and the EL2 hypervisor timer. Resuming consists of raising
the TIMER_SOFTIRQ, which prompts the generic timer code to reprogram the
EL2 timer as needed. Re-enabling of the EL1 timer is left to the entity
that uses it.

Introduce a new helper, disable_physical_timers, to encapsulate disabling
of the physical timers.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in V4:
  - Rephrased comment and commit message for better clarity
  - Created separate function for disabling physical timers

Changes in V3:
  - time_suspend and time_resume are now conditionally compiled
    under CONFIG_SYSTEM_SUSPEND
---
 xen/arch/arm/include/asm/time.h |  5 +++++
 xen/arch/arm/time.c             | 38 +++++++++++++++++++++++++++------
 2 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/include/asm/time.h b/xen/arch/arm/include/asm/time.h
index 49ad8c1a6d..f4fd0c6af5 100644
--- a/xen/arch/arm/include/asm/time.h
+++ b/xen/arch/arm/include/asm/time.h
@@ -108,6 +108,11 @@ void preinit_xen_time(void);
 
 void force_update_vcpu_system_time(struct vcpu *v);
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+void time_suspend(void);
+void time_resume(void);
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
 #endif /* __ARM_TIME_H__ */
 /*
  * Local variables:
diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
index e74d30d258..ad984fdfdd 100644
--- a/xen/arch/arm/time.c
+++ b/xen/arch/arm/time.c
@@ -303,6 +303,14 @@ static void check_timer_irq_cfg(unsigned int irq, const char *which)
            "WARNING: %s-timer IRQ%u is not level triggered.\n", which, irq);
 }
 
+/* Disable physical timers for EL1 and EL2 on the current CPU */
+static inline void disable_physical_timers(void)
+{
+    WRITE_SYSREG(0, CNTP_CTL_EL0);    /* Physical timer disabled */
+    WRITE_SYSREG(0, CNTHP_CTL_EL2);   /* Hypervisor's timer disabled */
+    isb();
+}
+
 /* Set up the timer interrupt on this CPU */
 void init_timer_interrupt(void)
 {
@@ -310,9 +318,7 @@ void init_timer_interrupt(void)
     WRITE_SYSREG64(0, CNTVOFF_EL2);     /* No VM-specific offset */
     /* Do not let the VMs program the physical timer, only read the physical counter */
     WRITE_SYSREG(CNTHCTL_EL2_EL1PCTEN, CNTHCTL_EL2);
-    WRITE_SYSREG(0, CNTP_CTL_EL0);    /* Physical timer disabled */
-    WRITE_SYSREG(0, CNTHP_CTL_EL2);   /* Hypervisor's timer disabled */
-    isb();
+    disable_physical_timers();
 
     request_irq(timer_irq[TIMER_HYP_PPI], 0, htimer_interrupt,
                 "hyptimer", NULL);
@@ -330,9 +336,7 @@ void init_timer_interrupt(void)
  */
 static void deinit_timer_interrupt(void)
 {
-    WRITE_SYSREG(0, CNTP_CTL_EL0);    /* Disable physical timer */
-    WRITE_SYSREG(0, CNTHP_CTL_EL2);   /* Disable hypervisor's timer */
-    isb();
+    disable_physical_timers();
 
     release_irq(timer_irq[TIMER_HYP_PPI], NULL);
     release_irq(timer_irq[TIMER_VIRT_PPI], NULL);
@@ -372,6 +376,28 @@ void domain_set_time_offset(struct domain *d, int64_t time_offset_seconds)
     /* XXX update guest visible wallclock time */
 }
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+void time_suspend(void)
+{
+    disable_physical_timers();
+}
+
+void time_resume(void)
+{
+    /*
+     * Raising the timer softirq triggers generic code to call reprogram_timer
+     * with the correct timeout (not known here).
+     *
+     * No further action is needed to restore timekeeping after power down,
+     * since the system counter is unaffected. See ARM DDI 0487 L.a, D12.1.2
+     * "The system counter must be implemented in an always-on power domain."
+     */
+    raise_softirq(TIMER_SOFTIRQ);
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
 static int cpu_time_callback(struct notifier_block *nfb,
                              unsigned long action,
                              void *hcpu)
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v5 02/12] xen/arm: gic-v2: Implement GIC suspend/resume functions
  2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
  2025-08-11 20:47 ` [PATCH v5 01/12] xen/arm: Add suspend and resume timer helpers Mykola Kvach
@ 2025-08-11 20:47 ` Mykola Kvach
  2025-08-23  0:01   ` Volodymyr Babchuk
  2025-08-11 20:47 ` [PATCH v5 03/12] xen/arm: gic-v3: Implement GICv3 " Mykola Kvach
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Mykola Kvach @ 2025-08-11 20:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Mirela Simonovic, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi,
	Mykyta Poturai, Mykola Kvach

From: Mirela Simonovic <mirela.simonovic@aggios.com>

System suspend may lead to a state where GIC would be powered down.
Therefore, Xen should save/restore the context of GIC on suspend/resume.

Note that the context consists of states of registers which are
controlled by the hypervisor. Other GIC registers which are accessible
by guests are saved/restored on context switch.

Tested on Xilinx Ultrascale+ MPSoC with (and without) powering down
the GIC.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in v4:
  - Add error logging for allocation failures

Changes in v3:
  - Drop asserts and return error codes instead.
  - Wrap code with CONFIG_SYSTEM_SUSPEND.

Changes in v2:
  - Minor fixes after review.
---
 xen/arch/arm/gic-v2.c          | 154 +++++++++++++++++++++++++++++++++
 xen/arch/arm/gic.c             |  29 +++++++
 xen/arch/arm/include/asm/gic.h |  12 +++
 3 files changed, 195 insertions(+)

diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index b23e72a3d0..dce8f5e924 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -1098,6 +1098,151 @@ static int gicv2_iomem_deny_access(struct domain *d)
     return iomem_deny_access(d, mfn, mfn + nr);
 }
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+/* GICv2 registers to be saved/restored on system suspend/resume */
+struct gicv2_context {
+    /* GICC context */
+    uint32_t gicc_ctlr;
+    uint32_t gicc_pmr;
+    uint32_t gicc_bpr;
+    /* GICD context */
+    uint32_t gicd_ctlr;
+    uint32_t *gicd_isenabler;
+    uint32_t *gicd_isactiver;
+    uint32_t *gicd_ipriorityr;
+    uint32_t *gicd_itargetsr;
+    uint32_t *gicd_icfgr;
+};
+
+static struct gicv2_context gicv2_context;
+
+static int gicv2_suspend(void)
+{
+    unsigned int i;
+
+    if ( !gicv2_context.gicd_isenabler )
+    {
+        dprintk(XENLOG_WARNING, "%s:%d: GICv2 suspend context not allocated!\n",
+            __func__, __LINE__);
+        return -ENOMEM;
+    }
+
+    /* Save GICC configuration */
+    gicv2_context.gicc_ctlr = readl_gicc(GICC_CTLR);
+    gicv2_context.gicc_pmr = readl_gicc(GICC_PMR);
+    gicv2_context.gicc_bpr = readl_gicc(GICC_BPR);
+
+    /* Save GICD configuration */
+    gicv2_context.gicd_ctlr = readl_gicd(GICD_CTLR);
+
+    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ )
+        gicv2_context.gicd_isenabler[i] = readl_gicd(GICD_ISENABLER + i * 4);
+
+    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ )
+        gicv2_context.gicd_isactiver[i] = readl_gicd(GICD_ISACTIVER + i * 4);
+
+    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
+        gicv2_context.gicd_ipriorityr[i] = readl_gicd(GICD_IPRIORITYR + i * 4);
+
+    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
+        gicv2_context.gicd_itargetsr[i] = readl_gicd(GICD_ITARGETSR + i * 4);
+
+    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 16); i++ )
+        gicv2_context.gicd_icfgr[i] = readl_gicd(GICD_ICFGR + i * 4);
+
+    return 0;
+}
+
+static void gicv2_resume(void)
+{
+    unsigned int i;
+
+    if ( !gicv2_context.gicd_isenabler )
+    {
+        dprintk(XENLOG_WARNING, "%s:%d: GICv2 suspend context not allocated!\n",
+            __func__, __LINE__);
+        return;
+    }
+
+    gicv2_cpu_disable();
+    /* Disable distributor */
+    writel_gicd(0, GICD_CTLR);
+
+    /* Restore GICD configuration */
+    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ ) {
+        writel_gicd(0xffffffff, GICD_ICENABLER + i * 4);
+        writel_gicd(gicv2_context.gicd_isenabler[i], GICD_ISENABLER + i * 4);
+    }
+
+    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ ) {
+        writel_gicd(0xffffffff, GICD_ICACTIVER + i * 4);
+        writel_gicd(gicv2_context.gicd_isactiver[i], GICD_ISACTIVER + i * 4);
+    }
+
+    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
+        writel_gicd(gicv2_context.gicd_ipriorityr[i], GICD_IPRIORITYR + i * 4);
+
+    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
+        writel_gicd(gicv2_context.gicd_itargetsr[i], GICD_ITARGETSR + i * 4);
+
+    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 16); i++ )
+        writel_gicd(gicv2_context.gicd_icfgr[i], GICD_ICFGR + i * 4);
+
+    /* Make sure all registers are restored and enable distributor */
+    writel_gicd(gicv2_context.gicd_ctlr | GICD_CTL_ENABLE, GICD_CTLR);
+
+    /* Restore GIC CPU interface configuration */
+    writel_gicc(gicv2_context.gicc_pmr, GICC_PMR);
+    writel_gicc(gicv2_context.gicc_bpr, GICC_BPR);
+
+    /* Enable GIC CPU interface */
+    writel_gicc(gicv2_context.gicc_ctlr | GICC_CTL_ENABLE | GICC_CTL_EOI,
+                GICC_CTLR);
+}
+
+static void gicv2_alloc_context(struct gicv2_context *gc)
+{
+    uint32_t n = gicv2_info.nr_lines;
+
+    gc->gicd_isenabler = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 32));
+    if ( !gc->gicd_isenabler )
+        goto err_free;
+
+    gc->gicd_isactiver = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 32));
+    if ( !gc->gicd_isactiver )
+        goto err_free;
+
+    gc->gicd_itargetsr = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 4));
+    if ( !gc->gicd_itargetsr )
+        goto err_free;
+
+    gc->gicd_ipriorityr = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 4));
+    if ( !gc->gicd_ipriorityr )
+        goto err_free;
+
+    gc->gicd_icfgr = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 16));
+    if ( !gc->gicd_icfgr )
+        goto err_free;
+
+    return;
+
+ err_free:
+    dprintk(XENLOG_ERR,
+            "%s:%d: failed to allocate memory for GICv2 suspend context\n",
+            __func__, __LINE__);
+
+    xfree(gc->gicd_icfgr);
+    xfree(gc->gicd_ipriorityr);
+    xfree(gc->gicd_itargetsr);
+    xfree(gc->gicd_isactiver);
+    xfree(gc->gicd_isenabler);
+
+    memset(gc, 0, sizeof(*gc));
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
 #ifdef CONFIG_ACPI
 static unsigned long gicv2_get_hwdom_extra_madt_size(const struct domain *d)
 {
@@ -1302,6 +1447,11 @@ static int __init gicv2_init(void)
 
     spin_unlock(&gicv2.lock);
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+    /* Allocate memory to be used for saving GIC context during the suspend */
+    gicv2_alloc_context(&gicv2_context);
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
     return 0;
 }
 
@@ -1345,6 +1495,10 @@ static const struct gic_hw_operations gicv2_ops = {
     .map_hwdom_extra_mappings = gicv2_map_hwdom_extra_mappings,
     .iomem_deny_access   = gicv2_iomem_deny_access,
     .do_LPI              = gicv2_do_LPI,
+#ifdef CONFIG_SYSTEM_SUSPEND
+    .suspend             = gicv2_suspend,
+    .resume              = gicv2_resume,
+#endif /* CONFIG_SYSTEM_SUSPEND */
 };
 
 /* Set up the GIC */
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index e80fe0ca24..a018bd7715 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -425,6 +425,35 @@ int gic_iomem_deny_access(struct domain *d)
     return gic_hw_ops->iomem_deny_access(d);
 }
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+int gic_suspend(void)
+{
+    /* Must be called by boot CPU#0 with interrupts disabled */
+    ASSERT(!local_irq_is_enabled());
+    ASSERT(!smp_processor_id());
+
+    if ( !gic_hw_ops->suspend || !gic_hw_ops->resume )
+        return -ENOSYS;
+
+    return gic_hw_ops->suspend();
+}
+
+void gic_resume(void)
+{
+    /*
+     * Must be called by boot CPU#0 with interrupts disabled after gic_suspend
+     * has returned successfully.
+     */
+    ASSERT(!local_irq_is_enabled());
+    ASSERT(!smp_processor_id());
+    ASSERT(gic_hw_ops->resume);
+
+    gic_hw_ops->resume();
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
 static int cpu_gic_callback(struct notifier_block *nfb,
                             unsigned long action,
                             void *hcpu)
diff --git a/xen/arch/arm/include/asm/gic.h b/xen/arch/arm/include/asm/gic.h
index 541f0eeb80..a706303008 100644
--- a/xen/arch/arm/include/asm/gic.h
+++ b/xen/arch/arm/include/asm/gic.h
@@ -280,6 +280,12 @@ extern int gicv_setup(struct domain *d);
 extern void gic_save_state(struct vcpu *v);
 extern void gic_restore_state(struct vcpu *v);
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+/* Suspend/resume */
+extern int gic_suspend(void);
+extern void gic_resume(void);
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
 /* SGI (AKA IPIs) */
 enum gic_sgi {
     GIC_SGI_EVENT_CHECK,
@@ -395,6 +401,12 @@ struct gic_hw_operations {
     int (*iomem_deny_access)(struct domain *d);
     /* Handle LPIs, which require special handling */
     void (*do_LPI)(unsigned int lpi);
+#ifdef CONFIG_SYSTEM_SUSPEND
+    /* Save GIC configuration due to the system suspend */
+    int (*suspend)(void);
+    /* Restore GIC configuration due to the system resume */
+    void (*resume)(void);
+#endif /* CONFIG_SYSTEM_SUSPEND */
 };
 
 extern const struct gic_hw_operations *gic_hw_ops;
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v5 03/12] xen/arm: gic-v3: Implement GICv3 suspend/resume functions
  2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
  2025-08-11 20:47 ` [PATCH v5 01/12] xen/arm: Add suspend and resume timer helpers Mykola Kvach
  2025-08-11 20:47 ` [PATCH v5 02/12] xen/arm: gic-v2: Implement GIC suspend/resume functions Mykola Kvach
@ 2025-08-11 20:47 ` Mykola Kvach
  2025-08-23  0:20   ` Volodymyr Babchuk
  2025-08-11 20:48 ` [PATCH v5 04/12] xen/arm: Prevent crash during disable_nonboot_cpus on suspend Mykola Kvach
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Mykola Kvach @ 2025-08-11 20:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk

From: Mykola Kvach <mykola_kvach@epam.com>

System suspend may lead to a state where GIC would be powered down.
Therefore, Xen should save/restore the context of GIC on suspend/resume.

Note that the context consists of states of registers which are
controlled by the hypervisor. Other GIC registers which are accessible
by guests are saved/restored on context switch.

Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
 xen/arch/arm/gic-v3.c | 233 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 233 insertions(+)

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index cd3e1acf79..a9b65ff5d4 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1776,6 +1776,231 @@ static bool gic_dist_supports_lpis(void)
     return (readl_relaxed(GICD + GICD_TYPER) & GICD_TYPE_LPIS);
 }
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+/* GICv3 registers to be saved/restored on system suspend/resume */
+struct gicv3_ctx {
+    struct dist_ctx {
+        uint32_t ctlr;
+        /*
+         * This struct represent block of 32 IRQs
+         * TODO: store extended SPI configuration (GICv3.1+)
+         */
+        struct irq_regs {
+            uint32_t icfgr[2];
+            uint32_t ipriorityr[8];
+            uint64_t irouter[32];
+            uint32_t isactiver;
+            uint32_t isenabler;
+        } *irqs;
+    } dist;
+
+    /* have only one rdist structure for last running CPU during suspend */
+    struct redist_ctx {
+        uint32_t ctlr;
+        /* TODO: handle case when we have more than 16 PPIs (GICv3.1+) */
+        uint32_t icfgr[2];
+        uint32_t igroupr;
+        uint32_t ipriorityr[8];
+        uint32_t isactiver;
+        uint32_t isenabler;
+    } rdist;
+
+    struct cpu_ctx {
+        uint32_t ctlr;
+        uint32_t pmr;
+        uint32_t bpr;
+        uint32_t sre_el2;
+        uint32_t grpen;
+    } cpu;
+};
+
+static struct gicv3_ctx gicv3_ctx;
+
+static void __init gicv3_alloc_context(void)
+{
+    uint32_t blocks = DIV_ROUND_UP(gicv3_info.nr_lines, 32);
+
+    if ( gicv3_its_host_has_its() )
+        return;
+
+    /* according to spec it is possible don't have SPIs */
+    if ( blocks == 1 )
+        return;
+
+    gicv3_ctx.dist.irqs = xzalloc_array(typeof(*gicv3_ctx.dist.irqs), blocks - 1);
+    if ( !gicv3_ctx.dist.irqs )
+        dprintk(XENLOG_ERR,
+                "%s:%d: failed to allocate memory for GICv3 suspend context\n",
+                __func__, __LINE__);
+}
+
+static void gicv3_disable_redist(void)
+{
+    void __iomem* waker = GICD_RDIST_BASE + GICR_WAKER;
+
+    writel_relaxed(readl_relaxed(waker) | GICR_WAKER_ProcessorSleep, waker);
+    while ( (readl_relaxed(waker) & GICR_WAKER_ChildrenAsleep) == 0 );
+}
+
+static int gicv3_suspend(void)
+{
+    unsigned int i;
+    void __iomem *base;
+    typeof(gicv3_ctx.rdist)* rdist = &gicv3_ctx.rdist;
+
+    /* TODO: implement support for ITS */
+    if ( gicv3_its_host_has_its() )
+        return -EOPNOTSUPP;
+
+    if ( !gicv3_ctx.dist.irqs && gicv3_info.nr_lines > NR_GIC_LOCAL_IRQS )
+    {
+        dprintk(XENLOG_WARNING,
+                "%s:%d: GICv3 suspend context is not allocated!\n",
+                __func__, __LINE__);
+        return -ENOMEM;
+    }
+
+    gicv3_save_state(current);
+
+    /* Save GICC configuration */
+    gicv3_ctx.cpu.ctlr     = READ_SYSREG(ICC_CTLR_EL1);
+    gicv3_ctx.cpu.pmr      = READ_SYSREG(ICC_PMR_EL1);
+    gicv3_ctx.cpu.bpr      = READ_SYSREG(ICC_BPR1_EL1);
+    gicv3_ctx.cpu.sre_el2  = READ_SYSREG(ICC_SRE_EL2);
+    gicv3_ctx.cpu.grpen    = READ_SYSREG(ICC_IGRPEN1_EL1);
+
+    gicv3_disable_interface();
+    gicv3_disable_redist();
+
+    /* Save GICR configuration */
+    gicv3_redist_wait_for_rwp();
+
+    base = GICD_RDIST_SGI_BASE;
+
+    rdist->ctlr = readl_relaxed(base + GICR_CTLR);
+
+    /* Set priority on PPI and SGI interrupts */
+    for (i = 0; i < NR_GIC_LOCAL_IRQS / 4; i += 4)
+        rdist->ipriorityr[i] = readl_relaxed(base + GICR_IPRIORITYR0 + 4 * i);
+
+    rdist->isactiver = readl_relaxed(base + GICR_ISACTIVER0);
+    rdist->isenabler = readl_relaxed(base + GICR_ISENABLER0);
+    rdist->igroupr   = readl_relaxed(base + GICR_IGROUPR0);
+    rdist->icfgr[0]  = readl_relaxed(base + GICR_ICFGR0);
+    rdist->icfgr[1]  = readl_relaxed(base + GICR_ICFGR1);
+
+    /* Save GICD configuration */
+    gicv3_dist_wait_for_rwp();
+    gicv3_ctx.dist.ctlr = readl_relaxed(GICD + GICD_CTLR);
+
+    for ( i = 1; i < DIV_ROUND_UP(gicv3_info.nr_lines, 32); i++ )
+    {
+        typeof(gicv3_ctx.dist.irqs) irqs = gicv3_ctx.dist.irqs + i - 1;
+        unsigned int irq;
+
+        base = GICD + GICD_ICFGR + 8 * i;
+        irqs->icfgr[0] = readl_relaxed(base);
+        irqs->icfgr[1] = readl_relaxed(base + 4);
+
+        base = GICD + GICD_IPRIORITYR + 32 * i;
+        for ( irq = 0; irq < 8; irq++ )
+            irqs->ipriorityr[irq] = readl_relaxed(base + 4 * irq);
+
+        base = GICD + GICD_IROUTER + 32 * i;
+        for ( irq = 0; irq < 32; irq++ )
+            irqs->irouter[irq] = readq_relaxed_non_atomic(base + 8 * irq);
+
+        irqs->isactiver = readl_relaxed(GICD + GICD_ISACTIVER + 4 * i);
+        irqs->isenabler = readl_relaxed(GICD + GICD_ISENABLER + 4 * i);
+    }
+
+    return 0;
+}
+
+static void gicv3_resume(void)
+{
+    unsigned int i;
+    void __iomem *base;
+    typeof(gicv3_ctx.rdist)* rdist = &gicv3_ctx.rdist;
+
+    if ( !gicv3_ctx.dist.irqs && gicv3_info.nr_lines > NR_GIC_LOCAL_IRQS )
+    {
+        dprintk(XENLOG_WARNING, "%s:%d: GICv3 suspend context not allocated!\n",
+            __func__, __LINE__);
+        return;
+    }
+
+    writel_relaxed(0, GICD + GICD_CTLR);
+
+    for ( i = NR_GIC_LOCAL_IRQS; i < gicv3_info.nr_lines; i += 32 )
+        writel_relaxed(GENMASK(31, 0), GICD + GICD_IGROUPR + (i / 32) * 4);
+
+    for ( i = 1; i < DIV_ROUND_UP(gicv3_info.nr_lines, 32); i++ )
+    {
+        typeof(gicv3_ctx.dist.irqs) irqs = gicv3_ctx.dist.irqs + i - 1;
+        unsigned int irq;
+
+        base = GICD + GICD_ICFGR + 8 * i;
+        writel_relaxed(irqs->icfgr[0], base);
+        writel_relaxed(irqs->icfgr[1], base + 4);
+
+        base = GICD + GICD_IPRIORITYR + 32 * i;
+        for ( irq = 0; irq < 8; irq++ )
+            writel_relaxed(irqs->ipriorityr[irq], base + 4 * irq );
+
+        base = GICD + GICD_IROUTER + 32 * i;
+        for ( irq = 0; irq < 32; irq++ )
+            writeq_relaxed_non_atomic(irqs->irouter[irq], base + 8 * irq);
+
+        writel_relaxed(irqs->isenabler, GICD + GICD_ISENABLER + i * 4);
+        writel_relaxed(irqs->isactiver, GICD + GICD_ISACTIVER + i * 4);
+    }
+
+    writel_relaxed(gicv3_ctx.dist.ctlr, GICD + GICD_CTLR);
+    gicv3_dist_wait_for_rwp();
+
+    /* Restore GICR (Redistributor) configuration */
+    gicv3_enable_redist();
+
+    base = GICD_RDIST_SGI_BASE;
+
+    writel_relaxed(0xffffffff, base + GICR_ICENABLER0);
+    gicv3_redist_wait_for_rwp();
+
+    for (i = 0; i < NR_GIC_LOCAL_IRQS / 4; i += 4)
+        writel_relaxed(rdist->ipriorityr[i], base + GICR_IPRIORITYR0 + i * 4);
+
+    writel_relaxed(rdist->isactiver, base + GICR_ISACTIVER0);
+
+    writel_relaxed(rdist->igroupr,  base + GICR_IGROUPR0);
+    writel_relaxed(rdist->icfgr[0], base + GICR_ICFGR0);
+    writel_relaxed(rdist->icfgr[1], base + GICR_ICFGR1);
+
+    gicv3_redist_wait_for_rwp();
+
+    writel_relaxed(rdist->isenabler, base + GICR_ISENABLER0);
+    writel_relaxed(rdist->ctlr, GICD_RDIST_BASE + GICR_CTLR);
+
+    gicv3_redist_wait_for_rwp();
+
+    WRITE_SYSREG(gicv3_ctx.cpu.sre_el2, ICC_SRE_EL2);
+    isb();
+
+    /* Restore CPU interface (System registers) */
+    WRITE_SYSREG(gicv3_ctx.cpu.pmr,   ICC_PMR_EL1);
+    WRITE_SYSREG(gicv3_ctx.cpu.bpr,   ICC_BPR1_EL1);
+    WRITE_SYSREG(gicv3_ctx.cpu.ctlr,  ICC_CTLR_EL1);
+    WRITE_SYSREG(gicv3_ctx.cpu.grpen, ICC_IGRPEN1_EL1);
+    isb();
+
+    gicv3_hyp_init();
+
+    gicv3_restore_state(current);
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
 /* Set up the GIC */
 static int __init gicv3_init(void)
 {
@@ -1850,6 +2075,10 @@ static int __init gicv3_init(void)
 
     gicv3_hyp_init();
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+    gicv3_alloc_context();
+#endif
+
 out:
     spin_unlock(&gicv3.lock);
 
@@ -1889,6 +2118,10 @@ static const struct gic_hw_operations gicv3_ops = {
 #endif
     .iomem_deny_access   = gicv3_iomem_deny_access,
     .do_LPI              = gicv3_do_LPI,
+#ifdef CONFIG_SYSTEM_SUSPEND
+    .suspend             = gicv3_suspend,
+    .resume              = gicv3_resume,
+#endif
 };
 
 static int __init gicv3_dt_preinit(struct dt_device_node *node, const void *data)
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v5 04/12] xen/arm: Prevent crash during disable_nonboot_cpus on suspend
  2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
                   ` (2 preceding siblings ...)
  2025-08-11 20:47 ` [PATCH v5 03/12] xen/arm: gic-v3: Implement GICv3 " Mykola Kvach
@ 2025-08-11 20:48 ` Mykola Kvach
  2025-08-23  0:36   ` Volodymyr Babchuk
  2025-08-11 20:48 ` [PATCH v5 05/12] xen/arm: irq: avoid local IRQ descriptors reinit on system resume Mykola Kvach
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Mykola Kvach @ 2025-08-11 20:48 UTC (permalink / raw)
  To: xen-devel
  Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk

From: Mykola Kvach <mykola_kvach@epam.com>

If we call disable_nonboot_cpus on ARM64 with system_state set
to SYS_STATE_suspend, the following assertion will be triggered:

```
(XEN) [   25.582712] Disabling non-boot CPUs ...
(XEN) [   25.587032] Assertion '!in_irq() && (local_irq_is_enabled() || num_online_cpus() <= 1)' failed at common/xmalloc_tlsf.c:714
[...]
(XEN) [   25.975069] Xen call trace:
(XEN) [   25.978353]    [<00000a000022e098>] xfree+0x130/0x1a4 (PC)
(XEN) [   25.984314]    [<00000a000022e08c>] xfree+0x124/0x1a4 (LR)
(XEN) [   25.990276]    [<00000a00002747d4>] release_irq+0xe4/0xe8
(XEN) [   25.996152]    [<00000a0000278588>] time.c#cpu_time_callback+0x44/0x60
(XEN) [   26.003150]    [<00000a000021d678>] notifier_call_chain+0x7c/0xa0
(XEN) [   26.009717]    [<00000a00002018e0>] cpu.c#cpu_notifier_call_chain+0x24/0x48
(XEN) [   26.017148]    [<00000a000020192c>] cpu.c#_take_cpu_down+0x28/0x34
(XEN) [   26.023801]    [<00000a0000201944>] cpu.c#take_cpu_down+0xc/0x18
(XEN) [   26.030281]    [<00000a0000225c5c>] stop_machine.c#stopmachine_action+0xbc/0xe4
(XEN) [   26.038057]    [<00000a00002264bc>] tasklet.c#do_tasklet_work+0xb8/0x100
(XEN) [   26.045229]    [<00000a00002268a4>] do_tasklet+0x68/0xb0
(XEN) [   26.051018]    [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
(XEN) [   26.057585]    [<00000a0000277e30>] start_secondary+0x21c/0x220
(XEN) [   26.063978]    [<00000a0000361258>] 00000a0000361258
```

This happens because before invoking take_cpu_down via the stop_machine_run
function on the target CPU, stop_machine_run requests
the STOPMACHINE_DISABLE_IRQ state on that CPU. Releasing memory in
the release_irq function then triggers the assertion:

/*
 * Heap allocations may need TLB flushes which may require IRQs to be
 * enabled (except when only 1 PCPU is online).
 */

This patch adds system state checks to guard calls to request_irq
and release_irq. These calls are now skipped when system_state is
SYS_STATE_{resume,suspend}, preventing unsafe operations during
suspend/resume handling.

Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in V4:
  - removed the prior tasklet-based workaround in favor of a more
    straightforward and safer solution
  - reworked the approach by adding explicit system state checks around
    request_irq and release_irq calls, skips these calls during suspend
    and resume states to avoid unsafe memory operations when IRQs are
    disabled
---
 xen/arch/arm/gic.c           |  6 ++++++
 xen/arch/arm/tee/ffa_notif.c |  2 +-
 xen/arch/arm/time.c          | 18 ++++++++++++------
 3 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index a018bd7715..9856cb1592 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -388,6 +388,9 @@ void gic_dump_info(struct vcpu *v)
 
 void init_maintenance_interrupt(void)
 {
+    if ( system_state == SYS_STATE_resume )
+        return;
+
     request_irq(gic_hw_ops->info->maintenance_irq, 0, maintenance_interrupt,
                 "irq-maintenance", NULL);
 }
@@ -461,6 +464,9 @@ static int cpu_gic_callback(struct notifier_block *nfb,
     switch ( action )
     {
     case CPU_DYING:
+        if ( system_state == SYS_STATE_suspend )
+            break;
+
         /* This is reverting the work done in init_maintenance_interrupt */
         release_irq(gic_hw_ops->info->maintenance_irq, NULL);
         break;
diff --git a/xen/arch/arm/tee/ffa_notif.c b/xen/arch/arm/tee/ffa_notif.c
index 00efaf8f73..06f715a82b 100644
--- a/xen/arch/arm/tee/ffa_notif.c
+++ b/xen/arch/arm/tee/ffa_notif.c
@@ -347,7 +347,7 @@ void ffa_notif_init_interrupt(void)
 {
     int ret;
 
-    if ( notif_enabled && notif_sri_irq < NR_GIC_SGI )
+    if ( notif_enabled && notif_sri_irq < NR_GIC_SGI && system_state != SYS_STATE_resume )
     {
         /*
          * An error here is unlikely since the primary CPU has already
diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
index ad984fdfdd..b2e07ade43 100644
--- a/xen/arch/arm/time.c
+++ b/xen/arch/arm/time.c
@@ -320,10 +320,13 @@ void init_timer_interrupt(void)
     WRITE_SYSREG(CNTHCTL_EL2_EL1PCTEN, CNTHCTL_EL2);
     disable_physical_timers();
 
-    request_irq(timer_irq[TIMER_HYP_PPI], 0, htimer_interrupt,
-                "hyptimer", NULL);
-    request_irq(timer_irq[TIMER_VIRT_PPI], 0, vtimer_interrupt,
-                   "virtimer", NULL);
+    if ( system_state != SYS_STATE_resume )
+    {
+        request_irq(timer_irq[TIMER_HYP_PPI], 0, htimer_interrupt,
+                    "hyptimer", NULL);
+        request_irq(timer_irq[TIMER_VIRT_PPI], 0, vtimer_interrupt,
+                    "virtimer", NULL);
+    }
 
     check_timer_irq_cfg(timer_irq[TIMER_HYP_PPI], "hypervisor");
     check_timer_irq_cfg(timer_irq[TIMER_VIRT_PPI], "virtual");
@@ -338,8 +341,11 @@ static void deinit_timer_interrupt(void)
 {
     disable_physical_timers();
 
-    release_irq(timer_irq[TIMER_HYP_PPI], NULL);
-    release_irq(timer_irq[TIMER_VIRT_PPI], NULL);
+    if ( system_state != SYS_STATE_suspend )
+    {
+        release_irq(timer_irq[TIMER_HYP_PPI], NULL);
+        release_irq(timer_irq[TIMER_VIRT_PPI], NULL);
+    }
 }
 
 /* Wait a set number of microseconds */
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v5 05/12] xen/arm: irq: avoid local IRQ descriptors reinit on system resume
  2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
                   ` (3 preceding siblings ...)
  2025-08-11 20:48 ` [PATCH v5 04/12] xen/arm: Prevent crash during disable_nonboot_cpus on suspend Mykola Kvach
@ 2025-08-11 20:48 ` Mykola Kvach
  2025-08-23  0:37   ` Volodymyr Babchuk
  2025-08-11 20:48 ` [PATCH v5 06/12] xen/arm: irq: Restore state of local IRQs during " Mykola Kvach
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Mykola Kvach @ 2025-08-11 20:48 UTC (permalink / raw)
  To: xen-devel
  Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk

From: Mykola Kvach <mykola_kvach@epam.com>

On ARM, during system resume, CPUs are brought online again. This normally
triggers init_local_irq_data, which reinitializes IRQ descriptors for
banked interrupts (SGIs and PPIs).

These descriptors are statically allocated per CPU and retain valid
state across suspend/resume cycles. Re-initializing them on resume is
unnecessary and may result in loss of interrupt configuration or
restored state.

This patch skips init_local_irq_data when system_state is set to
SYS_STATE_resume to preserve banked IRQ descs state during resume.

Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
 xen/arch/arm/irq.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index 4bbf0b0664..148f184f8b 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -122,6 +122,10 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
     switch ( action )
     {
     case CPU_UP_PREPARE:
+        /* Skip local IRQ cleanup on resume */
+        if ( system_state == SYS_STATE_resume )
+            break;
+
         rc = init_local_irq_data(cpu);
         if ( rc )
             printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v5 06/12] xen/arm: irq: Restore state of local IRQs during system resume
  2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
                   ` (4 preceding siblings ...)
  2025-08-11 20:48 ` [PATCH v5 05/12] xen/arm: irq: avoid local IRQ descriptors reinit on system resume Mykola Kvach
@ 2025-08-11 20:48 ` Mykola Kvach
  2025-08-23  0:45   ` Volodymyr Babchuk
  2025-08-11 20:48 ` [PATCH v5 07/12] xen/arm: Add support for system suspend triggered by hardware domain Mykola Kvach
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Mykola Kvach @ 2025-08-11 20:48 UTC (permalink / raw)
  To: xen-devel
  Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk

From: Mykola Kvach <mykola_kvach@epam.com>

On ARM, the first 32 interrupts (SGIs and PPIs) are banked per-CPU
and not restored by gic_resume (for secondary cpus).

This patch introduces restore_local_irqs_on_resume, a function that
restores the state of local interrupts on the target CPU during
system resume.

It iterates over all local IRQs and re-enables those that were not
disabled, reprogramming their routing and affinity accordingly.

The function is invoked from start_secondary, ensuring that local IRQ
state is restored early during CPU bring-up after suspend.

Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
 xen/arch/arm/irq.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index 148f184f8b..b3ff38dc27 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -113,6 +113,47 @@ static int init_local_irq_data(unsigned int cpu)
     return 0;
 }
 
+/*
+ * The first 32 interrupts (PPIs and SGIs) are per-CPU,
+ * so call this function on the target CPU to restore them.
+ *
+ * SPIs are restored via gic_resume.
+ */
+static void restore_local_irqs_on_resume(void)
+{
+    int irq;
+
+    if ( system_state != SYS_STATE_resume )
+        return;
+
+    spin_lock(&local_irqs_type_lock);
+
+    for ( irq = 0; irq < NR_LOCAL_IRQS; irq++ )
+    {
+        struct irq_desc *desc = irq_to_desc(irq);
+
+        spin_lock(&desc->lock);
+
+        if ( test_bit(_IRQ_DISABLED, &desc->status) )
+        {
+            spin_unlock(&desc->lock);
+            continue;
+        }
+
+        /* it is needed to avoid asserts in below calls */
+        set_bit(_IRQ_DISABLED, &desc->status);
+
+        gic_route_irq_to_xen(desc, GIC_PRI_IRQ);
+
+        /* _IRQ_DISABLED is cleared by below call */
+        desc->handler->startup(desc);
+
+        spin_unlock(&desc->lock);
+    }
+
+    spin_unlock(&local_irqs_type_lock);
+}
+
 static int cpu_callback(struct notifier_block *nfb, unsigned long action,
                         void *hcpu)
 {
@@ -131,6 +172,9 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
             printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
                    cpu);
         break;
+    case CPU_STARTING:
+        restore_local_irqs_on_resume();
+        break;
     }
 
     return notifier_from_errno(rc);
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v5 07/12] xen/arm: Add support for system suspend triggered by hardware domain
  2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
                   ` (5 preceding siblings ...)
  2025-08-11 20:48 ` [PATCH v5 06/12] xen/arm: irq: Restore state of local IRQs during " Mykola Kvach
@ 2025-08-11 20:48 ` Mykola Kvach
  2025-08-12  7:18   ` Jan Beulich
  2025-08-23  1:00   ` Volodymyr Babchuk
  2025-08-11 20:48 ` [PATCH v5 08/12] xen/arm: Implement PSCI SYSTEM_SUSPEND call (host interface) Mykola Kvach
                   ` (5 subsequent siblings)
  12 siblings, 2 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-11 20:48 UTC (permalink / raw)
  To: xen-devel
  Cc: Mirela Simonovic, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
	Anthony PERARD, Jan Beulich, Roger Pau Monné, Saeed Nowshadi,
	Mykyta Poturai, Mykola Kvach

From: Mirela Simonovic <mirela.simonovic@aggios.com>

Trigger Xen suspend when the hardware domain initiates suspend via
SHUTDOWN_suspend. Redirect system suspend to CPU#0 to ensure the
suspend logic runs on the boot CPU, as required.

Introduce full suspend/resume infrastructure gated by CONFIG_SYSTEM_SUSPEND,
including logic to:
 - disable and enable non-boot physical CPUs
 - freeze and thaw domains
 - suspend and resume the GIC, timer and console
 - maintain system state before and after suspend

Remove the restriction in the PSCI interface preventing suspend from the
hardware domain.

Select HAS_SYSTEM_SUSPEND for ARM_64.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in v5:
- select HAS_SYSTEM_SUSPEND in ARM_64 instead of ARM
- check llc_coloring_enabled instead of LLC_COLORING during the selection
  of HAS_SYSTEM_SUSPEND config
- call host_system_suspend from guest PSCI system suspend instead of
  arch_domain_shutdown, reducing the complexity of the new code
- update some comments

Changes introduced in V4:
  - drop code for saving and restoring VCPU context (for more information
    refer part 1 of patch series)
  - remove IOMMU suspend and resume calls until they will be implemented
  - move system suspend logic to arch_domain_shutdown, invoked from
    domain_shutdown
  - apply minor fixes such as renaming functions, updating comments, and
    modifying the commit message to reflect these changes
  - add console_end_sync to resume path after system suspend

Changes introduced in V3:
  - merge changes from other commits into this patch (previously stashed):
    1) disable/enable non-boot physical CPUs and freeze/thaw domains during
       suspend/resume
    2) suspend/resume the timer, GIC, console, IOMMU, and hardware domain
---
 xen/arch/arm/Kconfig               |   1 +
 xen/arch/arm/Makefile              |   1 +
 xen/arch/arm/include/asm/suspend.h |  22 +++++
 xen/arch/arm/suspend.c             | 151 +++++++++++++++++++++++++++++
 xen/arch/arm/vpsci.c               |  12 ++-
 xen/common/domain.c                |   4 +
 6 files changed, 190 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/arm/include/asm/suspend.h
 create mode 100644 xen/arch/arm/suspend.c

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index a0c8160474..ccdbaa5bc3 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -8,6 +8,7 @@ config ARM_64
 	depends on !ARM_32
 	select 64BIT
 	select HAS_FAST_MULTIPLY
+	select HAS_SYSTEM_SUSPEND if UNSUPPORTED
 	select HAS_VPCI_GUEST_SUPPORT if PCI_PASSTHROUGH
 
 config ARM
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index f833cdf207..3f6247adee 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -51,6 +51,7 @@ obj-y += setup.o
 obj-y += shutdown.o
 obj-y += smp.o
 obj-y += smpboot.o
+obj-$(CONFIG_SYSTEM_SUSPEND) += suspend.o
 obj-$(CONFIG_SYSCTL) += sysctl.o
 obj-y += time.o
 obj-y += traps.o
diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
new file mode 100644
index 0000000000..78d0e2383b
--- /dev/null
+++ b/xen/arch/arm/include/asm/suspend.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __ASM_ARM_SUSPEND_H__
+#define __ASM_ARM_SUSPEND_H__
+
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+int host_system_suspend(void);
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
+#endif /* __ASM_ARM_SUSPEND_H__ */
+
+ /*
+  * Local variables:
+  * mode: C
+  * c-file-style: "BSD"
+  * c-basic-offset: 4
+  * tab-width: 4
+  * indent-tabs-mode: nil
+  * End:
+  */
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
new file mode 100644
index 0000000000..abf4b928ce
--- /dev/null
+++ b/xen/arch/arm/suspend.c
@@ -0,0 +1,151 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/console.h>
+#include <xen/cpu.h>
+#include <xen/llc-coloring.h>
+#include <xen/sched.h>
+
+/*
+ * TODO list:
+ *  - Test system suspend with LLC_COLORING enabled and verify functionality
+ *  - Implement IOMMU suspend/resume handlers and integrate them
+ *    into the suspend/resume path (IPMMU and SMMU)
+ *  - Enable "xl suspend" support on ARM architecture
+ *  - Properly disable Xen timer watchdog from relevant services
+ *  - Add suspend/resume CI test for ARM (QEMU if feasible)
+ *  - Investigate feasibility and need for implementing system suspend on ARM32
+ */
+
+/* Xen suspend. Note: data is not used (suspend is the suspend to RAM) */
+static long system_suspend(void *data)
+{
+    int status;
+    unsigned long flags;
+
+    BUG_ON(system_state != SYS_STATE_active);
+
+    system_state = SYS_STATE_suspend;
+    freeze_domains();
+    scheduler_disable();
+
+    /*
+     * Non-boot CPUs have to be disabled on suspend and enabled on resume
+     * (hotplug-based mechanism). Disabling non-boot CPUs will lead to PSCI
+     * CPU_OFF to be called by each non-boot CPU. Depending on the underlying
+     * platform capabilities, this may lead to the physical powering down of
+     * CPUs. Tested on Xilinx Zynq Ultrascale+ MPSoC (including power down of
+     * each non-boot CPU).
+     */
+    status = disable_nonboot_cpus();
+    if ( status )
+    {
+        system_state = SYS_STATE_resume;
+        goto resume_nonboot_cpus;
+    }
+
+    time_suspend();
+
+    local_irq_save(flags);
+    status = gic_suspend();
+    if ( status )
+    {
+        system_state = SYS_STATE_resume;
+        goto resume_irqs;
+    }
+
+    printk("Xen suspending...\n");
+
+    console_start_sync();
+    status = console_suspend();
+    if ( status )
+    {
+        dprintk(XENLOG_ERR, "Failed to suspend the console, err=%d\n", status);
+        system_state = SYS_STATE_resume;
+        goto resume_console;
+    }
+
+    /*
+     * Enable identity mapping before entering suspend to simplify
+     * the resume path
+     */
+    update_boot_mapping(true);
+
+    system_state = SYS_STATE_resume;
+    update_boot_mapping(false);
+
+ resume_console:
+    console_resume();
+    console_end_sync();
+
+    gic_resume();
+
+ resume_irqs:
+    local_irq_restore(flags);
+    time_resume();
+
+ resume_nonboot_cpus:
+    /*
+     * The rcu_barrier() has to be added to ensure that the per cpu area is
+     * freed before a non-boot CPU tries to initialize it (_free_percpu_area()
+     * has to be called before the init_percpu_area()). This scenario occurs
+     * when non-boot CPUs are hot-unplugged on suspend and hotplugged on resume.
+     */
+    rcu_barrier();
+    enable_nonboot_cpus();
+    scheduler_enable();
+    thaw_domains();
+
+    system_state = SYS_STATE_active;
+
+    /* The resume of hardware domain should always follow Xen's resume. */
+    domain_resume(hardware_domain);
+
+    printk("Resume (status %d)\n", status);
+    return status;
+}
+
+int host_system_suspend(void)
+{
+    int status;
+
+    /* TODO: drop check after verification that features can work together */
+    if ( llc_coloring_enabled )
+    {
+        dprintk(XENLOG_ERR,
+                "System suspend is not supported with LLC_COLORING enabled\n");
+        return -ENOSYS;
+    }
+
+    /*
+     * system_suspend should be called when Dom0 finalizes the suspend
+     * procedure from its boot core (VCPU#0). However, Dom0's VCPU#0 could
+     * be mapped to any PCPU (this function could be executed by any PCPU).
+     * The suspend procedure has to be finalized by the PCPU#0 (non-boot
+     * PCPUs will be disabled during the suspend).
+     */
+    status = continue_hypercall_on_cpu(0, system_suspend, NULL);
+
+    /*
+     * If an error happened, there is nothing that needs to be done here
+     * because the system_suspend always returns in fully functional state
+     * no matter what the outcome of suspend procedure is. If the system
+     * suspended successfully the function will return 0 after the resume.
+     * Otherwise, if an error is returned it means Xen did not suspended,
+     * but it is still in the same state as if the system_suspend was never
+     * called. We dump a debug message in case of an error for debugging/
+     * logging purpose.
+     */
+    if ( status )
+        dprintk(XENLOG_ERR, "Failed to suspend, errno=%d\n", status);
+
+    return status;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
index 67d369a8a2..757e719ea7 100644
--- a/xen/arch/arm/vpsci.c
+++ b/xen/arch/arm/vpsci.c
@@ -4,6 +4,7 @@
 #include <xen/types.h>
 
 #include <asm/current.h>
+#include <asm/suspend.h>
 #include <asm/vgic.h>
 #include <asm/vpsci.h>
 #include <asm/event.h>
@@ -214,9 +215,10 @@ static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
     struct vcpu *v;
     struct domain *d = current->domain;
 
-    /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
+#ifndef CONFIG_SYSTEM_SUSPEND
     if ( is_hardware_domain(d) )
         return PSCI_NOT_SUPPORTED;
+#endif
 
     /* Ensure that all CPUs other than the calling one are offline */
     domain_lock(d);
@@ -234,6 +236,14 @@ static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
     if ( rc )
         return PSCI_DENIED;
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+    if ( is_hardware_domain(d) && host_system_suspend() )
+    {
+        domain_resume_nopause(d);
+        return PSCI_DENIED;
+    }
+#endif
+
     rc = do_setup_vcpu_ctx(current, epoint, cid);
     if ( rc != PSCI_SUCCESS )
     {
diff --git a/xen/common/domain.c b/xen/common/domain.c
index c3609b0cb0..414a691242 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1311,7 +1311,11 @@ int domain_shutdown(struct domain *d, u8 reason)
         d->shutdown_code = reason;
     reason = d->shutdown_code;
 
+#if defined(CONFIG_SYSTEM_SUSPEND) && defined(CONFIG_ARM)
+    if ( reason != SHUTDOWN_suspend && is_hardware_domain(d) )
+#else
     if ( is_hardware_domain(d) )
+#endif
         hwdom_shutdown(reason);
 
     if ( d->is_shutting_down )
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v5 08/12] xen/arm: Implement PSCI SYSTEM_SUSPEND call (host interface)
  2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
                   ` (6 preceding siblings ...)
  2025-08-11 20:48 ` [PATCH v5 07/12] xen/arm: Add support for system suspend triggered by hardware domain Mykola Kvach
@ 2025-08-11 20:48 ` Mykola Kvach
  2025-08-23  1:06   ` Volodymyr Babchuk
  2025-08-11 20:48 ` [PATCH v5 09/12] xen/arm: Resume memory management on Xen resume Mykola Kvach
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Mykola Kvach @ 2025-08-11 20:48 UTC (permalink / raw)
  To: xen-devel
  Cc: Mirela Simonovic, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi,
	Mykyta Poturai, Mykola Kvach

From: Mirela Simonovic <mirela.simonovic@aggios.com>

Invoke PSCI SYSTEM_SUSPEND to finalize Xen's suspend sequence on ARM64 platforms.
Pass the resume entry point (hyp_resume) as the first argument to EL3. The resume
handler is currently a stub and will be implemented later in assembly. Ignore the
context ID argument, as is done in Linux.

Only enable this path when CONFIG_SYSTEM_SUSPEND is set and
PSCI version is >= 1.0.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in v4:
  - select the appropriate PSCI SYSTEM_SUSPEND function ID based on platform
  - update comments and commit message to reflect recent changes

Changes in v3:
  - return PSCI_NOT_SUPPORTED instead of a hardcoded 1 on ARM32
  - check PSCI version before invoking SYSTEM_SUSPEND in call_psci_system_suspend
---
 xen/arch/arm/arm64/head.S          |  8 ++++++++
 xen/arch/arm/include/asm/psci.h    |  1 +
 xen/arch/arm/include/asm/suspend.h |  2 ++
 xen/arch/arm/psci.c                | 23 ++++++++++++++++++++++-
 xen/arch/arm/suspend.c             |  5 +++++
 5 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 72c7b24498..3522c497c5 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -561,6 +561,14 @@ END(efi_xen_start)
 
 #endif /* CONFIG_ARM_EFI */
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+FUNC(hyp_resume)
+        b .
+END(hyp_resume)
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
 /*
  * Local variables:
  * mode: ASM
diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
index 48a93e6b79..bb3c73496e 100644
--- a/xen/arch/arm/include/asm/psci.h
+++ b/xen/arch/arm/include/asm/psci.h
@@ -23,6 +23,7 @@ int call_psci_cpu_on(int cpu);
 void call_psci_cpu_off(void);
 void call_psci_system_off(void);
 void call_psci_system_reset(void);
+int call_psci_system_suspend(void);
 
 /* Range of allocated PSCI function numbers */
 #define	PSCI_FNUM_MIN_VALUE                 _AC(0,U)
diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
index 78d0e2383b..55041a5d06 100644
--- a/xen/arch/arm/include/asm/suspend.h
+++ b/xen/arch/arm/include/asm/suspend.h
@@ -7,6 +7,8 @@
 
 int host_system_suspend(void);
 
+void hyp_resume(void);
+
 #endif /* CONFIG_SYSTEM_SUSPEND */
 
 #endif /* __ASM_ARM_SUSPEND_H__ */
diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
index b6860a7760..c9d126b195 100644
--- a/xen/arch/arm/psci.c
+++ b/xen/arch/arm/psci.c
@@ -17,17 +17,20 @@
 #include <asm/cpufeature.h>
 #include <asm/psci.h>
 #include <asm/acpi.h>
+#include <asm/suspend.h>
 
 /*
  * While a 64-bit OS can make calls with SMC32 calling conventions, for
  * some calls it is necessary to use SMC64 to pass or return 64-bit values.
- * For such calls PSCI_0_2_FN_NATIVE(x) will choose the appropriate
+ * For such calls PSCI_*_FN_NATIVE(x) will choose the appropriate
  * (native-width) function ID.
  */
 #ifdef CONFIG_ARM_64
 #define PSCI_0_2_FN_NATIVE(name)    PSCI_0_2_FN64_##name
+#define PSCI_1_0_FN_NATIVE(name)    PSCI_1_0_FN64_##name
 #else
 #define PSCI_0_2_FN_NATIVE(name)    PSCI_0_2_FN32_##name
+#define PSCI_1_0_FN_NATIVE(name)    PSCI_1_0_FN32_##name
 #endif
 
 uint32_t psci_ver;
@@ -60,6 +63,24 @@ void call_psci_cpu_off(void)
     }
 }
 
+int call_psci_system_suspend(void)
+{
+#ifdef CONFIG_SYSTEM_SUSPEND
+    struct arm_smccc_res res;
+
+    if ( psci_ver < PSCI_VERSION(1, 0) )
+        return PSCI_NOT_SUPPORTED;
+
+    /* 2nd argument (context ID) is not used */
+    arm_smccc_smc(PSCI_1_0_FN_NATIVE(SYSTEM_SUSPEND), __pa(hyp_resume), &res);
+    return PSCI_RET(res);
+#else
+    dprintk(XENLOG_WARNING,
+            "SYSTEM_SUSPEND not supported (CONFIG_SYSTEM_SUSPEND disabled)\n");
+    return PSCI_NOT_SUPPORTED;
+#endif
+}
+
 void call_psci_system_off(void)
 {
     if ( psci_ver > PSCI_VERSION(0, 1) )
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
index abf4b928ce..11e86b7f51 100644
--- a/xen/arch/arm/suspend.c
+++ b/xen/arch/arm/suspend.c
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 
+#include <asm/psci.h>
 #include <xen/console.h>
 #include <xen/cpu.h>
 #include <xen/llc-coloring.h>
@@ -70,6 +71,10 @@ static long system_suspend(void *data)
      */
     update_boot_mapping(true);
 
+    status = call_psci_system_suspend();
+    if ( status )
+        dprintk(XENLOG_WARNING, "PSCI system suspend failed, err=%d\n", status);
+
     system_state = SYS_STATE_resume;
     update_boot_mapping(false);
 
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v5 09/12] xen/arm: Resume memory management on Xen resume
  2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
                   ` (7 preceding siblings ...)
  2025-08-11 20:48 ` [PATCH v5 08/12] xen/arm: Implement PSCI SYSTEM_SUSPEND call (host interface) Mykola Kvach
@ 2025-08-11 20:48 ` Mykola Kvach
  2025-08-11 20:48 ` [PATCH v5 10/12] xen/arm: Save/restore context on suspend/resume Mykola Kvach
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-11 20:48 UTC (permalink / raw)
  To: xen-devel
  Cc: Mirela Simonovic, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi,
	Mykyta Poturai, Mykola Kvach

From: Mirela Simonovic <mirela.simonovic@aggios.com>

The MMU must be enabled during the resume path before restoring context,
as virtual addresses are used to access the saved context data.

This patch adds MMU setup during resume by reusing the existing
enable_secondary_cpu_mm function, which enables data cache and the MMU.
Before the MMU is enabled, the content of TTBR0_EL2 is changed to point
to init_ttbr (page tables used at runtime).

On boot, init_ttbr is normally initialized during secondary CPU hotplug.
On uniprocessor systems, this would leave init_ttbr uninitialized,
causing resume to fail. To address this, the boot CPU now sets init_ttbr
during suspend.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in v4:
- Drop unnecessary DAIF masking; interrupts are already masked on resume
- Remove leftover TLB flush instructions; flushing is done in enable_mmu
- Avoid setting x19 in hyp_resume; not needed
- Replace prepare_secondary_mm with set_init_ttbr; call it from system_suspend

Changes in v3:
- Update commit message for clarity
- Replace create_page_tables, enable_mmu, and mmu_init_secondary_cpu
  with enable_secondary_cpu_mm
- Move prepare_secondary_mm to start_xen to avoid crash
- Add early UART init during resume

Changes in v2:
- Move hyp_resume to head.S to keep resume logic together
- Simplify hyp_resume using existing helpers: check_cpu_mode, cpu_init,
  create_page_tables, enable_mmu
---
 xen/arch/arm/arm64/head.S     | 16 ++++++++++++++++
 xen/arch/arm/include/asm/mm.h |  2 ++
 xen/arch/arm/mmu/smpboot.c    |  2 +-
 xen/arch/arm/suspend.c        |  2 ++
 4 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 3522c497c5..596e960152 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -564,6 +564,22 @@ END(efi_xen_start)
 #ifdef CONFIG_SYSTEM_SUSPEND
 
 FUNC(hyp_resume)
+        /* Initialize the UART if earlyprintk has been enabled. */
+#ifdef CONFIG_EARLY_PRINTK
+        bl    init_uart
+#endif
+        PRINT_ID("- Xen resuming -\r\n")
+
+        bl    check_cpu_mode
+        bl    cpu_init
+
+        ldr   x0, =start
+        adr   x20, start             /* x20 := paddr (start) */
+        sub   x20, x20, x0           /* x20 := phys-offset */
+        ldr   lr, =mmu_resumed
+        b     enable_secondary_cpu_mm
+
+mmu_resumed:
         b .
 END(hyp_resume)
 
diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index fb79aeb088..3400cb2bff 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -365,6 +365,8 @@ static inline void page_set_xenheap_gfn(struct page_info *p, gfn_t gfn)
     } while ( (y = cmpxchg(&p->u.inuse.type_info, x, nx)) != x );
 }
 
+void set_init_ttbr(lpae_t *root);
+
 #endif /*  __ARCH_ARM_MM__ */
 /*
  * Local variables:
diff --git a/xen/arch/arm/mmu/smpboot.c b/xen/arch/arm/mmu/smpboot.c
index 37e91d72b7..ff508ecf40 100644
--- a/xen/arch/arm/mmu/smpboot.c
+++ b/xen/arch/arm/mmu/smpboot.c
@@ -72,7 +72,7 @@ static void clear_boot_pagetables(void)
     clear_table(boot_third);
 }
 
-static void set_init_ttbr(lpae_t *root)
+void set_init_ttbr(lpae_t *root)
 {
     /*
      * init_ttbr is part of the identity mapping which is read-only. So
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
index 11e86b7f51..08b6acaede 100644
--- a/xen/arch/arm/suspend.c
+++ b/xen/arch/arm/suspend.c
@@ -65,6 +65,8 @@ static long system_suspend(void *data)
         goto resume_console;
     }
 
+    set_init_ttbr(xen_pgtable);
+
     /*
      * Enable identity mapping before entering suspend to simplify
      * the resume path
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v5 10/12] xen/arm: Save/restore context on suspend/resume
  2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
                   ` (8 preceding siblings ...)
  2025-08-11 20:48 ` [PATCH v5 09/12] xen/arm: Resume memory management on Xen resume Mykola Kvach
@ 2025-08-11 20:48 ` Mykola Kvach
  2025-08-23 17:34   ` Volodymyr Babchuk
  2025-08-11 20:48 ` [PATCH v5 11/12] iommu/ipmmu-vmsa: Implement suspend/resume callbacks Mykola Kvach
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Mykola Kvach @ 2025-08-11 20:48 UTC (permalink / raw)
  To: xen-devel
  Cc: Mirela Simonovic, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi,
	Mykyta Poturai, Mykola Kvach

From: Mirela Simonovic <mirela.simonovic@aggios.com>

The context of CPU general purpose and system control registers
has to be saved on suspend and restored on resume. This is
implemented in hyp_suspend and before the return from hyp_resume
function. The hyp_suspend is invoked just before the PSCI system
suspend call is issued to the ATF. The hyp_suspend has to return a
non-zero value so that the calling 'if' statement evaluates to true,
causing the system suspend to be invoked. Upon the resume, context
saved on suspend will be restored, including the link register.
Therefore, after restoring the context the control flow will
return to the address pointed by the saved link register, which
is the place from which the hyp_suspend was called. To ensure
that the calling 'if' statement doesn't again evaluate to true
and initiate system suspend, hyp_resume has to return a zero value
after restoring the context.

Note that the order of saving register context into cpu_context
structure has to match the order of restoring.

Support for ARM32 is not implemented. Instead, compilation fails with a
build-time error if suspend is enabled for ARM32.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in v4:
- produce build-time error for ARM32 when CONFIG_SYSTEM_SUSPEND is enabled
- use register_t instead of uint64_t in cpu_context structure
---
 xen/arch/arm/arm64/head.S          | 91 +++++++++++++++++++++++++++++-
 xen/arch/arm/include/asm/suspend.h | 20 +++++++
 xen/arch/arm/suspend.c             | 23 +++++++-
 3 files changed, 130 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 596e960152..ad8b48de3a 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -562,6 +562,52 @@ END(efi_xen_start)
 #endif /* CONFIG_ARM_EFI */
 
 #ifdef CONFIG_SYSTEM_SUSPEND
+/*
+ * int hyp_suspend(struct cpu_context *ptr)
+ *
+ * x0 - pointer to the storage where callee's context will be saved
+ *
+ * CPU context saved here will be restored on resume in hyp_resume function.
+ * hyp_suspend shall return a non-zero value. Upon restoring context
+ * hyp_resume shall return value zero instead. From C code that invokes
+ * hyp_suspend, the return value is interpreted to determine whether the context
+ * is saved (hyp_suspend) or restored (hyp_resume).
+ */
+FUNC(hyp_suspend)
+        /* Store callee-saved registers */
+        stp     x19, x20, [x0], #16
+        stp     x21, x22, [x0], #16
+        stp     x23, x24, [x0], #16
+        stp     x25, x26, [x0], #16
+        stp     x27, x28, [x0], #16
+        stp     x29, lr, [x0], #16
+
+        /* Store stack-pointer */
+        mov     x2, sp
+        str     x2, [x0], #8
+
+        /* Store system control registers */
+        mrs     x2, VBAR_EL2
+        str     x2, [x0], #8
+        mrs     x2, VTCR_EL2
+        str     x2, [x0], #8
+        mrs     x2, VTTBR_EL2
+        str     x2, [x0], #8
+        mrs     x2, TPIDR_EL2
+        str     x2, [x0], #8
+        mrs     x2, MDCR_EL2
+        str     x2, [x0], #8
+        mrs     x2, HSTR_EL2
+        str     x2, [x0], #8
+        mrs     x2, CPTR_EL2
+        str     x2, [x0], #8
+        mrs     x2, HCR_EL2
+        str     x2, [x0], #8
+
+        /* hyp_suspend must return a non-zero value */
+        mov     x0, #1
+        ret
+END(hyp_suspend)
 
 FUNC(hyp_resume)
         /* Initialize the UART if earlyprintk has been enabled. */
@@ -580,7 +626,50 @@ FUNC(hyp_resume)
         b     enable_secondary_cpu_mm
 
 mmu_resumed:
-        b .
+        /*
+         * Now we can access the cpu_context, so restore the context here
+         * TODO: can we reuse __context_switch and saved_context struct here ?
+         */
+        ldr     x0, =cpu_context
+
+        /* Restore callee-saved registers */
+        ldp     x19, x20, [x0], #16
+        ldp     x21, x22, [x0], #16
+        ldp     x23, x24, [x0], #16
+        ldp     x25, x26, [x0], #16
+        ldp     x27, x28, [x0], #16
+        ldp     x29, lr, [x0], #16
+
+        /* Restore stack pointer */
+        ldr     x2, [x0], #8
+        mov     sp, x2
+
+        /* Restore system control registers */
+        ldr     x2, [x0], #8
+        msr     VBAR_EL2, x2
+        ldr     x2, [x0], #8
+        msr     VTCR_EL2, x2
+        ldr     x2, [x0], #8
+        msr     VTTBR_EL2, x2
+        ldr     x2, [x0], #8
+        msr     TPIDR_EL2, x2
+        ldr     x2, [x0], #8
+        msr     MDCR_EL2, x2
+        ldr     x2, [x0], #8
+        msr     HSTR_EL2, x2
+        ldr     x2, [x0], #8
+        msr     CPTR_EL2, x2
+        ldr     x2, [x0], #8
+        msr     HCR_EL2, x2
+        isb
+
+        /* Since context is restored return from this function will appear as
+         * return from hyp_suspend. To distinguish a return from hyp_suspend
+         * which is called upon finalizing the suspend, as opposed to return
+         * from this function which executes on resume, we need to return zero
+         * value here. */
+        mov x0, #0
+        ret
 END(hyp_resume)
 
 #endif /* CONFIG_SYSTEM_SUSPEND */
diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
index 55041a5d06..ae71ccb87b 100644
--- a/xen/arch/arm/include/asm/suspend.h
+++ b/xen/arch/arm/include/asm/suspend.h
@@ -5,9 +5,29 @@
 
 #ifdef CONFIG_SYSTEM_SUSPEND
 
+#ifdef CONFIG_ARM_64
+struct cpu_context {
+    register_t callee_regs[12];
+    register_t sp;
+    register_t vbar_el2;
+    register_t vtcr_el2;
+    register_t vttbr_el2;
+    register_t tpidr_el2;
+    register_t mdcr_el2;
+    register_t hstr_el2;
+    register_t cptr_el2;
+    register_t hcr_el2;
+} __aligned(16);
+#else
+#error "Define cpu_context structure for arm32"
+#endif
+
+extern struct cpu_context cpu_context;
+
 int host_system_suspend(void);
 
 void hyp_resume(void);
+int hyp_suspend(struct cpu_context *ptr);
 
 #endif /* CONFIG_SYSTEM_SUSPEND */
 
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
index 08b6acaede..b5398e5ca6 100644
--- a/xen/arch/arm/suspend.c
+++ b/xen/arch/arm/suspend.c
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 
 #include <asm/psci.h>
+#include <asm/suspend.h>
 #include <xen/console.h>
 #include <xen/cpu.h>
 #include <xen/llc-coloring.h>
@@ -17,6 +18,8 @@
  *  - Investigate feasibility and need for implementing system suspend on ARM32
  */
 
+struct cpu_context cpu_context;
+
 /* Xen suspend. Note: data is not used (suspend is the suspend to RAM) */
 static long system_suspend(void *data)
 {
@@ -73,9 +76,23 @@ static long system_suspend(void *data)
      */
     update_boot_mapping(true);
 
-    status = call_psci_system_suspend();
-    if ( status )
-        dprintk(XENLOG_WARNING, "PSCI system suspend failed, err=%d\n", status);
+    if ( hyp_suspend(&cpu_context) )
+    {
+        status = call_psci_system_suspend();
+        /*
+         * If suspend is finalized properly by above system suspend PSCI call,
+         * the code below in this 'if' branch will never execute. Execution
+         * will continue from hyp_resume which is the hypervisor's resume point.
+         * In hyp_resume CPU context will be restored and since link-register is
+         * restored as well, it will appear to return from hyp_suspend. The
+         * difference in returning from hyp_suspend on system suspend versus
+         * resume is in function's return value: on suspend, the return value is
+         * a non-zero value, on resume it is zero. That is why the control flow
+         * will not re-enter this 'if' branch on resume.
+         */
+        if ( status )
+            dprintk(XENLOG_WARNING, "PSCI system suspend failed, err=%d\n", status);
+    }
 
     system_state = SYS_STATE_resume;
     update_boot_mapping(false);
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v5 11/12] iommu/ipmmu-vmsa: Implement suspend/resume callbacks
  2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
                   ` (9 preceding siblings ...)
  2025-08-11 20:48 ` [PATCH v5 10/12] xen/arm: Save/restore context on suspend/resume Mykola Kvach
@ 2025-08-11 20:48 ` Mykola Kvach
  2025-08-23 17:48   ` Volodymyr Babchuk
  2025-08-11 20:48 ` [PATCH v5 12/12] xen/arm: Suspend/resume IOMMU on Xen suspend/resume Mykola Kvach
  2025-08-11 21:53 ` [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Julien Grall
  12 siblings, 1 reply; 38+ messages in thread
From: Mykola Kvach @ 2025-08-11 20:48 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Mykola Kvach

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Store and restore active context and micro-TLB registers.

Tested on R-Car H3 Starter Kit.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
 xen/drivers/passthrough/arm/ipmmu-vmsa.c | 269 +++++++++++++++++++++++
 1 file changed, 269 insertions(+)

diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
index dac0dd6d46..ced762657a 100644
--- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
+++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
@@ -58,6 +58,8 @@
 #define dev_print(dev, lvl, fmt, ...)    \
     printk(lvl "ipmmu: %s: " fmt, dev_name(dev), ## __VA_ARGS__)
 
+#define dev_dbg(dev, fmt, ...)    \
+    dev_print(dev, XENLOG_DEBUG, fmt, ## __VA_ARGS__)
 #define dev_info(dev, fmt, ...)    \
     dev_print(dev, XENLOG_INFO, fmt, ## __VA_ARGS__)
 #define dev_warn(dev, fmt, ...)    \
@@ -117,6 +119,23 @@ struct ipmmu_features {
     unsigned int imuctr_ttsel_mask;
 };
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+struct hw_register {
+    const char *reg_name;
+    unsigned int reg_offset;
+    unsigned int reg_data;
+};
+
+struct ipmmu_vmsa_backup {
+    struct device *dev;
+    unsigned int *utlbs_val;
+    unsigned int *asids_val;
+    struct list_head list;
+};
+
+#endif
+
 /* Root/Cache IPMMU device's information */
 struct ipmmu_vmsa_device {
     struct device *dev;
@@ -129,6 +148,9 @@ struct ipmmu_vmsa_device {
     struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
     unsigned int utlb_refcount[IPMMU_UTLB_MAX];
     const struct ipmmu_features *features;
+#ifdef CONFIG_SYSTEM_SUSPEND
+    struct hw_register *reg_backup[IPMMU_CTX_MAX];
+#endif
 };
 
 /*
@@ -534,6 +556,235 @@ static void ipmmu_domain_free_context(struct ipmmu_vmsa_device *mmu,
     spin_unlock_irqrestore(&mmu->lock, flags);
 }
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+static DEFINE_SPINLOCK(ipmmu_devices_backup_lock);
+static LIST_HEAD(ipmmu_devices_backup);
+
+#define HW_REGISTER_BACKUP_SIZE 4
+
+static struct hw_register root_pgtable[IPMMU_CTX_MAX][HW_REGISTER_BACKUP_SIZE] = {
+    [0 ... (IPMMU_CTX_MAX - 1)] = {
+        {"IMTTLBR0", IMTTLBR0, 0},
+        {"IMTTUBR0", IMTTUBR0, 0},
+        {"IMTTBCR",  IMTTBCR,  0},
+        {"IMCTR",    IMCTR,    0},
+    }
+};
+
+static uint32_t ipmmu_imuasid_read(struct ipmmu_vmsa_device *mmu,
+                                   unsigned int utlb)
+{
+    return ipmmu_read(mmu, ipmmu_utlb_reg(mmu, IMUASID(utlb)));
+}
+
+static void ipmmu_utlbs_backup(struct ipmmu_vmsa_device *mmu)
+{
+    struct ipmmu_vmsa_backup *backup_data;
+
+    dev_dbg(mmu->dev, "Handle micro-TLBs backup\n");
+
+    spin_lock(&ipmmu_devices_backup_lock);
+
+    list_for_each_entry( backup_data, &ipmmu_devices_backup, list )
+    {
+        struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(backup_data->dev);
+        unsigned int i;
+
+        if ( to_ipmmu(backup_data->dev) != mmu )
+            continue;
+
+        for ( i = 0; i < fwspec->num_ids; i++ )
+        {
+            unsigned int utlb = fwspec->ids[i];
+
+            backup_data->asids_val[i] = ipmmu_imuasid_read(mmu, utlb);
+            backup_data->utlbs_val[i] = ipmmu_imuctr_read(mmu, utlb);
+        }
+    }
+
+    spin_unlock(&ipmmu_devices_backup_lock);
+}
+
+static void ipmmu_utlbs_restore(struct ipmmu_vmsa_device *mmu)
+{
+    struct ipmmu_vmsa_backup *backup_data;
+
+    dev_dbg(mmu->dev, "Handle micro-TLBs restore\n");
+
+    spin_lock(&ipmmu_devices_backup_lock);
+
+    list_for_each_entry( backup_data, &ipmmu_devices_backup, list )
+    {
+        struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(backup_data->dev);
+        unsigned int i;
+
+        if ( to_ipmmu(backup_data->dev) != mmu )
+            continue;
+
+        for ( i = 0; i < fwspec->num_ids; i++ )
+        {
+            unsigned int utlb = fwspec->ids[i];
+
+            ipmmu_imuasid_write(mmu, utlb, backup_data->asids_val[i]);
+            ipmmu_imuctr_write(mmu, utlb, backup_data->utlbs_val[i]);
+        }
+    }
+
+    spin_unlock(&ipmmu_devices_backup_lock);
+}
+
+static void ipmmu_domain_backup_context(struct ipmmu_vmsa_domain *domain)
+{
+    struct ipmmu_vmsa_device *mmu = domain->mmu->root;
+    struct hw_register *reg = mmu->reg_backup[domain->context_id];
+    unsigned int i;
+
+    dev_dbg(mmu->dev, "Handle domain context %u backup\n", domain->context_id);
+
+    for ( i = 0; i < HW_REGISTER_BACKUP_SIZE; i++ )
+        reg[i].reg_data = ipmmu_ctx_read_root(domain, reg[i].reg_offset);
+}
+
+static void ipmmu_domain_restore_context(struct ipmmu_vmsa_domain *domain)
+{
+    struct ipmmu_vmsa_device *mmu = domain->mmu->root;
+    struct hw_register *reg = mmu->reg_backup[domain->context_id];
+    unsigned int i;
+
+    dev_dbg(mmu->dev, "Handle domain context %u restore\n", domain->context_id);
+
+    for ( i = 0; i < HW_REGISTER_BACKUP_SIZE; i++ )
+    {
+        if ( reg[i].reg_offset != IMCTR )
+            ipmmu_ctx_write_root(domain, reg[i].reg_offset, reg[i].reg_data);
+        else
+            ipmmu_ctx_write_all(domain, reg[i].reg_offset,
+                                reg[i].reg_data | IMCTR_FLUSH);
+    }
+}
+
+/*
+ * Xen: Unlike Linux implementation, Xen uses a single driver instance
+ * for handling all IPMMUs. There is no framework for ipmmu_suspend/resume
+ * callbacks to be invoked for each IPMMU device. So, we need to iterate
+ * through all registered IPMMUs performing required actions.
+ *
+ * Also take care of restoring special settings, such as translation
+ * table format, etc.
+ */
+static int __must_check ipmmu_suspend(void)
+{
+    struct ipmmu_vmsa_device *mmu;
+
+    if ( !iommu_enabled )
+        return 0;
+
+    printk(XENLOG_DEBUG "ipmmu: Suspending ...\n");
+
+    spin_lock(&ipmmu_devices_lock);
+
+    list_for_each_entry( mmu, &ipmmu_devices, list )
+    {
+        if ( ipmmu_is_root(mmu) )
+        {
+            unsigned int i;
+
+            for ( i = 0; i < mmu->num_ctx; i++ )
+            {
+                if ( !mmu->domains[i] )
+                    continue;
+                ipmmu_domain_backup_context(mmu->domains[i]);
+            }
+        }
+        else
+            ipmmu_utlbs_backup(mmu);
+    }
+
+    spin_unlock(&ipmmu_devices_lock);
+
+    return 0;
+}
+
+static void ipmmu_resume(void)
+{
+    struct ipmmu_vmsa_device *mmu;
+
+    if ( !iommu_enabled )
+        return;
+
+    printk(XENLOG_DEBUG "ipmmu: Resuming ...\n");
+
+    spin_lock(&ipmmu_devices_lock);
+
+    list_for_each_entry( mmu, &ipmmu_devices, list )
+    {
+        uint32_t reg;
+
+        /* Do not use security group function */
+        reg = IMSCTLR + mmu->features->control_offset_base;
+        ipmmu_write(mmu, reg, ipmmu_read(mmu, reg) & ~IMSCTLR_USE_SECGRP);
+
+        if ( ipmmu_is_root(mmu) )
+        {
+            unsigned int i;
+
+            /* Use stage 2 translation table format */
+            reg = IMSAUXCTLR + mmu->features->control_offset_base;
+            ipmmu_write(mmu, reg, ipmmu_read(mmu, reg) | IMSAUXCTLR_S2PTE);
+
+            for ( i = 0; i < mmu->num_ctx; i++ )
+            {
+                if ( !mmu->domains[i] )
+                    continue;
+                ipmmu_domain_restore_context(mmu->domains[i]);
+            }
+        }
+        else
+            ipmmu_utlbs_restore(mmu);
+    }
+
+    spin_unlock(&ipmmu_devices_lock);
+}
+
+static int ipmmu_alloc_ctx_suspend(struct device *dev)
+{
+    struct ipmmu_vmsa_backup *backup_data;
+    unsigned int *utlbs_val, *asids_val;
+    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+
+    utlbs_val = xzalloc_array(unsigned int, fwspec->num_ids);
+    if ( !utlbs_val )
+        return -ENOMEM;
+
+    asids_val = xzalloc_array(unsigned int, fwspec->num_ids);
+    if ( !asids_val )
+    {
+        xfree(utlbs_val);
+        return -ENOMEM;
+    }
+
+    backup_data = xzalloc(struct ipmmu_vmsa_backup);
+    if ( !backup_data )
+    {
+        xfree(utlbs_val);
+        xfree(asids_val);
+        return -ENOMEM;
+    }
+
+    backup_data->dev = dev;
+    backup_data->utlbs_val = utlbs_val;
+    backup_data->asids_val = asids_val;
+
+    spin_lock(&ipmmu_devices_backup_lock);
+    list_add(&backup_data->list, &ipmmu_devices_backup);
+    spin_unlock(&ipmmu_devices_backup_lock);
+
+    return 0;
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
 static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
 {
     uint64_t ttbr;
@@ -546,6 +797,9 @@ static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
         return ret;
 
     domain->context_id = ret;
+#ifdef CONFIG_SYSTEM_SUSPEND
+    domain->mmu->root->reg_backup[ret] = root_pgtable[ret];
+#endif
 
     /*
      * TTBR0
@@ -602,6 +856,9 @@ static void ipmmu_domain_destroy_context(struct ipmmu_vmsa_domain *domain)
     ipmmu_ctx_write_root(domain, IMCTR, IMCTR_FLUSH);
     ipmmu_tlb_sync(domain);
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+    domain->mmu->root->reg_backup[domain->context_id] = NULL;
+#endif
     ipmmu_domain_free_context(domain->mmu->root, domain->context_id);
 }
 
@@ -1307,6 +1564,14 @@ static int ipmmu_add_device(u8 devfn, struct device *dev)
     /* Let Xen know that the master device is protected by an IOMMU. */
     dt_device_set_protected(dev_to_dt(dev));
 
+#ifdef CONFIG_SYSTEM_SUSPEND
+    if ( ipmmu_alloc_ctx_suspend(dev) )
+    {
+        dev_err(dev, "Failed to allocate context for suspend\n");
+        return -ENOMEM;
+    }
+#endif
+
     dev_info(dev, "Added master device (IPMMU %s micro-TLBs %u)\n",
              dev_name(fwspec->iommu_dev), fwspec->num_ids);
 
@@ -1372,6 +1637,10 @@ static const struct iommu_ops ipmmu_iommu_ops =
     .unmap_page      = arm_iommu_unmap_page,
     .dt_xlate        = ipmmu_dt_xlate,
     .add_device      = ipmmu_add_device,
+#ifdef CONFIG_SYSTEM_SUSPEND
+    .suspend         = ipmmu_suspend,
+    .resume          = ipmmu_resume,
+#endif
 };
 
 static __init int ipmmu_init(struct dt_device_node *node, const void *data)
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v5 12/12] xen/arm: Suspend/resume IOMMU on Xen suspend/resume
  2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
                   ` (10 preceding siblings ...)
  2025-08-11 20:48 ` [PATCH v5 11/12] iommu/ipmmu-vmsa: Implement suspend/resume callbacks Mykola Kvach
@ 2025-08-11 20:48 ` Mykola Kvach
  2025-08-23 17:54   ` Volodymyr Babchuk
  2025-08-11 21:53 ` [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Julien Grall
  12 siblings, 1 reply; 38+ messages in thread
From: Mykola Kvach @ 2025-08-11 20:48 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Mykola Kvach,
	Oleksandr Andrushchenko

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This is done using generic iommu_suspend/resume functions that cause
IOMMU driver specific suspend/resume handlers to be called for enabled
IOMMU (if one has suspend/resume driver handlers implemented).

These handlers work only when IPMMU-VMSA is used; otherwise,
we return an error during system suspend requests.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
Tested-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
 xen/arch/arm/suspend.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
index b5398e5ca6..cb86426ebd 100644
--- a/xen/arch/arm/suspend.c
+++ b/xen/arch/arm/suspend.c
@@ -4,6 +4,7 @@
 #include <asm/suspend.h>
 #include <xen/console.h>
 #include <xen/cpu.h>
+#include <xen/iommu.h>
 #include <xen/llc-coloring.h>
 #include <xen/sched.h>
 
@@ -49,6 +50,13 @@ static long system_suspend(void *data)
 
     time_suspend();
 
+    status = iommu_suspend();
+    if ( status )
+    {
+        system_state = SYS_STATE_resume;
+        goto resume_time;
+    }
+
     local_irq_save(flags);
     status = gic_suspend();
     if ( status )
@@ -105,6 +113,10 @@ static long system_suspend(void *data)
 
  resume_irqs:
     local_irq_restore(flags);
+
+    iommu_resume();
+
+ resume_time:
     time_resume();
 
  resume_nonboot_cpus:
@@ -140,6 +152,15 @@ int host_system_suspend(void)
         return -ENOSYS;
     }
 
+    /* TODO: drop check once suspend/resume support for SMMU is implemented */
+#ifndef CONFIG_IPMMU_VMSA
+    if ( iommu_enabled )
+    {
+        dprintk(XENLOG_ERR, "IOMMU is enabled, suspend not supported yet\n");
+        return -ENOSYS;
+    }
+#endif
+
     /*
      * system_suspend should be called when Dom0 finalizes the suspend
      * procedure from its boot core (VCPU#0). However, Dom0's VCPU#0 could
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64
  2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
                   ` (11 preceding siblings ...)
  2025-08-11 20:48 ` [PATCH v5 12/12] xen/arm: Suspend/resume IOMMU on Xen suspend/resume Mykola Kvach
@ 2025-08-11 21:53 ` Julien Grall
  2025-08-12  4:58   ` Mykola Kvach
  12 siblings, 1 reply; 38+ messages in thread
From: Julien Grall @ 2025-08-11 21:53 UTC (permalink / raw)
  To: Mykola Kvach, xen-devel
  Cc: Mykola Kvach, Stefano Stabellini, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Jan Beulich,
	Roger Pau Monné

Hi Mykola,

On 11/08/2025 21:47, Mykola Kvach wrote:
> From: Mykola Kvach <mykola_kvach@epam.com>
> 
> This is part 2 of version 5 of the ARM Xen system suspend/resume patch
> series, based on earlier work by Mirela Simonovic and Mykyta Poturai.

Thanks for adding support for Suspend-to-RAM. You mention this is part 
2, can you clarify what's the state of part 1? Is this already merged?

If not, then can you add a link in the cover letter? This would be 
helpful to know the review ordering.

Cheers,

-- 
Julien Grall



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64
  2025-08-11 21:53 ` [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Julien Grall
@ 2025-08-12  4:58   ` Mykola Kvach
  0 siblings, 0 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-12  4:58 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Mykola Kvach, Stefano Stabellini, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
	Jan Beulich, Roger Pau Monné

Hi Julien,

On Tue, Aug 12, 2025 at 12:53 AM Julien Grall <julien@xen.org> wrote:
>
> Hi Mykola,
>
> On 11/08/2025 21:47, Mykola Kvach wrote:
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > This is part 2 of version 5 of the ARM Xen system suspend/resume patch
> > series, based on earlier work by Mirela Simonovic and Mykyta Poturai.
>
> Thanks for adding support for Suspend-to-RAM. You mention this is part
> 2, can you clarify what's the state of part 1? Is this already merged?
>
> If not, then can you add a link in the cover letter? This would be
> helpful to know the review ordering.

Thank you for pointing this out. I missed adding the link to the
corresponding part of this patch series. Here is part 1:
https://patchew.org/Xen/cover.1754595198.git.mykola._5Fkvach@epam.com/

>
> Cheers,
>
> --
> Julien Grall
>

Best regards,
Mykola


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 07/12] xen/arm: Add support for system suspend triggered by hardware domain
  2025-08-11 20:48 ` [PATCH v5 07/12] xen/arm: Add support for system suspend triggered by hardware domain Mykola Kvach
@ 2025-08-12  7:18   ` Jan Beulich
  2025-08-23  0:53     ` Volodymyr Babchuk
  2025-08-23  1:00   ` Volodymyr Babchuk
  1 sibling, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2025-08-12  7:18 UTC (permalink / raw)
  To: Mykola Kvach, Stefano Stabellini
  Cc: Mirela Simonovic, Julien Grall, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
	Roger Pau Monné, Saeed Nowshadi, Mykyta Poturai,
	Mykola Kvach, xen-devel

On 11.08.2025 22:48, Mykola Kvach wrote:
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -1311,7 +1311,11 @@ int domain_shutdown(struct domain *d, u8 reason)
>          d->shutdown_code = reason;
>      reason = d->shutdown_code;
>  
> +#if defined(CONFIG_SYSTEM_SUSPEND) && defined(CONFIG_ARM)
> +    if ( reason != SHUTDOWN_suspend && is_hardware_domain(d) )
> +#else
>      if ( is_hardware_domain(d) )
> +#endif
>          hwdom_shutdown(reason);

There better wouldn't be any #ifdef here. Afaict you can easily use
IS_ENABLED(CONFIG_SYSTEM_SUSPEND). For CONFIG_ARM, though, I think some
new abstraction will need adding. E.g. HAS_HWDOM_SUSPEND (with want for
a better, yet not overly long name).

With the hwdom / ctldom separation work in mind I wonder though if it's
really hwdom to be in charge of initiating system suspend. Imo
conceptually that rather would want to be ctldom. Stefano?

Jan


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 02/12] xen/arm: gic-v2: Implement GIC suspend/resume functions
  2025-08-11 20:47 ` [PATCH v5 02/12] xen/arm: gic-v2: Implement GIC suspend/resume functions Mykola Kvach
@ 2025-08-23  0:01   ` Volodymyr Babchuk
  2025-08-26 13:41     ` Mykola Kvach
  0 siblings, 1 reply; 38+ messages in thread
From: Volodymyr Babchuk @ 2025-08-23  0:01 UTC (permalink / raw)
  To: Mykola Kvach
  Cc: xen-devel@lists.xenproject.org, Mirela Simonovic,
	Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Saeed Nowshadi, Mykyta Poturai, Mykola Kvach

Hi Mykola,

Mykola Kvach <xakep.amatop@gmail.com> writes:

> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>
> System suspend may lead to a state where GIC would be powered down.
> Therefore, Xen should save/restore the context of GIC on suspend/resume.
>
> Note that the context consists of states of registers which are
> controlled by the hypervisor. Other GIC registers which are accessible
> by guests are saved/restored on context switch.
>
> Tested on Xilinx Ultrascale+ MPSoC with (and without) powering down
> the GIC.

I think you can drop this sentence, as I am pretty sure this patch
wasn't tested on Ultrascale+ for last five(?) years..

>
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in v4:
>   - Add error logging for allocation failures
>
> Changes in v3:
>   - Drop asserts and return error codes instead.
>   - Wrap code with CONFIG_SYSTEM_SUSPEND.
>
> Changes in v2:
>   - Minor fixes after review.
> ---
>  xen/arch/arm/gic-v2.c          | 154 +++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic.c             |  29 +++++++
>  xen/arch/arm/include/asm/gic.h |  12 +++
>  3 files changed, 195 insertions(+)
>
> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
> index b23e72a3d0..dce8f5e924 100644
> --- a/xen/arch/arm/gic-v2.c
> +++ b/xen/arch/arm/gic-v2.c
> @@ -1098,6 +1098,151 @@ static int gicv2_iomem_deny_access(struct domain *d)
>      return iomem_deny_access(d, mfn, mfn + nr);
>  }
>  
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +
> +/* GICv2 registers to be saved/restored on system suspend/resume */
> +struct gicv2_context {
> +    /* GICC context */
> +    uint32_t gicc_ctlr;
> +    uint32_t gicc_pmr;
> +    uint32_t gicc_bpr;
> +    /* GICD context */
> +    uint32_t gicd_ctlr;
> +    uint32_t *gicd_isenabler;
> +    uint32_t *gicd_isactiver;
> +    uint32_t *gicd_ipriorityr;
> +    uint32_t *gicd_itargetsr;
> +    uint32_t *gicd_icfgr;
> +};
> +
> +static struct gicv2_context gicv2_context;
> +
> +static int gicv2_suspend(void)
> +{
> +    unsigned int i;
> +
> +    if ( !gicv2_context.gicd_isenabler )
> +    {
> +        dprintk(XENLOG_WARNING, "%s:%d: GICv2 suspend context not allocated!\n",
> +            __func__, __LINE__);

Should this be ASSERT(gicv2_context.gicd_isenabler) ? Or do we expect
that this can happen during normal runtime?

Also, you don't need to print __func__ and __LINE__ as dprintk does this
for you already:

#define dprintk(lvl, fmt, args...) \
    printk(lvl "%s:%d: " fmt, __FILE__, __LINE__, ## args)



> +        return -ENOMEM;
> +    }
> +
> +    /* Save GICC configuration */
> +    gicv2_context.gicc_ctlr = readl_gicc(GICC_CTLR);
> +    gicv2_context.gicc_pmr = readl_gicc(GICC_PMR);
> +    gicv2_context.gicc_bpr = readl_gicc(GICC_BPR);
> +
> +    /* Save GICD configuration */
> +    gicv2_context.gicd_ctlr = readl_gicd(GICD_CTLR);
> +
> +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ )
> +        gicv2_context.gicd_isenabler[i] = readl_gicd(GICD_ISENABLER + i * 4);
> +
> +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ )
> +        gicv2_context.gicd_isactiver[i] = readl_gicd(GICD_ISACTIVER + i * 4);
> +

I think you can merge this two loops

> +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
> +        gicv2_context.gicd_ipriorityr[i] = readl_gicd(GICD_IPRIORITYR + i * 4);
> +
> +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
> +        gicv2_context.gicd_itargetsr[i] = readl_gicd(GICD_ITARGETSR + i * 4);
> +

And this two as well

> +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 16); i++ )
> +        gicv2_context.gicd_icfgr[i] = readl_gicd(GICD_ICFGR + i * 4);
> +
> +    return 0;
> +}
> +
> +static void gicv2_resume(void)
> +{
> +    unsigned int i;
> +
> +    if ( !gicv2_context.gicd_isenabler )
> +    {
> +        dprintk(XENLOG_WARNING, "%s:%d: GICv2 suspend context not allocated!\n",
> +            __func__, __LINE__);

the same comment as for gicv2_suspend(). Actually, we should not reach
here in any case.

> +        return;
> +    }
> +
> +    gicv2_cpu_disable();
> +    /* Disable distributor */
> +    writel_gicd(0, GICD_CTLR);
> +
> +    /* Restore GICD configuration */
> +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ ) {
> +        writel_gicd(0xffffffff, GICD_ICENABLER + i * 4);
> +        writel_gicd(gicv2_context.gicd_isenabler[i], GICD_ISENABLER + i * 4);
> +    }
> +
> +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ ) {
> +        writel_gicd(0xffffffff, GICD_ICACTIVER + i * 4);
> +        writel_gicd(gicv2_context.gicd_isactiver[i], GICD_ISACTIVER + i * 4);
> +    }
> +
> +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
> +        writel_gicd(gicv2_context.gicd_ipriorityr[i], GICD_IPRIORITYR + i * 4);
> +
> +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
> +        writel_gicd(gicv2_context.gicd_itargetsr[i], GICD_ITARGETSR + i * 4);
> +
> +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 16); i++ )
> +        writel_gicd(gicv2_context.gicd_icfgr[i], GICD_ICFGR + i * 4);
> +
> +    /* Make sure all registers are restored and enable distributor */
> +    writel_gicd(gicv2_context.gicd_ctlr | GICD_CTL_ENABLE, GICD_CTLR);
> +
> +    /* Restore GIC CPU interface configuration */
> +    writel_gicc(gicv2_context.gicc_pmr, GICC_PMR);
> +    writel_gicc(gicv2_context.gicc_bpr, GICC_BPR);
> +
> +    /* Enable GIC CPU interface */
> +    writel_gicc(gicv2_context.gicc_ctlr | GICC_CTL_ENABLE | GICC_CTL_EOI,
> +                GICC_CTLR);
> +}
> +
> +static void gicv2_alloc_context(struct gicv2_context *gc)
> +{
> +    uint32_t n = gicv2_info.nr_lines;
> +
> +    gc->gicd_isenabler = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 32));
> +    if ( !gc->gicd_isenabler )
> +        goto err_free;
> +
> +    gc->gicd_isactiver = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 32));
> +    if ( !gc->gicd_isactiver )
> +        goto err_free;
> +
> +    gc->gicd_itargetsr = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 4));
> +    if ( !gc->gicd_itargetsr )
> +        goto err_free;
> +
> +    gc->gicd_ipriorityr = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 4));
> +    if ( !gc->gicd_ipriorityr )
> +        goto err_free;
> +
> +    gc->gicd_icfgr = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 16));
> +    if ( !gc->gicd_icfgr )
> +        goto err_free;
> +
> +    return;
> +
> + err_free:
> +    dprintk(XENLOG_ERR,
> +            "%s:%d: failed to allocate memory for GICv2 suspend context\n",
> +            __func__, __LINE__);

Okay, this partially answers to my previous question about memory
allocation. So it is possible to have active system without space for
GIC context... But this basically renders system un-suspendable. I think
this warrants non-debug print to warn user that suspend will be not
available. 

[...]

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 03/12] xen/arm: gic-v3: Implement GICv3 suspend/resume functions
  2025-08-11 20:47 ` [PATCH v5 03/12] xen/arm: gic-v3: Implement GICv3 " Mykola Kvach
@ 2025-08-23  0:20   ` Volodymyr Babchuk
  2025-08-26 13:41     ` Mykola Kvach
  0 siblings, 1 reply; 38+ messages in thread
From: Volodymyr Babchuk @ 2025-08-23  0:20 UTC (permalink / raw)
  To: Mykola Kvach
  Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Michal Orzel


Hi,

Mykola Kvach <xakep.amatop@gmail.com> writes:

> From: Mykola Kvach <mykola_kvach@epam.com>
>
> System suspend may lead to a state where GIC would be powered down.
> Therefore, Xen should save/restore the context of GIC on suspend/resume.
>
> Note that the context consists of states of registers which are
> controlled by the hypervisor. Other GIC registers which are accessible
> by guests are saved/restored on context switch.
>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
>  xen/arch/arm/gic-v3.c | 233 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 233 insertions(+)
>
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index cd3e1acf79..a9b65ff5d4 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -1776,6 +1776,231 @@ static bool gic_dist_supports_lpis(void)
>      return (readl_relaxed(GICD + GICD_TYPER) & GICD_TYPE_LPIS);
>  }
>  
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +
> +/* GICv3 registers to be saved/restored on system suspend/resume */
> +struct gicv3_ctx {
> +    struct dist_ctx {
> +        uint32_t ctlr;
> +        /*
> +         * This struct represent block of 32 IRQs
> +         * TODO: store extended SPI configuration (GICv3.1+)
> +         */
> +        struct irq_regs {
> +            uint32_t icfgr[2];
> +            uint32_t ipriorityr[8];
> +            uint64_t irouter[32];
> +            uint32_t isactiver;
> +            uint32_t isenabler;
> +        } *irqs;
> +    } dist;
> +
> +    /* have only one rdist structure for last running CPU during suspend */
> +    struct redist_ctx {
> +        uint32_t ctlr;
> +        /* TODO: handle case when we have more than 16 PPIs (GICv3.1+) */
> +        uint32_t icfgr[2];
> +        uint32_t igroupr;
> +        uint32_t ipriorityr[8];
> +        uint32_t isactiver;
> +        uint32_t isenabler;
> +    } rdist;
> +
> +    struct cpu_ctx {
> +        uint32_t ctlr;
> +        uint32_t pmr;
> +        uint32_t bpr;
> +        uint32_t sre_el2;
> +        uint32_t grpen;
> +    } cpu;
> +};
> +
> +static struct gicv3_ctx gicv3_ctx;
> +
> +static void __init gicv3_alloc_context(void)
> +{
> +    uint32_t blocks = DIV_ROUND_UP(gicv3_info.nr_lines, 32);
> +
> +    if ( gicv3_its_host_has_its() )
> +        return;

I think this needs a comment at least. And/or printk() message. Because
for it is unclear why we are doing nothing if host has ITS

> +
> +    /* according to spec it is possible don't have SPIs */
> +    if ( blocks == 1 )
> +        return;
> +
> +    gicv3_ctx.dist.irqs = xzalloc_array(typeof(*gicv3_ctx.dist.irqs), blocks - 1);
> +    if ( !gicv3_ctx.dist.irqs )
> +        dprintk(XENLOG_ERR,
> +                "%s:%d: failed to allocate memory for GICv3 suspend context\n",
> +                __func__, __LINE__);

dprintk() already prints function and line. Here and everywhere in this
patch.

> +}
> +
> +static void gicv3_disable_redist(void)
> +{
> +    void __iomem* waker = GICD_RDIST_BASE + GICR_WAKER;
> +
> +    writel_relaxed(readl_relaxed(waker) | GICR_WAKER_ProcessorSleep, waker);
> +    while ( (readl_relaxed(waker) & GICR_WAKER_ChildrenAsleep) == 0 );
> +}
> +
> +static int gicv3_suspend(void)
> +{
> +    unsigned int i;
> +    void __iomem *base;
> +    typeof(gicv3_ctx.rdist)* rdist = &gicv3_ctx.rdist;
> +
> +    /* TODO: implement support for ITS */
> +    if ( gicv3_its_host_has_its() )
> +        return -EOPNOTSUPP;
> +
> +    if ( !gicv3_ctx.dist.irqs && gicv3_info.nr_lines > NR_GIC_LOCAL_IRQS )
> +    {
> +        dprintk(XENLOG_WARNING,
> +                "%s:%d: GICv3 suspend context is not allocated!\n",
> +                __func__, __LINE__);
> +        return -ENOMEM;
> +    }
> +
> +    gicv3_save_state(current);
> +
> +    /* Save GICC configuration */
> +    gicv3_ctx.cpu.ctlr     = READ_SYSREG(ICC_CTLR_EL1);
> +    gicv3_ctx.cpu.pmr      = READ_SYSREG(ICC_PMR_EL1);
> +    gicv3_ctx.cpu.bpr      = READ_SYSREG(ICC_BPR1_EL1);
> +    gicv3_ctx.cpu.sre_el2  = READ_SYSREG(ICC_SRE_EL2);
> +    gicv3_ctx.cpu.grpen    = READ_SYSREG(ICC_IGRPEN1_EL1);
> +
> +    gicv3_disable_interface();
> +    gicv3_disable_redist();
> +
> +    /* Save GICR configuration */
> +    gicv3_redist_wait_for_rwp();
> +
> +    base = GICD_RDIST_SGI_BASE;
> +
> +    rdist->ctlr = readl_relaxed(base + GICR_CTLR);
> +
> +    /* Set priority on PPI and SGI interrupts */

Probably you wanted to say "Save priority..."

> +    for (i = 0; i < NR_GIC_LOCAL_IRQS / 4; i += 4)
> +        rdist->ipriorityr[i] = readl_relaxed(base + GICR_IPRIORITYR0 + 4 * i);

Is this correct? You are writing to every 4th rdist->ipriorityr and
reading every 4th GICR_IPRIORITYR<n>

> +
> +    rdist->isactiver = readl_relaxed(base + GICR_ISACTIVER0);
> +    rdist->isenabler = readl_relaxed(base + GICR_ISENABLER0);
> +    rdist->igroupr   = readl_relaxed(base + GICR_IGROUPR0);
> +    rdist->icfgr[0]  = readl_relaxed(base + GICR_ICFGR0);
> +    rdist->icfgr[1]  = readl_relaxed(base + GICR_ICFGR1);
> +
> +    /* Save GICD configuration */
> +    gicv3_dist_wait_for_rwp();
> +    gicv3_ctx.dist.ctlr = readl_relaxed(GICD + GICD_CTLR);
> +
> +    for ( i = 1; i < DIV_ROUND_UP(gicv3_info.nr_lines, 32); i++ )
> +    {
> +        typeof(gicv3_ctx.dist.irqs) irqs = gicv3_ctx.dist.irqs + i - 1;
> +        unsigned int irq;
> +
> +        base = GICD + GICD_ICFGR + 8 * i;
> +        irqs->icfgr[0] = readl_relaxed(base);
> +        irqs->icfgr[1] = readl_relaxed(base + 4);
> +
> +        base = GICD + GICD_IPRIORITYR + 32 * i;
> +        for ( irq = 0; irq < 8; irq++ )
> +            irqs->ipriorityr[irq] = readl_relaxed(base + 4 * irq);
> +
> +        base = GICD + GICD_IROUTER + 32 * i;
> +        for ( irq = 0; irq < 32; irq++ )
> +            irqs->irouter[irq] = readq_relaxed_non_atomic(base + 8 * irq);
> +
> +        irqs->isactiver = readl_relaxed(GICD + GICD_ISACTIVER + 4 * i);
> +        irqs->isenabler = readl_relaxed(GICD + GICD_ISENABLER + 4 * i);
> +    }
> +
> +    return 0;
> +}
> +
> +static void gicv3_resume(void)
> +{
> +    unsigned int i;
> +    void __iomem *base;
> +    typeof(gicv3_ctx.rdist)* rdist = &gicv3_ctx.rdist;
> +
> +    if ( !gicv3_ctx.dist.irqs && gicv3_info.nr_lines > NR_GIC_LOCAL_IRQS )
> +    {
> +        dprintk(XENLOG_WARNING, "%s:%d: GICv3 suspend context not allocated!\n",
> +            __func__, __LINE__);
> +        return;
> +    }
> +
> +    writel_relaxed(0, GICD + GICD_CTLR);
> +
> +    for ( i = NR_GIC_LOCAL_IRQS; i < gicv3_info.nr_lines; i += 32 )
> +        writel_relaxed(GENMASK(31, 0), GICD + GICD_IGROUPR + (i / 32) * 4);
> +
> +    for ( i = 1; i < DIV_ROUND_UP(gicv3_info.nr_lines, 32); i++ )
> +    {
> +        typeof(gicv3_ctx.dist.irqs) irqs = gicv3_ctx.dist.irqs + i - 1;
> +        unsigned int irq;
> +
> +        base = GICD + GICD_ICFGR + 8 * i;
> +        writel_relaxed(irqs->icfgr[0], base);
> +        writel_relaxed(irqs->icfgr[1], base + 4);
> +
> +        base = GICD + GICD_IPRIORITYR + 32 * i;
> +        for ( irq = 0; irq < 8; irq++ )
> +            writel_relaxed(irqs->ipriorityr[irq], base + 4 * irq );

style: space before )

> +
> +        base = GICD + GICD_IROUTER + 32 * i;
> +        for ( irq = 0; irq < 32; irq++ )
> +            writeq_relaxed_non_atomic(irqs->irouter[irq], base + 8 * irq);
> +
> +        writel_relaxed(irqs->isenabler, GICD + GICD_ISENABLER + i * 4);
> +        writel_relaxed(irqs->isactiver, GICD + GICD_ISACTIVER + i * 4);
> +    }
> +
> +    writel_relaxed(gicv3_ctx.dist.ctlr, GICD + GICD_CTLR);
> +    gicv3_dist_wait_for_rwp();
> +
> +    /* Restore GICR (Redistributor) configuration */
> +    gicv3_enable_redist();
> +
> +    base = GICD_RDIST_SGI_BASE;
> +
> +    writel_relaxed(0xffffffff, base + GICR_ICENABLER0);
> +    gicv3_redist_wait_for_rwp();
> +
> +    for (i = 0; i < NR_GIC_LOCAL_IRQS / 4; i += 4)
> +        writel_relaxed(rdist->ipriorityr[i], base + GICR_IPRIORITYR0 + i * 4);

Is this correct? You are writing to every 4th GICR_IPRIORITYR<n>

> +
> +    writel_relaxed(rdist->isactiver, base + GICR_ISACTIVER0);
> +
> +    writel_relaxed(rdist->igroupr,  base + GICR_IGROUPR0);
> +    writel_relaxed(rdist->icfgr[0], base + GICR_ICFGR0);
> +    writel_relaxed(rdist->icfgr[1], base + GICR_ICFGR1);
> +
> +    gicv3_redist_wait_for_rwp();
> +
> +    writel_relaxed(rdist->isenabler, base + GICR_ISENABLER0);
> +    writel_relaxed(rdist->ctlr, GICD_RDIST_BASE + GICR_CTLR);
> +
> +    gicv3_redist_wait_for_rwp();
> +
> +    WRITE_SYSREG(gicv3_ctx.cpu.sre_el2, ICC_SRE_EL2);
> +    isb();
> +
> +    /* Restore CPU interface (System registers) */
> +    WRITE_SYSREG(gicv3_ctx.cpu.pmr,   ICC_PMR_EL1);
> +    WRITE_SYSREG(gicv3_ctx.cpu.bpr,   ICC_BPR1_EL1);
> +    WRITE_SYSREG(gicv3_ctx.cpu.ctlr,  ICC_CTLR_EL1);
> +    WRITE_SYSREG(gicv3_ctx.cpu.grpen, ICC_IGRPEN1_EL1);
> +    isb();
> +
> +    gicv3_hyp_init();
> +
> +    gicv3_restore_state(current);
> +}
> +
> +#endif /* CONFIG_SYSTEM_SUSPEND */
> +
>  /* Set up the GIC */
>  static int __init gicv3_init(void)
>  {
> @@ -1850,6 +2075,10 @@ static int __init gicv3_init(void)
>  
>      gicv3_hyp_init();
>  
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +    gicv3_alloc_context();
> +#endif
> +
>  out:
>      spin_unlock(&gicv3.lock);
>  
> @@ -1889,6 +2118,10 @@ static const struct gic_hw_operations gicv3_ops = {
>  #endif
>      .iomem_deny_access   = gicv3_iomem_deny_access,
>      .do_LPI              = gicv3_do_LPI,
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +    .suspend             = gicv3_suspend,
> +    .resume              = gicv3_resume,
> +#endif
>  };
>  
>  static int __init gicv3_dt_preinit(struct dt_device_node *node, const void *data)

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 04/12] xen/arm: Prevent crash during disable_nonboot_cpus on suspend
  2025-08-11 20:48 ` [PATCH v5 04/12] xen/arm: Prevent crash during disable_nonboot_cpus on suspend Mykola Kvach
@ 2025-08-23  0:36   ` Volodymyr Babchuk
  2025-08-26 13:42     ` Mykola Kvach
  0 siblings, 1 reply; 38+ messages in thread
From: Volodymyr Babchuk @ 2025-08-23  0:36 UTC (permalink / raw)
  To: Mykola Kvach
  Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Michal Orzel


Hi Mykola,

Mykola Kvach <xakep.amatop@gmail.com> writes:

While I approve the change, the commit message is somewhat
unclear. Maybe "Don't release IRQs on suspend" will be better?

> From: Mykola Kvach <mykola_kvach@epam.com>
>
> If we call disable_nonboot_cpus on ARM64 with system_state set
> to SYS_STATE_suspend, the following assertion will be triggered:
>
> ```
> (XEN) [   25.582712] Disabling non-boot CPUs ...
> (XEN) [   25.587032] Assertion '!in_irq() && (local_irq_is_enabled() || num_online_cpus() <= 1)' failed at common/xmalloc_tlsf.c:714
> [...]
> (XEN) [   25.975069] Xen call trace:
> (XEN) [   25.978353]    [<00000a000022e098>] xfree+0x130/0x1a4 (PC)
> (XEN) [   25.984314]    [<00000a000022e08c>] xfree+0x124/0x1a4 (LR)
> (XEN) [   25.990276]    [<00000a00002747d4>] release_irq+0xe4/0xe8
> (XEN) [   25.996152]    [<00000a0000278588>] time.c#cpu_time_callback+0x44/0x60
> (XEN) [   26.003150]    [<00000a000021d678>] notifier_call_chain+0x7c/0xa0
> (XEN) [   26.009717]    [<00000a00002018e0>] cpu.c#cpu_notifier_call_chain+0x24/0x48
> (XEN) [   26.017148]    [<00000a000020192c>] cpu.c#_take_cpu_down+0x28/0x34
> (XEN) [   26.023801]    [<00000a0000201944>] cpu.c#take_cpu_down+0xc/0x18
> (XEN) [   26.030281]    [<00000a0000225c5c>] stop_machine.c#stopmachine_action+0xbc/0xe4
> (XEN) [   26.038057]    [<00000a00002264bc>] tasklet.c#do_tasklet_work+0xb8/0x100
> (XEN) [   26.045229]    [<00000a00002268a4>] do_tasklet+0x68/0xb0
> (XEN) [   26.051018]    [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
> (XEN) [   26.057585]    [<00000a0000277e30>] start_secondary+0x21c/0x220
> (XEN) [   26.063978]    [<00000a0000361258>] 00000a0000361258
> ```
>
> This happens because before invoking take_cpu_down via the stop_machine_run
> function on the target CPU, stop_machine_run requests
> the STOPMACHINE_DISABLE_IRQ state on that CPU. Releasing memory in
> the release_irq function then triggers the assertion:
>
> /*
>  * Heap allocations may need TLB flushes which may require IRQs to be
>  * enabled (except when only 1 PCPU is online).
>  */
>
> This patch adds system state checks to guard calls to request_irq
> and release_irq. These calls are now skipped when system_state is
> SYS_STATE_{resume,suspend}, preventing unsafe operations during
> suspend/resume handling.

If any call to release_irq() during suspend will trigger ASSERT, and it
is fine to leave IRQs as is during suspend, maybe it will be easier to
put

+        if ( system_state == SYS_STATE_suspend )
+            return;

straight into release_irq() code? This will be easier than playing
whack-a-mole when some other patch will add another release_irq() call
somewhere.


>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in V4:
>   - removed the prior tasklet-based workaround in favor of a more
>     straightforward and safer solution
>   - reworked the approach by adding explicit system state checks around
>     request_irq and release_irq calls, skips these calls during suspend
>     and resume states to avoid unsafe memory operations when IRQs are
>     disabled
> ---
>  xen/arch/arm/gic.c           |  6 ++++++
>  xen/arch/arm/tee/ffa_notif.c |  2 +-
>  xen/arch/arm/time.c          | 18 ++++++++++++------
>  3 files changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index a018bd7715..9856cb1592 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -388,6 +388,9 @@ void gic_dump_info(struct vcpu *v)
>  
>  void init_maintenance_interrupt(void)
>  {
> +    if ( system_state == SYS_STATE_resume )
> +        return;
> +
>      request_irq(gic_hw_ops->info->maintenance_irq, 0, maintenance_interrupt,
>                  "irq-maintenance", NULL);
>  }
> @@ -461,6 +464,9 @@ static int cpu_gic_callback(struct notifier_block *nfb,
>      switch ( action )
>      {
>      case CPU_DYING:
> +        if ( system_state == SYS_STATE_suspend )
> +            break;
> +
>          /* This is reverting the work done in init_maintenance_interrupt */
>          release_irq(gic_hw_ops->info->maintenance_irq, NULL);
>          break;
> diff --git a/xen/arch/arm/tee/ffa_notif.c b/xen/arch/arm/tee/ffa_notif.c
> index 00efaf8f73..06f715a82b 100644
> --- a/xen/arch/arm/tee/ffa_notif.c
> +++ b/xen/arch/arm/tee/ffa_notif.c
> @@ -347,7 +347,7 @@ void ffa_notif_init_interrupt(void)
>  {
>      int ret;
>  
> -    if ( notif_enabled && notif_sri_irq < NR_GIC_SGI )
> +    if ( notif_enabled && notif_sri_irq < NR_GIC_SGI && system_state != SYS_STATE_resume )
>      {
>          /*
>           * An error here is unlikely since the primary CPU has already
> diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
> index ad984fdfdd..b2e07ade43 100644
> --- a/xen/arch/arm/time.c
> +++ b/xen/arch/arm/time.c
> @@ -320,10 +320,13 @@ void init_timer_interrupt(void)
>      WRITE_SYSREG(CNTHCTL_EL2_EL1PCTEN, CNTHCTL_EL2);
>      disable_physical_timers();
>  
> -    request_irq(timer_irq[TIMER_HYP_PPI], 0, htimer_interrupt,
> -                "hyptimer", NULL);
> -    request_irq(timer_irq[TIMER_VIRT_PPI], 0, vtimer_interrupt,
> -                   "virtimer", NULL);
> +    if ( system_state != SYS_STATE_resume )
> +    {
> +        request_irq(timer_irq[TIMER_HYP_PPI], 0, htimer_interrupt,
> +                    "hyptimer", NULL);
> +        request_irq(timer_irq[TIMER_VIRT_PPI], 0, vtimer_interrupt,
> +                    "virtimer", NULL);
> +    }
>  
>      check_timer_irq_cfg(timer_irq[TIMER_HYP_PPI], "hypervisor");
>      check_timer_irq_cfg(timer_irq[TIMER_VIRT_PPI], "virtual");
> @@ -338,8 +341,11 @@ static void deinit_timer_interrupt(void)
>  {
>      disable_physical_timers();
>  
> -    release_irq(timer_irq[TIMER_HYP_PPI], NULL);
> -    release_irq(timer_irq[TIMER_VIRT_PPI], NULL);
> +    if ( system_state != SYS_STATE_suspend )
> +    {
> +        release_irq(timer_irq[TIMER_HYP_PPI], NULL);
> +        release_irq(timer_irq[TIMER_VIRT_PPI], NULL);
> +    }
>  }
>  
>  /* Wait a set number of microseconds */

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 05/12] xen/arm: irq: avoid local IRQ descriptors reinit on system resume
  2025-08-11 20:48 ` [PATCH v5 05/12] xen/arm: irq: avoid local IRQ descriptors reinit on system resume Mykola Kvach
@ 2025-08-23  0:37   ` Volodymyr Babchuk
  0 siblings, 0 replies; 38+ messages in thread
From: Volodymyr Babchuk @ 2025-08-23  0:37 UTC (permalink / raw)
  To: Mykola Kvach
  Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Michal Orzel

Mykola Kvach <xakep.amatop@gmail.com> writes:

> From: Mykola Kvach <mykola_kvach@epam.com>
>
> On ARM, during system resume, CPUs are brought online again. This normally
> triggers init_local_irq_data, which reinitializes IRQ descriptors for
> banked interrupts (SGIs and PPIs).
>
> These descriptors are statically allocated per CPU and retain valid
> state across suspend/resume cycles. Re-initializing them on resume is
> unnecessary and may result in loss of interrupt configuration or
> restored state.
>
> This patch skips init_local_irq_data when system_state is set to
> SYS_STATE_resume to preserve banked IRQ descs state during resume.
>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>

Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>

> ---
>  xen/arch/arm/irq.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> index 4bbf0b0664..148f184f8b 100644
> --- a/xen/arch/arm/irq.c
> +++ b/xen/arch/arm/irq.c
> @@ -122,6 +122,10 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
>      switch ( action )
>      {
>      case CPU_UP_PREPARE:
> +        /* Skip local IRQ cleanup on resume */
> +        if ( system_state == SYS_STATE_resume )
> +            break;
> +
>          rc = init_local_irq_data(cpu);
>          if ( rc )
>              printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 06/12] xen/arm: irq: Restore state of local IRQs during system resume
  2025-08-11 20:48 ` [PATCH v5 06/12] xen/arm: irq: Restore state of local IRQs during " Mykola Kvach
@ 2025-08-23  0:45   ` Volodymyr Babchuk
  2025-08-26 13:42     ` Mykola Kvach
  0 siblings, 1 reply; 38+ messages in thread
From: Volodymyr Babchuk @ 2025-08-23  0:45 UTC (permalink / raw)
  To: Mykola Kvach
  Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Michal Orzel

Hi Mykola,

Mykola Kvach <xakep.amatop@gmail.com> writes:

> From: Mykola Kvach <mykola_kvach@epam.com>
>
> On ARM, the first 32 interrupts (SGIs and PPIs) are banked per-CPU
> and not restored by gic_resume (for secondary cpus).
>
> This patch introduces restore_local_irqs_on_resume, a function that
> restores the state of local interrupts on the target CPU during
> system resume.
>
> It iterates over all local IRQs and re-enables those that were not
> disabled, reprogramming their routing and affinity accordingly.
>
> The function is invoked from start_secondary, ensuring that local IRQ
> state is restored early during CPU bring-up after suspend.
>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
>  xen/arch/arm/irq.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 44 insertions(+)
>
> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> index 148f184f8b..b3ff38dc27 100644
> --- a/xen/arch/arm/irq.c
> +++ b/xen/arch/arm/irq.c
> @@ -113,6 +113,47 @@ static int init_local_irq_data(unsigned int cpu)
>      return 0;
>  }
>  
> +/*
> + * The first 32 interrupts (PPIs and SGIs) are per-CPU,
> + * so call this function on the target CPU to restore them.
> + *
> + * SPIs are restored via gic_resume.
> + */
> +static void restore_local_irqs_on_resume(void)
> +{
> +    int irq;
> +
> +    if ( system_state != SYS_STATE_resume )
> +        return;

Maybe it is better to move this check to restore_local_irqs_on_resume()
caller? You can put ASSERT(system_state == SYS_STATE_resume) there
instead.

I am saying this because name of restore_local_irqs_on_resume() implies that it
should be called only on resume.

> +
> +    spin_lock(&local_irqs_type_lock);
> +
> +    for ( irq = 0; irq < NR_LOCAL_IRQS; irq++ )
> +    {
> +        struct irq_desc *desc = irq_to_desc(irq);
> +
> +        spin_lock(&desc->lock);
> +
> +        if ( test_bit(_IRQ_DISABLED, &desc->status) )
> +        {
> +            spin_unlock(&desc->lock);
> +            continue;
> +        }
> +
> +        /* it is needed to avoid asserts in below calls */
> +        set_bit(_IRQ_DISABLED, &desc->status);

Assert in the call below isn't just because. You need to either call
desc->handler->disable() to properly disable the IRQ or justify why it
is fine to just set the bit.

> +
> +        gic_route_irq_to_xen(desc, GIC_PRI_IRQ);
> +
> +        /* _IRQ_DISABLED is cleared by below call */
> +        desc->handler->startup(desc);
> +
> +        spin_unlock(&desc->lock);
> +    }
> +
> +    spin_unlock(&local_irqs_type_lock);
> +}
> +
>  static int cpu_callback(struct notifier_block *nfb, unsigned long action,
>                          void *hcpu)
>  {
> @@ -131,6 +172,9 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
>              printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
>                     cpu);
>          break;
> +    case CPU_STARTING:
> +        restore_local_irqs_on_resume();
> +        break;
>      }
>  
>      return notifier_from_errno(rc);

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 07/12] xen/arm: Add support for system suspend triggered by hardware domain
  2025-08-12  7:18   ` Jan Beulich
@ 2025-08-23  0:53     ` Volodymyr Babchuk
  0 siblings, 0 replies; 38+ messages in thread
From: Volodymyr Babchuk @ 2025-08-23  0:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Mykola Kvach, Stefano Stabellini, Mirela Simonovic, Julien Grall,
	Bertrand Marquis, Michal Orzel, Andrew Cooper, Anthony PERARD,
	Roger Pau Monné, Saeed Nowshadi, Mykyta Poturai,
	Mykola Kvach, xen-devel@lists.xenproject.org

Hi Jan,

Jan Beulich <jbeulich@suse.com> writes:

> On 11.08.2025 22:48, Mykola Kvach wrote:
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -1311,7 +1311,11 @@ int domain_shutdown(struct domain *d, u8 reason)
>>          d->shutdown_code = reason;
>>      reason = d->shutdown_code;
>>  
>> +#if defined(CONFIG_SYSTEM_SUSPEND) && defined(CONFIG_ARM)
>> +    if ( reason != SHUTDOWN_suspend && is_hardware_domain(d) )
>> +#else
>>      if ( is_hardware_domain(d) )
>> +#endif
>>          hwdom_shutdown(reason);
>
> There better wouldn't be any #ifdef here. Afaict you can easily use
> IS_ENABLED(CONFIG_SYSTEM_SUSPEND). For CONFIG_ARM, though, I think some
> new abstraction will need adding. E.g. HAS_HWDOM_SUSPEND (with want for
> a better, yet not overly long name).
>
> With the hwdom / ctldom separation work in mind I wonder though if it's
> really hwdom to be in charge of initiating system suspend. Imo
> conceptually that rather would want to be ctldom. Stefano?

I am not Stefano, but IMO, this should be ctldom. But only after hwdom
is already suspended, otherwise we'll have problems with devices.

It is unclear which entity should check if hwdom is already suspended,
though. Should be it be ctldom or Xen itself?

And if we already speaking of devices... There can be domains with
passed-through devices. Those domains should be suspended properly, not
just freezed. I think this requires another TODO.

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 07/12] xen/arm: Add support for system suspend triggered by hardware domain
  2025-08-11 20:48 ` [PATCH v5 07/12] xen/arm: Add support for system suspend triggered by hardware domain Mykola Kvach
  2025-08-12  7:18   ` Jan Beulich
@ 2025-08-23  1:00   ` Volodymyr Babchuk
  2025-08-26 13:42     ` Mykola Kvach
  1 sibling, 1 reply; 38+ messages in thread
From: Volodymyr Babchuk @ 2025-08-23  1:00 UTC (permalink / raw)
  To: Mykola Kvach
  Cc: xen-devel@lists.xenproject.org, Mirela Simonovic,
	Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Andrew Cooper, Anthony PERARD, Jan Beulich, Roger Pau Monné,
	Saeed Nowshadi, Mykyta Poturai, Mykola Kvach

Hi Mykola,

Mykola Kvach <xakep.amatop@gmail.com> writes:

> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>
> Trigger Xen suspend when the hardware domain initiates suspend via
> SHUTDOWN_suspend. Redirect system suspend to CPU#0 to ensure the
> suspend logic runs on the boot CPU, as required.
>
> Introduce full suspend/resume infrastructure gated by CONFIG_SYSTEM_SUSPEND,
> including logic to:
>  - disable and enable non-boot physical CPUs
>  - freeze and thaw domains
>  - suspend and resume the GIC, timer and console
>  - maintain system state before and after suspend
>
> Remove the restriction in the PSCI interface preventing suspend from the
> hardware domain.
>
> Select HAS_SYSTEM_SUSPEND for ARM_64.
>
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in v5:
> - select HAS_SYSTEM_SUSPEND in ARM_64 instead of ARM
> - check llc_coloring_enabled instead of LLC_COLORING during the selection
>   of HAS_SYSTEM_SUSPEND config
> - call host_system_suspend from guest PSCI system suspend instead of
>   arch_domain_shutdown, reducing the complexity of the new code
> - update some comments
>
> Changes introduced in V4:
>   - drop code for saving and restoring VCPU context (for more information
>     refer part 1 of patch series)
>   - remove IOMMU suspend and resume calls until they will be implemented
>   - move system suspend logic to arch_domain_shutdown, invoked from
>     domain_shutdown
>   - apply minor fixes such as renaming functions, updating comments, and
>     modifying the commit message to reflect these changes
>   - add console_end_sync to resume path after system suspend
>
> Changes introduced in V3:
>   - merge changes from other commits into this patch (previously stashed):
>     1) disable/enable non-boot physical CPUs and freeze/thaw domains during
>        suspend/resume
>     2) suspend/resume the timer, GIC, console, IOMMU, and hardware domain
> ---
>  xen/arch/arm/Kconfig               |   1 +
>  xen/arch/arm/Makefile              |   1 +
>  xen/arch/arm/include/asm/suspend.h |  22 +++++
>  xen/arch/arm/suspend.c             | 151 +++++++++++++++++++++++++++++
>  xen/arch/arm/vpsci.c               |  12 ++-
>  xen/common/domain.c                |   4 +
>  6 files changed, 190 insertions(+), 1 deletion(-)
>  create mode 100644 xen/arch/arm/include/asm/suspend.h
>  create mode 100644 xen/arch/arm/suspend.c
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index a0c8160474..ccdbaa5bc3 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -8,6 +8,7 @@ config ARM_64
>  	depends on !ARM_32
>  	select 64BIT
>  	select HAS_FAST_MULTIPLY
> +	select HAS_SYSTEM_SUSPEND if UNSUPPORTED
>  	select HAS_VPCI_GUEST_SUPPORT if PCI_PASSTHROUGH
>  
>  config ARM
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index f833cdf207..3f6247adee 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -51,6 +51,7 @@ obj-y += setup.o
>  obj-y += shutdown.o
>  obj-y += smp.o
>  obj-y += smpboot.o
> +obj-$(CONFIG_SYSTEM_SUSPEND) += suspend.o
>  obj-$(CONFIG_SYSCTL) += sysctl.o
>  obj-y += time.o
>  obj-y += traps.o
> diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
> new file mode 100644
> index 0000000000..78d0e2383b
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/suspend.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef __ASM_ARM_SUSPEND_H__
> +#define __ASM_ARM_SUSPEND_H__
> +
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +
> +int host_system_suspend(void);
> +
> +#endif /* CONFIG_SYSTEM_SUSPEND */
> +
> +#endif /* __ASM_ARM_SUSPEND_H__ */
> +
> + /*
> +  * Local variables:
> +  * mode: C
> +  * c-file-style: "BSD"
> +  * c-basic-offset: 4
> +  * tab-width: 4
> +  * indent-tabs-mode: nil
> +  * End:
> +  */
> diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> new file mode 100644
> index 0000000000..abf4b928ce
> --- /dev/null
> +++ b/xen/arch/arm/suspend.c
> @@ -0,0 +1,151 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <xen/console.h>
> +#include <xen/cpu.h>
> +#include <xen/llc-coloring.h>
> +#include <xen/sched.h>
> +
> +/*
> + * TODO list:
> + *  - Test system suspend with LLC_COLORING enabled and verify functionality
> + *  - Implement IOMMU suspend/resume handlers and integrate them
> + *    into the suspend/resume path (IPMMU and SMMU)
> + *  - Enable "xl suspend" support on ARM architecture
> + *  - Properly disable Xen timer watchdog from relevant services
> + *  - Add suspend/resume CI test for ARM (QEMU if feasible)
> + *  - Investigate feasibility and need for implementing system suspend on ARM32
> + */
> +
> +/* Xen suspend. Note: data is not used (suspend is the suspend to RAM) */
> +static long system_suspend(void *data)
> +{
> +    int status;
> +    unsigned long flags;
> +
> +    BUG_ON(system_state != SYS_STATE_active);
> +
> +    system_state = SYS_STATE_suspend;
> +    freeze_domains();
> +    scheduler_disable();
> +
> +    /*
> +     * Non-boot CPUs have to be disabled on suspend and enabled on resume
> +     * (hotplug-based mechanism). Disabling non-boot CPUs will lead to PSCI
> +     * CPU_OFF to be called by each non-boot CPU. Depending on the underlying
> +     * platform capabilities, this may lead to the physical powering down of
> +     * CPUs. Tested on Xilinx Zynq Ultrascale+ MPSoC (including power down of
> +     * each non-boot CPU).

I don't think that the last part of the comment is relevant in upstream.

> +     */
> +    status = disable_nonboot_cpus();
> +    if ( status )
> +    {
> +        system_state = SYS_STATE_resume;
> +        goto resume_nonboot_cpus;
> +    }
> +
> +    time_suspend();
> +
> +    local_irq_save(flags);
> +    status = gic_suspend();
> +    if ( status )
> +    {
> +        system_state = SYS_STATE_resume;
> +        goto resume_irqs;
> +    }
> +
> +    printk("Xen suspending...\n");
> +
> +    console_start_sync();
> +    status = console_suspend();
> +    if ( status )
> +    {
> +        dprintk(XENLOG_ERR, "Failed to suspend the console, err=%d\n", status);
> +        system_state = SYS_STATE_resume;
> +        goto resume_console;
> +    }
> +
> +    /*
> +     * Enable identity mapping before entering suspend to simplify
> +     * the resume path
> +     */
> +    update_boot_mapping(true);
> +

This puzzles me. I expected actually PSCI suspend call here.

> +    system_state = SYS_STATE_resume;
> +    update_boot_mapping(false);
> +
> + resume_console:
> +    console_resume();
> +    console_end_sync();
> +
> +    gic_resume();
> +
> + resume_irqs:
> +    local_irq_restore(flags);
> +    time_resume();
> +
> + resume_nonboot_cpus:
> +    /*
> +     * The rcu_barrier() has to be added to ensure that the per cpu area is
> +     * freed before a non-boot CPU tries to initialize it (_free_percpu_area()
> +     * has to be called before the init_percpu_area()). This scenario occurs
> +     * when non-boot CPUs are hot-unplugged on suspend and hotplugged on resume.
> +     */
> +    rcu_barrier();
> +    enable_nonboot_cpus();
> +    scheduler_enable();
> +    thaw_domains();
> +
> +    system_state = SYS_STATE_active;
> +
> +    /* The resume of hardware domain should always follow Xen's resume. */
> +    domain_resume(hardware_domain);
> +
> +    printk("Resume (status %d)\n", status);
> +    return status;
> +}
> +
> +int host_system_suspend(void)
> +{
> +    int status;
> +
> +    /* TODO: drop check after verification that features can work together */
> +    if ( llc_coloring_enabled )
> +    {
> +        dprintk(XENLOG_ERR,
> +                "System suspend is not supported with LLC_COLORING enabled\n");
> +        return -ENOSYS;
> +    }
> +
> +    /*
> +     * system_suspend should be called when Dom0 finalizes the suspend
> +     * procedure from its boot core (VCPU#0). However, Dom0's VCPU#0 could
> +     * be mapped to any PCPU (this function could be executed by any PCPU).
> +     * The suspend procedure has to be finalized by the PCPU#0 (non-boot
> +     * PCPUs will be disabled during the suspend).
> +     */
> +    status = continue_hypercall_on_cpu(0, system_suspend, NULL);
> +
> +    /*
> +     * If an error happened, there is nothing that needs to be done here
> +     * because the system_suspend always returns in fully functional state
> +     * no matter what the outcome of suspend procedure is. If the system
> +     * suspended successfully the function will return 0 after the resume.
> +     * Otherwise, if an error is returned it means Xen did not suspended,
> +     * but it is still in the same state as if the system_suspend was never
> +     * called. We dump a debug message in case of an error for debugging/
> +     * logging purpose.
> +     */
> +    if ( status )
> +        dprintk(XENLOG_ERR, "Failed to suspend, errno=%d\n", status);
> +
> +    return status;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
> index 67d369a8a2..757e719ea7 100644
> --- a/xen/arch/arm/vpsci.c
> +++ b/xen/arch/arm/vpsci.c
> @@ -4,6 +4,7 @@
>  #include <xen/types.h>
>  
>  #include <asm/current.h>
> +#include <asm/suspend.h>
>  #include <asm/vgic.h>
>  #include <asm/vpsci.h>
>  #include <asm/event.h>
> @@ -214,9 +215,10 @@ static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
>      struct vcpu *v;
>      struct domain *d = current->domain;
>  
> -    /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
> +#ifndef CONFIG_SYSTEM_SUSPEND
>      if ( is_hardware_domain(d) )
>          return PSCI_NOT_SUPPORTED;
> +#endif
>  
>      /* Ensure that all CPUs other than the calling one are offline */
>      domain_lock(d);
> @@ -234,6 +236,14 @@ static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
>      if ( rc )
>          return PSCI_DENIED;
>  
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +    if ( is_hardware_domain(d) && host_system_suspend() )
> +    {
> +        domain_resume_nopause(d);
> +        return PSCI_DENIED;
> +    }
> +#endif
> +
>      rc = do_setup_vcpu_ctx(current, epoint, cid);
>      if ( rc != PSCI_SUCCESS )
>      {
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index c3609b0cb0..414a691242 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -1311,7 +1311,11 @@ int domain_shutdown(struct domain *d, u8 reason)
>          d->shutdown_code = reason;
>      reason = d->shutdown_code;
>  
> +#if defined(CONFIG_SYSTEM_SUSPEND) && defined(CONFIG_ARM)
> +    if ( reason != SHUTDOWN_suspend && is_hardware_domain(d) )
> +#else
>      if ( is_hardware_domain(d) )
> +#endif
>          hwdom_shutdown(reason);
>  
>      if ( d->is_shutting_down )

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 08/12] xen/arm: Implement PSCI SYSTEM_SUSPEND call (host interface)
  2025-08-11 20:48 ` [PATCH v5 08/12] xen/arm: Implement PSCI SYSTEM_SUSPEND call (host interface) Mykola Kvach
@ 2025-08-23  1:06   ` Volodymyr Babchuk
  2025-08-26 13:42     ` Mykola Kvach
  0 siblings, 1 reply; 38+ messages in thread
From: Volodymyr Babchuk @ 2025-08-23  1:06 UTC (permalink / raw)
  To: Mykola Kvach
  Cc: xen-devel@lists.xenproject.org, Mirela Simonovic,
	Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Saeed Nowshadi, Mykyta Poturai, Mykola Kvach


Hi Mykola,

Sequence of next 3 patches (and previous one) really puzzles me. Can you
first implement hyp_resume() function, then add PSCI_SYSTEM_SUSPEND call
and only then implement system_suspend() function? Why do this backwards?

Mykola Kvach <xakep.amatop@gmail.com> writes:

> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>
> Invoke PSCI SYSTEM_SUSPEND to finalize Xen's suspend sequence on ARM64 platforms.
> Pass the resume entry point (hyp_resume) as the first argument to EL3. The resume
> handler is currently a stub and will be implemented later in assembly. Ignore the
> context ID argument, as is done in Linux.
>
> Only enable this path when CONFIG_SYSTEM_SUSPEND is set and
> PSCI version is >= 1.0.
>
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in v4:
>   - select the appropriate PSCI SYSTEM_SUSPEND function ID based on platform
>   - update comments and commit message to reflect recent changes
>
> Changes in v3:
>   - return PSCI_NOT_SUPPORTED instead of a hardcoded 1 on ARM32
>   - check PSCI version before invoking SYSTEM_SUSPEND in call_psci_system_suspend
> ---
>  xen/arch/arm/arm64/head.S          |  8 ++++++++
>  xen/arch/arm/include/asm/psci.h    |  1 +
>  xen/arch/arm/include/asm/suspend.h |  2 ++
>  xen/arch/arm/psci.c                | 23 ++++++++++++++++++++++-
>  xen/arch/arm/suspend.c             |  5 +++++
>  5 files changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index 72c7b24498..3522c497c5 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -561,6 +561,14 @@ END(efi_xen_start)
>  
>  #endif /* CONFIG_ARM_EFI */
>  
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +
> +FUNC(hyp_resume)
> +        b .
> +END(hyp_resume)
> +
> +#endif /* CONFIG_SYSTEM_SUSPEND */
> +
>  /*
>   * Local variables:
>   * mode: ASM
> diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
> index 48a93e6b79..bb3c73496e 100644
> --- a/xen/arch/arm/include/asm/psci.h
> +++ b/xen/arch/arm/include/asm/psci.h
> @@ -23,6 +23,7 @@ int call_psci_cpu_on(int cpu);
>  void call_psci_cpu_off(void);
>  void call_psci_system_off(void);
>  void call_psci_system_reset(void);
> +int call_psci_system_suspend(void);
>  
>  /* Range of allocated PSCI function numbers */
>  #define	PSCI_FNUM_MIN_VALUE                 _AC(0,U)
> diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
> index 78d0e2383b..55041a5d06 100644
> --- a/xen/arch/arm/include/asm/suspend.h
> +++ b/xen/arch/arm/include/asm/suspend.h
> @@ -7,6 +7,8 @@
>  
>  int host_system_suspend(void);
>  
> +void hyp_resume(void);
> +
>  #endif /* CONFIG_SYSTEM_SUSPEND */
>  
>  #endif /* __ASM_ARM_SUSPEND_H__ */
> diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
> index b6860a7760..c9d126b195 100644
> --- a/xen/arch/arm/psci.c
> +++ b/xen/arch/arm/psci.c
> @@ -17,17 +17,20 @@
>  #include <asm/cpufeature.h>
>  #include <asm/psci.h>
>  #include <asm/acpi.h>
> +#include <asm/suspend.h>
>  
>  /*
>   * While a 64-bit OS can make calls with SMC32 calling conventions, for
>   * some calls it is necessary to use SMC64 to pass or return 64-bit values.
> - * For such calls PSCI_0_2_FN_NATIVE(x) will choose the appropriate
> + * For such calls PSCI_*_FN_NATIVE(x) will choose the appropriate
>   * (native-width) function ID.
>   */
>  #ifdef CONFIG_ARM_64
>  #define PSCI_0_2_FN_NATIVE(name)    PSCI_0_2_FN64_##name
> +#define PSCI_1_0_FN_NATIVE(name)    PSCI_1_0_FN64_##name
>  #else
>  #define PSCI_0_2_FN_NATIVE(name)    PSCI_0_2_FN32_##name
> +#define PSCI_1_0_FN_NATIVE(name)    PSCI_1_0_FN32_##name
>  #endif
>  
>  uint32_t psci_ver;
> @@ -60,6 +63,24 @@ void call_psci_cpu_off(void)
>      }
>  }
>  
> +int call_psci_system_suspend(void)
> +{
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +    struct arm_smccc_res res;
> +
> +    if ( psci_ver < PSCI_VERSION(1, 0) )
> +        return PSCI_NOT_SUPPORTED;
> +
> +    /* 2nd argument (context ID) is not used */
> +    arm_smccc_smc(PSCI_1_0_FN_NATIVE(SYSTEM_SUSPEND), __pa(hyp_resume), &res);
> +    return PSCI_RET(res);
> +#else
> +    dprintk(XENLOG_WARNING,
> +            "SYSTEM_SUSPEND not supported (CONFIG_SYSTEM_SUSPEND disabled)\n");
> +    return PSCI_NOT_SUPPORTED;
> +#endif
> +}
> +
>  void call_psci_system_off(void)
>  {
>      if ( psci_ver > PSCI_VERSION(0, 1) )
> diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> index abf4b928ce..11e86b7f51 100644
> --- a/xen/arch/arm/suspend.c
> +++ b/xen/arch/arm/suspend.c
> @@ -1,5 +1,6 @@
>  /* SPDX-License-Identifier: GPL-2.0-only */
>  
> +#include <asm/psci.h>
>  #include <xen/console.h>
>  #include <xen/cpu.h>
>  #include <xen/llc-coloring.h>
> @@ -70,6 +71,10 @@ static long system_suspend(void *data)
>       */
>      update_boot_mapping(true);
>  
> +    status = call_psci_system_suspend();
> +    if ( status )
> +        dprintk(XENLOG_WARNING, "PSCI system suspend failed, err=%d\n", status);

So this is where missing call to PSCI_SYSTEM_SUSPEND is... 

> +
>      system_state = SYS_STATE_resume;
>      update_boot_mapping(false);

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 10/12] xen/arm: Save/restore context on suspend/resume
  2025-08-11 20:48 ` [PATCH v5 10/12] xen/arm: Save/restore context on suspend/resume Mykola Kvach
@ 2025-08-23 17:34   ` Volodymyr Babchuk
  2025-08-26 13:42     ` Mykola Kvach
  0 siblings, 1 reply; 38+ messages in thread
From: Volodymyr Babchuk @ 2025-08-23 17:34 UTC (permalink / raw)
  To: Mykola Kvach
  Cc: xen-devel@lists.xenproject.org, Mirela Simonovic,
	Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Saeed Nowshadi, Mykyta Poturai, Mykola Kvach


Hi Mykola,

Mykola Kvach <xakep.amatop@gmail.com> writes:

> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>
> The context of CPU general purpose and system control registers
> has to be saved on suspend and restored on resume. This is
> implemented in hyp_suspend and before the return from hyp_resume
> function. The hyp_suspend is invoked just before the PSCI system
> suspend call is issued to the ATF. The hyp_suspend has to return a
> non-zero value so that the calling 'if' statement evaluates to true,
> causing the system suspend to be invoked. Upon the resume, context
> saved on suspend will be restored, including the link register.
> Therefore, after restoring the context the control flow will
> return to the address pointed by the saved link register, which
> is the place from which the hyp_suspend was called. To ensure
> that the calling 'if' statement doesn't again evaluate to true
> and initiate system suspend, hyp_resume has to return a zero value
> after restoring the context.
>
> Note that the order of saving register context into cpu_context
> structure has to match the order of restoring.
>
> Support for ARM32 is not implemented. Instead, compilation fails with a
> build-time error if suspend is enabled for ARM32.
>
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in v4:
> - produce build-time error for ARM32 when CONFIG_SYSTEM_SUSPEND is enabled
> - use register_t instead of uint64_t in cpu_context structure
> ---
>  xen/arch/arm/arm64/head.S          | 91 +++++++++++++++++++++++++++++-
>  xen/arch/arm/include/asm/suspend.h | 20 +++++++
>  xen/arch/arm/suspend.c             | 23 +++++++-
>  3 files changed, 130 insertions(+), 4 deletions(-)
>
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index 596e960152..ad8b48de3a 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -562,6 +562,52 @@ END(efi_xen_start)
>  #endif /* CONFIG_ARM_EFI */
>  
>  #ifdef CONFIG_SYSTEM_SUSPEND
> +/*
> + * int hyp_suspend(struct cpu_context *ptr)
> + *
> + * x0 - pointer to the storage where callee's context will be saved
> + *
> + * CPU context saved here will be restored on resume in hyp_resume function.
> + * hyp_suspend shall return a non-zero value. Upon restoring context
> + * hyp_resume shall return value zero instead. From C code that invokes
> + * hyp_suspend, the return value is interpreted to determine whether the context
> + * is saved (hyp_suspend) or restored (hyp_resume).
> + */
> +FUNC(hyp_suspend)

I don't think that hyp_suspend is the correct name, as this function in
fact suspend_nothing. Maybe "prepare_resume_ctx" will be better?

> +        /* Store callee-saved registers */
> +        stp     x19, x20, [x0], #16

If you have struct cpu_context defined, then you probably should use
define provided by <asm-offsets.h> to access struct fields. Otherwise,
it will be really easy to get desync between struct definition and this
asm code.

> +        stp     x21, x22, [x0], #16
> +        stp     x23, x24, [x0], #16
> +        stp     x25, x26, [x0], #16
> +        stp     x27, x28, [x0], #16
> +        stp     x29, lr, [x0], #16
> +
> +        /* Store stack-pointer */
> +        mov     x2, sp
> +        str     x2, [x0], #8
> +
> +        /* Store system control registers */
> +        mrs     x2, VBAR_EL2
> +        str     x2, [x0], #8
> +        mrs     x2, VTCR_EL2
> +        str     x2, [x0], #8
> +        mrs     x2, VTTBR_EL2
> +        str     x2, [x0], #8
> +        mrs     x2, TPIDR_EL2
> +        str     x2, [x0], #8
> +        mrs     x2, MDCR_EL2
> +        str     x2, [x0], #8
> +        mrs     x2, HSTR_EL2
> +        str     x2, [x0], #8
> +        mrs     x2, CPTR_EL2
> +        str     x2, [x0], #8
> +        mrs     x2, HCR_EL2
> +        str     x2, [x0], #8
> +
> +        /* hyp_suspend must return a non-zero value */
> +        mov     x0, #1
> +        ret
> +END(hyp_suspend)
>  
>  FUNC(hyp_resume)
>          /* Initialize the UART if earlyprintk has been enabled. */
> @@ -580,7 +626,50 @@ FUNC(hyp_resume)
>          b     enable_secondary_cpu_mm
>  
>  mmu_resumed:
> -        b .
> +        /*
> +         * Now we can access the cpu_context, so restore the context here
> +         * TODO: can we reuse __context_switch and saved_context struct here ?
> +         */

This is a great idea and I like it very much, but sadly saved_context
struct has no fields for system _EL2 registers.

> +        ldr     x0, =cpu_context
> +
> +        /* Restore callee-saved registers */
> +        ldp     x19, x20, [x0], #16
> +        ldp     x21, x22, [x0], #16
> +        ldp     x23, x24, [x0], #16
> +        ldp     x25, x26, [x0], #16
> +        ldp     x27, x28, [x0], #16
> +        ldp     x29, lr, [x0], #16
> +
> +        /* Restore stack pointer */
> +        ldr     x2, [x0], #8
> +        mov     sp, x2
> +
> +        /* Restore system control registers */
> +        ldr     x2, [x0], #8
> +        msr     VBAR_EL2, x2
> +        ldr     x2, [x0], #8
> +        msr     VTCR_EL2, x2
> +        ldr     x2, [x0], #8
> +        msr     VTTBR_EL2, x2
> +        ldr     x2, [x0], #8
> +        msr     TPIDR_EL2, x2
> +        ldr     x2, [x0], #8
> +        msr     MDCR_EL2, x2
> +        ldr     x2, [x0], #8
> +        msr     HSTR_EL2, x2
> +        ldr     x2, [x0], #8
> +        msr     CPTR_EL2, x2
> +        ldr     x2, [x0], #8
> +        msr     HCR_EL2, x2
> +        isb
> +
> +        /* Since context is restored return from this function will appear as
> +         * return from hyp_suspend. To distinguish a return from hyp_suspend
> +         * which is called upon finalizing the suspend, as opposed to return
> +         * from this function which executes on resume, we need to return zero
> +         * value here. */
> +        mov x0, #0
> +        ret
>  END(hyp_resume)
>  
>  #endif /* CONFIG_SYSTEM_SUSPEND */
> diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
> index 55041a5d06..ae71ccb87b 100644
> --- a/xen/arch/arm/include/asm/suspend.h
> +++ b/xen/arch/arm/include/asm/suspend.h
> @@ -5,9 +5,29 @@
>  
>  #ifdef CONFIG_SYSTEM_SUSPEND
>  
> +#ifdef CONFIG_ARM_64
> +struct cpu_context {
> +    register_t callee_regs[12];
> +    register_t sp;
> +    register_t vbar_el2;
> +    register_t vtcr_el2;
> +    register_t vttbr_el2;
> +    register_t tpidr_el2;
> +    register_t mdcr_el2;
> +    register_t hstr_el2;
> +    register_t cptr_el2;
> +    register_t hcr_el2;
> +} __aligned(16);
> +#else
> +#error "Define cpu_context structure for arm32"
> +#endif
> +
> +extern struct cpu_context cpu_context;
> +
>  int host_system_suspend(void);
>  
>  void hyp_resume(void);
> +int hyp_suspend(struct cpu_context *ptr);
>  
>  #endif /* CONFIG_SYSTEM_SUSPEND */
>  
> diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> index 08b6acaede..b5398e5ca6 100644
> --- a/xen/arch/arm/suspend.c
> +++ b/xen/arch/arm/suspend.c
> @@ -1,6 +1,7 @@
>  /* SPDX-License-Identifier: GPL-2.0-only */
>  
>  #include <asm/psci.h>
> +#include <asm/suspend.h>
>  #include <xen/console.h>
>  #include <xen/cpu.h>
>  #include <xen/llc-coloring.h>
> @@ -17,6 +18,8 @@
>   *  - Investigate feasibility and need for implementing system suspend on ARM32
>   */
>  
> +struct cpu_context cpu_context;
> +
>  /* Xen suspend. Note: data is not used (suspend is the suspend to RAM) */
>  static long system_suspend(void *data)
>  {
> @@ -73,9 +76,23 @@ static long system_suspend(void *data)
>       */
>      update_boot_mapping(true);
>  
> -    status = call_psci_system_suspend();
> -    if ( status )
> -        dprintk(XENLOG_WARNING, "PSCI system suspend failed, err=%d\n", status);
> +    if ( hyp_suspend(&cpu_context) )
> +    {
> +        status = call_psci_system_suspend();
> +        /*
> +         * If suspend is finalized properly by above system suspend PSCI call,
> +         * the code below in this 'if' branch will never execute. Execution
> +         * will continue from hyp_resume which is the hypervisor's resume point.
> +         * In hyp_resume CPU context will be restored and since link-register is
> +         * restored as well, it will appear to return from hyp_suspend. The
> +         * difference in returning from hyp_suspend on system suspend versus
> +         * resume is in function's return value: on suspend, the return value is
> +         * a non-zero value, on resume it is zero. That is why the control flow
> +         * will not re-enter this 'if' branch on resume.
> +         */

Looks like this comment is misplaced. It should be before "if (
hyp_suspend() )", right?

> +        if ( status )
> +            dprintk(XENLOG_WARNING, "PSCI system suspend failed, err=%d\n", status);
> +    }
>  
>      system_state = SYS_STATE_resume;
>      update_boot_mapping(false);

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 11/12] iommu/ipmmu-vmsa: Implement suspend/resume callbacks
  2025-08-11 20:48 ` [PATCH v5 11/12] iommu/ipmmu-vmsa: Implement suspend/resume callbacks Mykola Kvach
@ 2025-08-23 17:48   ` Volodymyr Babchuk
  2025-08-26 13:42     ` Mykola Kvach
  0 siblings, 1 reply; 38+ messages in thread
From: Volodymyr Babchuk @ 2025-08-23 17:48 UTC (permalink / raw)
  To: Mykola Kvach
  Cc: xen-devel@lists.xenproject.org, Oleksandr Tyshchenko,
	Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Mykola Kvach


Hi Mykola,

Mykola Kvach <xakep.amatop@gmail.com> writes:

> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> Store and restore active context and micro-TLB registers.
>
> Tested on R-Car H3 Starter Kit.
>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
>  xen/drivers/passthrough/arm/ipmmu-vmsa.c | 269 +++++++++++++++++++++++
>  1 file changed, 269 insertions(+)
>
> diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> index dac0dd6d46..ced762657a 100644
> --- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> +++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> @@ -58,6 +58,8 @@
>  #define dev_print(dev, lvl, fmt, ...)    \
>      printk(lvl "ipmmu: %s: " fmt, dev_name(dev), ## __VA_ARGS__)
>  
> +#define dev_dbg(dev, fmt, ...)    \
> +    dev_print(dev, XENLOG_DEBUG, fmt, ## __VA_ARGS__)
>  #define dev_info(dev, fmt, ...)    \
>      dev_print(dev, XENLOG_INFO, fmt, ## __VA_ARGS__)
>  #define dev_warn(dev, fmt, ...)    \
> @@ -117,6 +119,23 @@ struct ipmmu_features {
>      unsigned int imuctr_ttsel_mask;
>  };
>  
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +
> +struct hw_register {
> +    const char *reg_name;

Do you really need to store register name? It is not used anywhere.

> +    unsigned int reg_offset;
> +    unsigned int reg_data;
> +};
> +
> +struct ipmmu_vmsa_backup {
> +    struct device *dev;
> +    unsigned int *utlbs_val;
> +    unsigned int *asids_val;
> +    struct list_head list;
> +};
> +
> +#endif
> +
>  /* Root/Cache IPMMU device's information */
>  struct ipmmu_vmsa_device {
>      struct device *dev;
> @@ -129,6 +148,9 @@ struct ipmmu_vmsa_device {
>      struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
>      unsigned int utlb_refcount[IPMMU_UTLB_MAX];
>      const struct ipmmu_features *features;
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +    struct hw_register *reg_backup[IPMMU_CTX_MAX];
> +#endif
>  };
>  
>  /*
> @@ -534,6 +556,235 @@ static void ipmmu_domain_free_context(struct ipmmu_vmsa_device *mmu,
>      spin_unlock_irqrestore(&mmu->lock, flags);
>  }
>  
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +
> +static DEFINE_SPINLOCK(ipmmu_devices_backup_lock);
> +static LIST_HEAD(ipmmu_devices_backup);
> +
> +#define HW_REGISTER_BACKUP_SIZE 4
> +
> +static struct hw_register root_pgtable[IPMMU_CTX_MAX][HW_REGISTER_BACKUP_SIZE] = {
> +    [0 ... (IPMMU_CTX_MAX - 1)] = {
> +        {"IMTTLBR0", IMTTLBR0, 0},
> +        {"IMTTUBR0", IMTTUBR0, 0},
> +        {"IMTTBCR",  IMTTBCR,  0},
> +        {"IMCTR",    IMCTR,    0},

Taking into account that only 4 registers needs to be saved, will it be
easier and more efficient to have a hardcoded struct like this?

struct ipmmu_reg_ctx {
    unsigned int imttlbr0;
    unsigned int imttubr0;
    unsigned int imttbcr;
    unsigned int imctr;
}

instead of hw_register[] ?

Especially taking into account that struct ipmmu_vmsa_backup{} does
exactly this.


> +    }
> +};
> +
> +static uint32_t ipmmu_imuasid_read(struct ipmmu_vmsa_device *mmu,
> +                                   unsigned int utlb)
> +{
> +    return ipmmu_read(mmu, ipmmu_utlb_reg(mmu, IMUASID(utlb)));
> +}
> +
> +static void ipmmu_utlbs_backup(struct ipmmu_vmsa_device *mmu)
> +{
> +    struct ipmmu_vmsa_backup *backup_data;
> +
> +    dev_dbg(mmu->dev, "Handle micro-TLBs backup\n");
> +
> +    spin_lock(&ipmmu_devices_backup_lock);
> +
> +    list_for_each_entry( backup_data, &ipmmu_devices_backup, list )
> +    {
> +        struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(backup_data->dev);
> +        unsigned int i;
> +
> +        if ( to_ipmmu(backup_data->dev) != mmu )
> +            continue;
> +
> +        for ( i = 0; i < fwspec->num_ids; i++ )
> +        {
> +            unsigned int utlb = fwspec->ids[i];
> +
> +            backup_data->asids_val[i] = ipmmu_imuasid_read(mmu, utlb);
> +            backup_data->utlbs_val[i] = ipmmu_imuctr_read(mmu, utlb);
> +        }
> +    }
> +
> +    spin_unlock(&ipmmu_devices_backup_lock);
> +}
> +
> +static void ipmmu_utlbs_restore(struct ipmmu_vmsa_device *mmu)
> +{
> +    struct ipmmu_vmsa_backup *backup_data;
> +
> +    dev_dbg(mmu->dev, "Handle micro-TLBs restore\n");
> +
> +    spin_lock(&ipmmu_devices_backup_lock);
> +
> +    list_for_each_entry( backup_data, &ipmmu_devices_backup, list )
> +    {
> +        struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(backup_data->dev);
> +        unsigned int i;
> +
> +        if ( to_ipmmu(backup_data->dev) != mmu )
> +            continue;
> +
> +        for ( i = 0; i < fwspec->num_ids; i++ )
> +        {
> +            unsigned int utlb = fwspec->ids[i];
> +
> +            ipmmu_imuasid_write(mmu, utlb, backup_data->asids_val[i]);
> +            ipmmu_imuctr_write(mmu, utlb, backup_data->utlbs_val[i]);
> +        }
> +    }
> +
> +    spin_unlock(&ipmmu_devices_backup_lock);
> +}
> +
> +static void ipmmu_domain_backup_context(struct ipmmu_vmsa_domain *domain)
> +{
> +    struct ipmmu_vmsa_device *mmu = domain->mmu->root;
> +    struct hw_register *reg = mmu->reg_backup[domain->context_id];
> +    unsigned int i;
> +
> +    dev_dbg(mmu->dev, "Handle domain context %u backup\n", domain->context_id);
> +
> +    for ( i = 0; i < HW_REGISTER_BACKUP_SIZE; i++ )
> +        reg[i].reg_data = ipmmu_ctx_read_root(domain, reg[i].reg_offset);
> +}
> +
> +static void ipmmu_domain_restore_context(struct ipmmu_vmsa_domain *domain)
> +{
> +    struct ipmmu_vmsa_device *mmu = domain->mmu->root;
> +    struct hw_register *reg = mmu->reg_backup[domain->context_id];
> +    unsigned int i;
> +
> +    dev_dbg(mmu->dev, "Handle domain context %u restore\n", domain->context_id);
> +
> +    for ( i = 0; i < HW_REGISTER_BACKUP_SIZE; i++ )
> +    {
> +        if ( reg[i].reg_offset != IMCTR )
> +            ipmmu_ctx_write_root(domain, reg[i].reg_offset, reg[i].reg_data);
> +        else
> +            ipmmu_ctx_write_all(domain, reg[i].reg_offset,
> +                                reg[i].reg_data | IMCTR_FLUSH);
> +    }
> +}
> +
> +/*
> + * Xen: Unlike Linux implementation, Xen uses a single driver instance
> + * for handling all IPMMUs. There is no framework for ipmmu_suspend/resume
> + * callbacks to be invoked for each IPMMU device. So, we need to iterate
> + * through all registered IPMMUs performing required actions.
> + *
> + * Also take care of restoring special settings, such as translation
> + * table format, etc.
> + */
> +static int __must_check ipmmu_suspend(void)
> +{
> +    struct ipmmu_vmsa_device *mmu;
> +
> +    if ( !iommu_enabled )
> +        return 0;
> +
> +    printk(XENLOG_DEBUG "ipmmu: Suspending ...\n");
> +
> +    spin_lock(&ipmmu_devices_lock);
> +
> +    list_for_each_entry( mmu, &ipmmu_devices, list )
> +    {
> +        if ( ipmmu_is_root(mmu) )
> +        {
> +            unsigned int i;
> +
> +            for ( i = 0; i < mmu->num_ctx; i++ )
> +            {
> +                if ( !mmu->domains[i] )
> +                    continue;
> +                ipmmu_domain_backup_context(mmu->domains[i]);
> +            }
> +        }
> +        else
> +            ipmmu_utlbs_backup(mmu);
> +    }
> +
> +    spin_unlock(&ipmmu_devices_lock);
> +
> +    return 0;
> +}
> +
> +static void ipmmu_resume(void)
> +{
> +    struct ipmmu_vmsa_device *mmu;
> +
> +    if ( !iommu_enabled )
> +        return;
> +
> +    printk(XENLOG_DEBUG "ipmmu: Resuming ...\n");
> +
> +    spin_lock(&ipmmu_devices_lock);
> +
> +    list_for_each_entry( mmu, &ipmmu_devices, list )
> +    {
> +        uint32_t reg;
> +
> +        /* Do not use security group function */
> +        reg = IMSCTLR + mmu->features->control_offset_base;
> +        ipmmu_write(mmu, reg, ipmmu_read(mmu, reg) & ~IMSCTLR_USE_SECGRP);
> +
> +        if ( ipmmu_is_root(mmu) )
> +        {
> +            unsigned int i;
> +
> +            /* Use stage 2 translation table format */
> +            reg = IMSAUXCTLR + mmu->features->control_offset_base;
> +            ipmmu_write(mmu, reg, ipmmu_read(mmu, reg) | IMSAUXCTLR_S2PTE);
> +
> +            for ( i = 0; i < mmu->num_ctx; i++ )
> +            {
> +                if ( !mmu->domains[i] )
> +                    continue;
> +                ipmmu_domain_restore_context(mmu->domains[i]);
> +            }
> +        }
> +        else
> +            ipmmu_utlbs_restore(mmu);
> +    }
> +
> +    spin_unlock(&ipmmu_devices_lock);
> +}
> +
> +static int ipmmu_alloc_ctx_suspend(struct device *dev)
> +{
> +    struct ipmmu_vmsa_backup *backup_data;
> +    unsigned int *utlbs_val, *asids_val;
> +    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> +
> +    utlbs_val = xzalloc_array(unsigned int, fwspec->num_ids);
> +    if ( !utlbs_val )
> +        return -ENOMEM;
> +
> +    asids_val = xzalloc_array(unsigned int, fwspec->num_ids);
> +    if ( !asids_val )
> +    {
> +        xfree(utlbs_val);
> +        return -ENOMEM;
> +    }
> +
> +    backup_data = xzalloc(struct ipmmu_vmsa_backup);
> +    if ( !backup_data )
> +    {
> +        xfree(utlbs_val);
> +        xfree(asids_val);
> +        return -ENOMEM;
> +    }
> +
> +    backup_data->dev = dev;
> +    backup_data->utlbs_val = utlbs_val;
> +    backup_data->asids_val = asids_val;
> +
> +    spin_lock(&ipmmu_devices_backup_lock);
> +    list_add(&backup_data->list, &ipmmu_devices_backup);
> +    spin_unlock(&ipmmu_devices_backup_lock);
> +
> +    return 0;
> +}
> +
> +#endif /* CONFIG_SYSTEM_SUSPEND */
> +
>  static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
>  {
>      uint64_t ttbr;
> @@ -546,6 +797,9 @@ static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
>          return ret;
>  
>      domain->context_id = ret;
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +    domain->mmu->root->reg_backup[ret] = root_pgtable[ret];
> +#endif
>  
>      /*
>       * TTBR0
> @@ -602,6 +856,9 @@ static void ipmmu_domain_destroy_context(struct ipmmu_vmsa_domain *domain)
>      ipmmu_ctx_write_root(domain, IMCTR, IMCTR_FLUSH);
>      ipmmu_tlb_sync(domain);
>  
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +    domain->mmu->root->reg_backup[domain->context_id] = NULL;
> +#endif
>      ipmmu_domain_free_context(domain->mmu->root, domain->context_id);
>  }
>  
> @@ -1307,6 +1564,14 @@ static int ipmmu_add_device(u8 devfn, struct device *dev)
>      /* Let Xen know that the master device is protected by an IOMMU. */
>      dt_device_set_protected(dev_to_dt(dev));
>  
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +    if ( ipmmu_alloc_ctx_suspend(dev) )
> +    {
> +        dev_err(dev, "Failed to allocate context for suspend\n");
> +        return -ENOMEM;
> +    }
> +#endif
> +
>      dev_info(dev, "Added master device (IPMMU %s micro-TLBs %u)\n",
>               dev_name(fwspec->iommu_dev), fwspec->num_ids);
>  
> @@ -1372,6 +1637,10 @@ static const struct iommu_ops ipmmu_iommu_ops =
>      .unmap_page      = arm_iommu_unmap_page,
>      .dt_xlate        = ipmmu_dt_xlate,
>      .add_device      = ipmmu_add_device,
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +    .suspend         = ipmmu_suspend,
> +    .resume          = ipmmu_resume,
> +#endif
>  };
>  
>  static __init int ipmmu_init(struct dt_device_node *node, const void *data)

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 12/12] xen/arm: Suspend/resume IOMMU on Xen suspend/resume
  2025-08-11 20:48 ` [PATCH v5 12/12] xen/arm: Suspend/resume IOMMU on Xen suspend/resume Mykola Kvach
@ 2025-08-23 17:54   ` Volodymyr Babchuk
  2025-08-26 13:42     ` Mykola Kvach
  0 siblings, 1 reply; 38+ messages in thread
From: Volodymyr Babchuk @ 2025-08-23 17:54 UTC (permalink / raw)
  To: Mykola Kvach
  Cc: xen-devel@lists.xenproject.org, Oleksandr Tyshchenko,
	Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Mykola Kvach, Oleksandr Andrushchenko

Hi,

Mykola Kvach <xakep.amatop@gmail.com> writes:

> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> This is done using generic iommu_suspend/resume functions that cause
> IOMMU driver specific suspend/resume handlers to be called for enabled
> IOMMU (if one has suspend/resume driver handlers implemented).
>
> These handlers work only when IPMMU-VMSA is used; otherwise,
> we return an error during system suspend requests.
>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> Tested-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
>  xen/arch/arm/suspend.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
>
> diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> index b5398e5ca6..cb86426ebd 100644
> --- a/xen/arch/arm/suspend.c
> +++ b/xen/arch/arm/suspend.c
> @@ -4,6 +4,7 @@
>  #include <asm/suspend.h>
>  #include <xen/console.h>
>  #include <xen/cpu.h>
> +#include <xen/iommu.h>
>  #include <xen/llc-coloring.h>
>  #include <xen/sched.h>
>  
> @@ -49,6 +50,13 @@ static long system_suspend(void *data)
>  
>      time_suspend();
>  
> +    status = iommu_suspend();
> +    if ( status )
> +    {
> +        system_state = SYS_STATE_resume;
> +        goto resume_time;
> +    }
> +
>      local_irq_save(flags);
>      status = gic_suspend();
>      if ( status )
> @@ -105,6 +113,10 @@ static long system_suspend(void *data)
>  
>   resume_irqs:
>      local_irq_restore(flags);
> +
> +    iommu_resume();
> +
> + resume_time:
>      time_resume();
>  
>   resume_nonboot_cpus:
> @@ -140,6 +152,15 @@ int host_system_suspend(void)
>          return -ENOSYS;
>      }
>  
> +    /* TODO: drop check once suspend/resume support for SMMU is implemented */
> +#ifndef CONFIG_IPMMU_VMSA

This check is meaningless, as you can have CONFIG_IPMMU_VMSA enabled
along with CONFIG_ARM_SMMU on a system with SMMU. I think it is enough
that iommu_suspend() will fail during system_suspend(), isn't it?


> +    if ( iommu_enabled )
> +    {
> +        dprintk(XENLOG_ERR, "IOMMU is enabled, suspend not supported yet\n");
> +        return -ENOSYS;
> +    }
> +#endif
> +
>      /*
>       * system_suspend should be called when Dom0 finalizes the suspend
>       * procedure from its boot core (VCPU#0). However, Dom0's VCPU#0 could

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 02/12] xen/arm: gic-v2: Implement GIC suspend/resume functions
  2025-08-23  0:01   ` Volodymyr Babchuk
@ 2025-08-26 13:41     ` Mykola Kvach
  0 siblings, 0 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-26 13:41 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Mirela Simonovic,
	Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Saeed Nowshadi, Mykyta Poturai, Mykola Kvach

Hi Volodymyr,

On Sat, Aug 23, 2025 at 3:01 AM Volodymyr Babchuk
<Volodymyr_Babchuk@epam.com> wrote:
>
> Hi Mykola,
>
> Mykola Kvach <xakep.amatop@gmail.com> writes:
>
> > From: Mirela Simonovic <mirela.simonovic@aggios.com>
> >
> > System suspend may lead to a state where GIC would be powered down.
> > Therefore, Xen should save/restore the context of GIC on suspend/resume.
> >
> > Note that the context consists of states of registers which are
> > controlled by the hypervisor. Other GIC registers which are accessible
> > by guests are saved/restored on context switch.
> >
> > Tested on Xilinx Ultrascale+ MPSoC with (and without) powering down
> > the GIC.
>
> I think you can drop this sentence, as I am pretty sure this patch
> wasn't tested on Ultrascale+ for last five(?) years..

Ok, I’ll drop that sentence.

>
> >
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> > Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> > Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Changes in v4:
> >   - Add error logging for allocation failures
> >
> > Changes in v3:
> >   - Drop asserts and return error codes instead.
> >   - Wrap code with CONFIG_SYSTEM_SUSPEND.
> >
> > Changes in v2:
> >   - Minor fixes after review.
> > ---
> >  xen/arch/arm/gic-v2.c          | 154 +++++++++++++++++++++++++++++++++
> >  xen/arch/arm/gic.c             |  29 +++++++
> >  xen/arch/arm/include/asm/gic.h |  12 +++
> >  3 files changed, 195 insertions(+)
> >
> > diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
> > index b23e72a3d0..dce8f5e924 100644
> > --- a/xen/arch/arm/gic-v2.c
> > +++ b/xen/arch/arm/gic-v2.c
> > @@ -1098,6 +1098,151 @@ static int gicv2_iomem_deny_access(struct domain *d)
> >      return iomem_deny_access(d, mfn, mfn + nr);
> >  }
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +
> > +/* GICv2 registers to be saved/restored on system suspend/resume */
> > +struct gicv2_context {
> > +    /* GICC context */
> > +    uint32_t gicc_ctlr;
> > +    uint32_t gicc_pmr;
> > +    uint32_t gicc_bpr;
> > +    /* GICD context */
> > +    uint32_t gicd_ctlr;
> > +    uint32_t *gicd_isenabler;
> > +    uint32_t *gicd_isactiver;
> > +    uint32_t *gicd_ipriorityr;
> > +    uint32_t *gicd_itargetsr;
> > +    uint32_t *gicd_icfgr;
> > +};
> > +
> > +static struct gicv2_context gicv2_context;
> > +
> > +static int gicv2_suspend(void)
> > +{
> > +    unsigned int i;
> > +
> > +    if ( !gicv2_context.gicd_isenabler )
> > +    {
> > +        dprintk(XENLOG_WARNING, "%s:%d: GICv2 suspend context not allocated!\n",
> > +            __func__, __LINE__);
>
> Should this be ASSERT(gicv2_context.gicd_isenabler) ? Or do we expect
> that this can happen during normal runtime?
>
> Also, you don't need to print __func__ and __LINE__ as dprintk does this
> for you already:
>
> #define dprintk(lvl, fmt, args...) \
>     printk(lvl "%s:%d: " fmt, __FILE__, __LINE__, ## args)
>
>
>
> > +        return -ENOMEM;
> > +    }
> > +
> > +    /* Save GICC configuration */
> > +    gicv2_context.gicc_ctlr = readl_gicc(GICC_CTLR);
> > +    gicv2_context.gicc_pmr = readl_gicc(GICC_PMR);
> > +    gicv2_context.gicc_bpr = readl_gicc(GICC_BPR);
> > +
> > +    /* Save GICD configuration */
> > +    gicv2_context.gicd_ctlr = readl_gicd(GICD_CTLR);
> > +
> > +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ )
> > +        gicv2_context.gicd_isenabler[i] = readl_gicd(GICD_ISENABLER + i * 4);
> > +
> > +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ )
> > +        gicv2_context.gicd_isactiver[i] = readl_gicd(GICD_ISACTIVER + i * 4);
> > +
>
> I think you can merge this two loops

Ok, I'll do it

>
> > +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
> > +        gicv2_context.gicd_ipriorityr[i] = readl_gicd(GICD_IPRIORITYR + i * 4);
> > +
> > +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
> > +        gicv2_context.gicd_itargetsr[i] = readl_gicd(GICD_ITARGETSR + i * 4);
> > +
>
> And this two as well

Ok

>
> > +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 16); i++ )
> > +        gicv2_context.gicd_icfgr[i] = readl_gicd(GICD_ICFGR + i * 4);
> > +
> > +    return 0;
> > +}
> > +
> > +static void gicv2_resume(void)
> > +{
> > +    unsigned int i;
> > +
> > +    if ( !gicv2_context.gicd_isenabler )
> > +    {
> > +        dprintk(XENLOG_WARNING, "%s:%d: GICv2 suspend context not allocated!\n",
> > +            __func__, __LINE__);
>
> the same comment as for gicv2_suspend(). Actually, we should not reach
> here in any case.

You mean we should just drop the check, right?

>
> > +        return;
> > +    }
> > +
> > +    gicv2_cpu_disable();
> > +    /* Disable distributor */
> > +    writel_gicd(0, GICD_CTLR);
> > +
> > +    /* Restore GICD configuration */
> > +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ ) {
> > +        writel_gicd(0xffffffff, GICD_ICENABLER + i * 4);
> > +        writel_gicd(gicv2_context.gicd_isenabler[i], GICD_ISENABLER + i * 4);
> > +    }
> > +
> > +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ ) {
> > +        writel_gicd(0xffffffff, GICD_ICACTIVER + i * 4);
> > +        writel_gicd(gicv2_context.gicd_isactiver[i], GICD_ISACTIVER + i * 4);
> > +    }
> > +
> > +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
> > +        writel_gicd(gicv2_context.gicd_ipriorityr[i], GICD_IPRIORITYR + i * 4);
> > +
> > +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
> > +        writel_gicd(gicv2_context.gicd_itargetsr[i], GICD_ITARGETSR + i * 4);
> > +
> > +    for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 16); i++ )
> > +        writel_gicd(gicv2_context.gicd_icfgr[i], GICD_ICFGR + i * 4);
> > +
> > +    /* Make sure all registers are restored and enable distributor */
> > +    writel_gicd(gicv2_context.gicd_ctlr | GICD_CTL_ENABLE, GICD_CTLR);
> > +
> > +    /* Restore GIC CPU interface configuration */
> > +    writel_gicc(gicv2_context.gicc_pmr, GICC_PMR);
> > +    writel_gicc(gicv2_context.gicc_bpr, GICC_BPR);
> > +
> > +    /* Enable GIC CPU interface */
> > +    writel_gicc(gicv2_context.gicc_ctlr | GICC_CTL_ENABLE | GICC_CTL_EOI,
> > +                GICC_CTLR);
> > +}
> > +
> > +static void gicv2_alloc_context(struct gicv2_context *gc)
> > +{
> > +    uint32_t n = gicv2_info.nr_lines;
> > +
> > +    gc->gicd_isenabler = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 32));
> > +    if ( !gc->gicd_isenabler )
> > +        goto err_free;
> > +
> > +    gc->gicd_isactiver = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 32));
> > +    if ( !gc->gicd_isactiver )
> > +        goto err_free;
> > +
> > +    gc->gicd_itargetsr = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 4));
> > +    if ( !gc->gicd_itargetsr )
> > +        goto err_free;
> > +
> > +    gc->gicd_ipriorityr = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 4));
> > +    if ( !gc->gicd_ipriorityr )
> > +        goto err_free;
> > +
> > +    gc->gicd_icfgr = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 16));
> > +    if ( !gc->gicd_icfgr )
> > +        goto err_free;
> > +
> > +    return;
> > +
> > + err_free:
> > +    dprintk(XENLOG_ERR,
> > +            "%s:%d: failed to allocate memory for GICv2 suspend context\n",
> > +            __func__, __LINE__);
>
> Okay, this partially answers to my previous question about memory
> allocation. So it is possible to have active system without space for
> GIC context... But this basically renders system un-suspendable. I think
> this warrants non-debug print to warn user that suspend will be not
> available.

Ok, I’ll add a regular print instead. So your comments about asserts are
irrelevant in this case, right?

>
> [...]
>
> --
> WBR, Volodymyr

Best regards,
Mykola


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 03/12] xen/arm: gic-v3: Implement GICv3 suspend/resume functions
  2025-08-23  0:20   ` Volodymyr Babchuk
@ 2025-08-26 13:41     ` Mykola Kvach
  0 siblings, 0 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-26 13:41 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Michal Orzel

Hi Volodymyr,

On Sat, Aug 23, 2025 at 3:20 AM Volodymyr Babchuk
<Volodymyr_Babchuk@epam.com> wrote:
>
>
> Hi,
>
> Mykola Kvach <xakep.amatop@gmail.com> writes:
>
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > System suspend may lead to a state where GIC would be powered down.
> > Therefore, Xen should save/restore the context of GIC on suspend/resume.
> >
> > Note that the context consists of states of registers which are
> > controlled by the hypervisor. Other GIC registers which are accessible
> > by guests are saved/restored on context switch.
> >
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> >  xen/arch/arm/gic-v3.c | 233 ++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 233 insertions(+)
> >
> > diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> > index cd3e1acf79..a9b65ff5d4 100644
> > --- a/xen/arch/arm/gic-v3.c
> > +++ b/xen/arch/arm/gic-v3.c
> > @@ -1776,6 +1776,231 @@ static bool gic_dist_supports_lpis(void)
> >      return (readl_relaxed(GICD + GICD_TYPER) & GICD_TYPE_LPIS);
> >  }
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +
> > +/* GICv3 registers to be saved/restored on system suspend/resume */
> > +struct gicv3_ctx {
> > +    struct dist_ctx {
> > +        uint32_t ctlr;
> > +        /*
> > +         * This struct represent block of 32 IRQs
> > +         * TODO: store extended SPI configuration (GICv3.1+)
> > +         */
> > +        struct irq_regs {
> > +            uint32_t icfgr[2];
> > +            uint32_t ipriorityr[8];
> > +            uint64_t irouter[32];
> > +            uint32_t isactiver;
> > +            uint32_t isenabler;
> > +        } *irqs;
> > +    } dist;
> > +
> > +    /* have only one rdist structure for last running CPU during suspend */
> > +    struct redist_ctx {
> > +        uint32_t ctlr;
> > +        /* TODO: handle case when we have more than 16 PPIs (GICv3.1+) */
> > +        uint32_t icfgr[2];
> > +        uint32_t igroupr;
> > +        uint32_t ipriorityr[8];
> > +        uint32_t isactiver;
> > +        uint32_t isenabler;
> > +    } rdist;
> > +
> > +    struct cpu_ctx {
> > +        uint32_t ctlr;
> > +        uint32_t pmr;
> > +        uint32_t bpr;
> > +        uint32_t sre_el2;
> > +        uint32_t grpen;
> > +    } cpu;
> > +};
> > +
> > +static struct gicv3_ctx gicv3_ctx;
> > +
> > +static void __init gicv3_alloc_context(void)
> > +{
> > +    uint32_t blocks = DIV_ROUND_UP(gicv3_info.nr_lines, 32);
> > +
> > +    if ( gicv3_its_host_has_its() )
> > +        return;
>
> I think this needs a comment at least. And/or printk() message. Because
> for it is unclear why we are doing nothing if host has ITS

Got it, I'll add log message

>
> > +
> > +    /* according to spec it is possible don't have SPIs */
> > +    if ( blocks == 1 )
> > +        return;
> > +
> > +    gicv3_ctx.dist.irqs = xzalloc_array(typeof(*gicv3_ctx.dist.irqs), blocks - 1);
> > +    if ( !gicv3_ctx.dist.irqs )
> > +        dprintk(XENLOG_ERR,
> > +                "%s:%d: failed to allocate memory for GICv3 suspend context\n",
> > +                __func__, __LINE__);
>
> dprintk() already prints function and line. Here and everywhere in this
> patch.

Thanks for noticing this. I’ll update the code accordingly.

>
> > +}
> > +
> > +static void gicv3_disable_redist(void)
> > +{
> > +    void __iomem* waker = GICD_RDIST_BASE + GICR_WAKER;
> > +
> > +    writel_relaxed(readl_relaxed(waker) | GICR_WAKER_ProcessorSleep, waker);
> > +    while ( (readl_relaxed(waker) & GICR_WAKER_ChildrenAsleep) == 0 );
> > +}
> > +
> > +static int gicv3_suspend(void)
> > +{
> > +    unsigned int i;
> > +    void __iomem *base;
> > +    typeof(gicv3_ctx.rdist)* rdist = &gicv3_ctx.rdist;
> > +
> > +    /* TODO: implement support for ITS */
> > +    if ( gicv3_its_host_has_its() )
> > +        return -EOPNOTSUPP;
> > +
> > +    if ( !gicv3_ctx.dist.irqs && gicv3_info.nr_lines > NR_GIC_LOCAL_IRQS )
> > +    {
> > +        dprintk(XENLOG_WARNING,
> > +                "%s:%d: GICv3 suspend context is not allocated!\n",
> > +                __func__, __LINE__);
> > +        return -ENOMEM;
> > +    }
> > +
> > +    gicv3_save_state(current);
> > +
> > +    /* Save GICC configuration */
> > +    gicv3_ctx.cpu.ctlr     = READ_SYSREG(ICC_CTLR_EL1);
> > +    gicv3_ctx.cpu.pmr      = READ_SYSREG(ICC_PMR_EL1);
> > +    gicv3_ctx.cpu.bpr      = READ_SYSREG(ICC_BPR1_EL1);
> > +    gicv3_ctx.cpu.sre_el2  = READ_SYSREG(ICC_SRE_EL2);
> > +    gicv3_ctx.cpu.grpen    = READ_SYSREG(ICC_IGRPEN1_EL1);
> > +
> > +    gicv3_disable_interface();
> > +    gicv3_disable_redist();
> > +
> > +    /* Save GICR configuration */
> > +    gicv3_redist_wait_for_rwp();
> > +
> > +    base = GICD_RDIST_SGI_BASE;
> > +
> > +    rdist->ctlr = readl_relaxed(base + GICR_CTLR);
> > +
> > +    /* Set priority on PPI and SGI interrupts */
>
> Probably you wanted to say "Save priority..."

Yes, thank you!
I forgot to change it after copy-pasting.

>
> > +    for (i = 0; i < NR_GIC_LOCAL_IRQS / 4; i += 4)
> > +        rdist->ipriorityr[i] = readl_relaxed(base + GICR_IPRIORITYR0 + 4 * i);
>
> Is this correct? You are writing to every 4th rdist->ipriorityr and
> reading every 4th GICR_IPRIORITYR<n>

Definitely not -- thank you for catching this!

>
> > +
> > +    rdist->isactiver = readl_relaxed(base + GICR_ISACTIVER0);
> > +    rdist->isenabler = readl_relaxed(base + GICR_ISENABLER0);
> > +    rdist->igroupr   = readl_relaxed(base + GICR_IGROUPR0);
> > +    rdist->icfgr[0]  = readl_relaxed(base + GICR_ICFGR0);
> > +    rdist->icfgr[1]  = readl_relaxed(base + GICR_ICFGR1);
> > +
> > +    /* Save GICD configuration */
> > +    gicv3_dist_wait_for_rwp();
> > +    gicv3_ctx.dist.ctlr = readl_relaxed(GICD + GICD_CTLR);
> > +
> > +    for ( i = 1; i < DIV_ROUND_UP(gicv3_info.nr_lines, 32); i++ )
> > +    {
> > +        typeof(gicv3_ctx.dist.irqs) irqs = gicv3_ctx.dist.irqs + i - 1;
> > +        unsigned int irq;
> > +
> > +        base = GICD + GICD_ICFGR + 8 * i;
> > +        irqs->icfgr[0] = readl_relaxed(base);
> > +        irqs->icfgr[1] = readl_relaxed(base + 4);
> > +
> > +        base = GICD + GICD_IPRIORITYR + 32 * i;
> > +        for ( irq = 0; irq < 8; irq++ )
> > +            irqs->ipriorityr[irq] = readl_relaxed(base + 4 * irq);
> > +
> > +        base = GICD + GICD_IROUTER + 32 * i;
> > +        for ( irq = 0; irq < 32; irq++ )
> > +            irqs->irouter[irq] = readq_relaxed_non_atomic(base + 8 * irq);
> > +
> > +        irqs->isactiver = readl_relaxed(GICD + GICD_ISACTIVER + 4 * i);
> > +        irqs->isenabler = readl_relaxed(GICD + GICD_ISENABLER + 4 * i);
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +static void gicv3_resume(void)
> > +{
> > +    unsigned int i;
> > +    void __iomem *base;
> > +    typeof(gicv3_ctx.rdist)* rdist = &gicv3_ctx.rdist;
> > +
> > +    if ( !gicv3_ctx.dist.irqs && gicv3_info.nr_lines > NR_GIC_LOCAL_IRQS )
> > +    {
> > +        dprintk(XENLOG_WARNING, "%s:%d: GICv3 suspend context not allocated!\n",
> > +            __func__, __LINE__);
> > +        return;
> > +    }
> > +
> > +    writel_relaxed(0, GICD + GICD_CTLR);
> > +
> > +    for ( i = NR_GIC_LOCAL_IRQS; i < gicv3_info.nr_lines; i += 32 )
> > +        writel_relaxed(GENMASK(31, 0), GICD + GICD_IGROUPR + (i / 32) * 4);
> > +
> > +    for ( i = 1; i < DIV_ROUND_UP(gicv3_info.nr_lines, 32); i++ )
> > +    {
> > +        typeof(gicv3_ctx.dist.irqs) irqs = gicv3_ctx.dist.irqs + i - 1;
> > +        unsigned int irq;
> > +
> > +        base = GICD + GICD_ICFGR + 8 * i;
> > +        writel_relaxed(irqs->icfgr[0], base);
> > +        writel_relaxed(irqs->icfgr[1], base + 4);
> > +
> > +        base = GICD + GICD_IPRIORITYR + 32 * i;
> > +        for ( irq = 0; irq < 8; irq++ )
> > +            writel_relaxed(irqs->ipriorityr[irq], base + 4 * irq );
>
> style: space before )

I'll fix it, thank you

>
> > +
> > +        base = GICD + GICD_IROUTER + 32 * i;
> > +        for ( irq = 0; irq < 32; irq++ )
> > +            writeq_relaxed_non_atomic(irqs->irouter[irq], base + 8 * irq);
> > +
> > +        writel_relaxed(irqs->isenabler, GICD + GICD_ISENABLER + i * 4);
> > +        writel_relaxed(irqs->isactiver, GICD + GICD_ISACTIVER + i * 4);
> > +    }
> > +
> > +    writel_relaxed(gicv3_ctx.dist.ctlr, GICD + GICD_CTLR);
> > +    gicv3_dist_wait_for_rwp();
> > +
> > +    /* Restore GICR (Redistributor) configuration */
> > +    gicv3_enable_redist();
> > +
> > +    base = GICD_RDIST_SGI_BASE;
> > +
> > +    writel_relaxed(0xffffffff, base + GICR_ICENABLER0);
> > +    gicv3_redist_wait_for_rwp();
> > +
> > +    for (i = 0; i < NR_GIC_LOCAL_IRQS / 4; i += 4)
> > +        writel_relaxed(rdist->ipriorityr[i], base + GICR_IPRIORITYR0 + i * 4);
>
> Is this correct? You are writing to every 4th GICR_IPRIORITYR<n>

Definitely not -- thank you for catching this!

>
> > +
> > +    writel_relaxed(rdist->isactiver, base + GICR_ISACTIVER0);
> > +
> > +    writel_relaxed(rdist->igroupr,  base + GICR_IGROUPR0);
> > +    writel_relaxed(rdist->icfgr[0], base + GICR_ICFGR0);
> > +    writel_relaxed(rdist->icfgr[1], base + GICR_ICFGR1);
> > +
> > +    gicv3_redist_wait_for_rwp();
> > +
> > +    writel_relaxed(rdist->isenabler, base + GICR_ISENABLER0);
> > +    writel_relaxed(rdist->ctlr, GICD_RDIST_BASE + GICR_CTLR);
> > +
> > +    gicv3_redist_wait_for_rwp();
> > +
> > +    WRITE_SYSREG(gicv3_ctx.cpu.sre_el2, ICC_SRE_EL2);
> > +    isb();
> > +
> > +    /* Restore CPU interface (System registers) */
> > +    WRITE_SYSREG(gicv3_ctx.cpu.pmr,   ICC_PMR_EL1);
> > +    WRITE_SYSREG(gicv3_ctx.cpu.bpr,   ICC_BPR1_EL1);
> > +    WRITE_SYSREG(gicv3_ctx.cpu.ctlr,  ICC_CTLR_EL1);
> > +    WRITE_SYSREG(gicv3_ctx.cpu.grpen, ICC_IGRPEN1_EL1);
> > +    isb();
> > +
> > +    gicv3_hyp_init();
> > +
> > +    gicv3_restore_state(current);
> > +}
> > +
> > +#endif /* CONFIG_SYSTEM_SUSPEND */
> > +
> >  /* Set up the GIC */
> >  static int __init gicv3_init(void)
> >  {
> > @@ -1850,6 +2075,10 @@ static int __init gicv3_init(void)
> >
> >      gicv3_hyp_init();
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +    gicv3_alloc_context();
> > +#endif
> > +
> >  out:
> >      spin_unlock(&gicv3.lock);
> >
> > @@ -1889,6 +2118,10 @@ static const struct gic_hw_operations gicv3_ops = {
> >  #endif
> >      .iomem_deny_access   = gicv3_iomem_deny_access,
> >      .do_LPI              = gicv3_do_LPI,
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +    .suspend             = gicv3_suspend,
> > +    .resume              = gicv3_resume,
> > +#endif
> >  };
> >
> >  static int __init gicv3_dt_preinit(struct dt_device_node *node, const void *data)
>
> --
> WBR, Volodymyr

Best regards,
Mykola


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 04/12] xen/arm: Prevent crash during disable_nonboot_cpus on suspend
  2025-08-23  0:36   ` Volodymyr Babchuk
@ 2025-08-26 13:42     ` Mykola Kvach
  0 siblings, 0 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-26 13:42 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Michal Orzel

Hi Volodymyr,

On Sat, Aug 23, 2025 at 3:36 AM Volodymyr Babchuk
<Volodymyr_Babchuk@epam.com> wrote:
>
>
> Hi Mykola,
>
> Mykola Kvach <xakep.amatop@gmail.com> writes:
>
> While I approve the change, the commit message is somewhat
> unclear. Maybe "Don't release IRQs on suspend" will be better?

Do you mean commit message title ?

>
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > If we call disable_nonboot_cpus on ARM64 with system_state set
> > to SYS_STATE_suspend, the following assertion will be triggered:
> >
> > ```
> > (XEN) [   25.582712] Disabling non-boot CPUs ...
> > (XEN) [   25.587032] Assertion '!in_irq() && (local_irq_is_enabled() || num_online_cpus() <= 1)' failed at common/xmalloc_tlsf.c:714
> > [...]
> > (XEN) [   25.975069] Xen call trace:
> > (XEN) [   25.978353]    [<00000a000022e098>] xfree+0x130/0x1a4 (PC)
> > (XEN) [   25.984314]    [<00000a000022e08c>] xfree+0x124/0x1a4 (LR)
> > (XEN) [   25.990276]    [<00000a00002747d4>] release_irq+0xe4/0xe8
> > (XEN) [   25.996152]    [<00000a0000278588>] time.c#cpu_time_callback+0x44/0x60
> > (XEN) [   26.003150]    [<00000a000021d678>] notifier_call_chain+0x7c/0xa0
> > (XEN) [   26.009717]    [<00000a00002018e0>] cpu.c#cpu_notifier_call_chain+0x24/0x48
> > (XEN) [   26.017148]    [<00000a000020192c>] cpu.c#_take_cpu_down+0x28/0x34
> > (XEN) [   26.023801]    [<00000a0000201944>] cpu.c#take_cpu_down+0xc/0x18
> > (XEN) [   26.030281]    [<00000a0000225c5c>] stop_machine.c#stopmachine_action+0xbc/0xe4
> > (XEN) [   26.038057]    [<00000a00002264bc>] tasklet.c#do_tasklet_work+0xb8/0x100
> > (XEN) [   26.045229]    [<00000a00002268a4>] do_tasklet+0x68/0xb0
> > (XEN) [   26.051018]    [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
> > (XEN) [   26.057585]    [<00000a0000277e30>] start_secondary+0x21c/0x220
> > (XEN) [   26.063978]    [<00000a0000361258>] 00000a0000361258
> > ```
> >
> > This happens because before invoking take_cpu_down via the stop_machine_run
> > function on the target CPU, stop_machine_run requests
> > the STOPMACHINE_DISABLE_IRQ state on that CPU. Releasing memory in
> > the release_irq function then triggers the assertion:
> >
> > /*
> >  * Heap allocations may need TLB flushes which may require IRQs to be
> >  * enabled (except when only 1 PCPU is online).
> >  */
> >
> > This patch adds system state checks to guard calls to request_irq
> > and release_irq. These calls are now skipped when system_state is
> > SYS_STATE_{resume,suspend}, preventing unsafe operations during
> > suspend/resume handling.
>
> If any call to release_irq() during suspend will trigger ASSERT, and it
> is fine to leave IRQs as is during suspend, maybe it will be easier to
> put
>
> +        if ( system_state == SYS_STATE_suspend )
> +            return;
>
> straight into release_irq() code? This will be easier than playing
> whack-a-mole when some other patch will add another release_irq() call
> somewhere.

I’m fine with adding this check directly into release_irq(), as long as
the other maintainers agree with this approach.

>
>
> >
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Changes in V4:
> >   - removed the prior tasklet-based workaround in favor of a more
> >     straightforward and safer solution
> >   - reworked the approach by adding explicit system state checks around
> >     request_irq and release_irq calls, skips these calls during suspend
> >     and resume states to avoid unsafe memory operations when IRQs are
> >     disabled
> > ---
> >  xen/arch/arm/gic.c           |  6 ++++++
> >  xen/arch/arm/tee/ffa_notif.c |  2 +-
> >  xen/arch/arm/time.c          | 18 ++++++++++++------
> >  3 files changed, 19 insertions(+), 7 deletions(-)
> >
> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> > index a018bd7715..9856cb1592 100644
> > --- a/xen/arch/arm/gic.c
> > +++ b/xen/arch/arm/gic.c
> > @@ -388,6 +388,9 @@ void gic_dump_info(struct vcpu *v)
> >
> >  void init_maintenance_interrupt(void)
> >  {
> > +    if ( system_state == SYS_STATE_resume )
> > +        return;
> > +
> >      request_irq(gic_hw_ops->info->maintenance_irq, 0, maintenance_interrupt,
> >                  "irq-maintenance", NULL);
> >  }
> > @@ -461,6 +464,9 @@ static int cpu_gic_callback(struct notifier_block *nfb,
> >      switch ( action )
> >      {
> >      case CPU_DYING:
> > +        if ( system_state == SYS_STATE_suspend )
> > +            break;
> > +
> >          /* This is reverting the work done in init_maintenance_interrupt */
> >          release_irq(gic_hw_ops->info->maintenance_irq, NULL);
> >          break;
> > diff --git a/xen/arch/arm/tee/ffa_notif.c b/xen/arch/arm/tee/ffa_notif.c
> > index 00efaf8f73..06f715a82b 100644
> > --- a/xen/arch/arm/tee/ffa_notif.c
> > +++ b/xen/arch/arm/tee/ffa_notif.c
> > @@ -347,7 +347,7 @@ void ffa_notif_init_interrupt(void)
> >  {
> >      int ret;
> >
> > -    if ( notif_enabled && notif_sri_irq < NR_GIC_SGI )
> > +    if ( notif_enabled && notif_sri_irq < NR_GIC_SGI && system_state != SYS_STATE_resume )
> >      {
> >          /*
> >           * An error here is unlikely since the primary CPU has already
> > diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
> > index ad984fdfdd..b2e07ade43 100644
> > --- a/xen/arch/arm/time.c
> > +++ b/xen/arch/arm/time.c
> > @@ -320,10 +320,13 @@ void init_timer_interrupt(void)
> >      WRITE_SYSREG(CNTHCTL_EL2_EL1PCTEN, CNTHCTL_EL2);
> >      disable_physical_timers();
> >
> > -    request_irq(timer_irq[TIMER_HYP_PPI], 0, htimer_interrupt,
> > -                "hyptimer", NULL);
> > -    request_irq(timer_irq[TIMER_VIRT_PPI], 0, vtimer_interrupt,
> > -                   "virtimer", NULL);
> > +    if ( system_state != SYS_STATE_resume )
> > +    {
> > +        request_irq(timer_irq[TIMER_HYP_PPI], 0, htimer_interrupt,
> > +                    "hyptimer", NULL);
> > +        request_irq(timer_irq[TIMER_VIRT_PPI], 0, vtimer_interrupt,
> > +                    "virtimer", NULL);
> > +    }
> >
> >      check_timer_irq_cfg(timer_irq[TIMER_HYP_PPI], "hypervisor");
> >      check_timer_irq_cfg(timer_irq[TIMER_VIRT_PPI], "virtual");
> > @@ -338,8 +341,11 @@ static void deinit_timer_interrupt(void)
> >  {
> >      disable_physical_timers();
> >
> > -    release_irq(timer_irq[TIMER_HYP_PPI], NULL);
> > -    release_irq(timer_irq[TIMER_VIRT_PPI], NULL);
> > +    if ( system_state != SYS_STATE_suspend )
> > +    {
> > +        release_irq(timer_irq[TIMER_HYP_PPI], NULL);
> > +        release_irq(timer_irq[TIMER_VIRT_PPI], NULL);
> > +    }
> >  }
> >
> >  /* Wait a set number of microseconds */
>
> --
> WBR, Volodymyr

Best regards,
Mykola


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 06/12] xen/arm: irq: Restore state of local IRQs during system resume
  2025-08-23  0:45   ` Volodymyr Babchuk
@ 2025-08-26 13:42     ` Mykola Kvach
  0 siblings, 0 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-26 13:42 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Mykola Kvach, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Michal Orzel

Hi Volodymyr,

On Sat, Aug 23, 2025 at 3:45 AM Volodymyr Babchuk
<Volodymyr_Babchuk@epam.com> wrote:
>
> Hi Mykola,
>
> Mykola Kvach <xakep.amatop@gmail.com> writes:
>
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > On ARM, the first 32 interrupts (SGIs and PPIs) are banked per-CPU
> > and not restored by gic_resume (for secondary cpus).
> >
> > This patch introduces restore_local_irqs_on_resume, a function that
> > restores the state of local interrupts on the target CPU during
> > system resume.
> >
> > It iterates over all local IRQs and re-enables those that were not
> > disabled, reprogramming their routing and affinity accordingly.
> >
> > The function is invoked from start_secondary, ensuring that local IRQ
> > state is restored early during CPU bring-up after suspend.
> >
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> >  xen/arch/arm/irq.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 44 insertions(+)
> >
> > diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> > index 148f184f8b..b3ff38dc27 100644
> > --- a/xen/arch/arm/irq.c
> > +++ b/xen/arch/arm/irq.c
> > @@ -113,6 +113,47 @@ static int init_local_irq_data(unsigned int cpu)
> >      return 0;
> >  }
> >
> > +/*
> > + * The first 32 interrupts (PPIs and SGIs) are per-CPU,
> > + * so call this function on the target CPU to restore them.
> > + *
> > + * SPIs are restored via gic_resume.
> > + */
> > +static void restore_local_irqs_on_resume(void)
> > +{
> > +    int irq;
> > +
> > +    if ( system_state != SYS_STATE_resume )
> > +        return;
>
> Maybe it is better to move this check to restore_local_irqs_on_resume()
> caller? You can put ASSERT(system_state == SYS_STATE_resume) there
> instead.
>
> I am saying this because name of restore_local_irqs_on_resume() implies that it
> should be called only on resume.

Not a problem.
I'll move checking outside the function.

>
> > +
> > +    spin_lock(&local_irqs_type_lock);
> > +
> > +    for ( irq = 0; irq < NR_LOCAL_IRQS; irq++ )
> > +    {
> > +        struct irq_desc *desc = irq_to_desc(irq);
> > +
> > +        spin_lock(&desc->lock);
> > +
> > +        if ( test_bit(_IRQ_DISABLED, &desc->status) )
> > +        {
> > +            spin_unlock(&desc->lock);
> > +            continue;
> > +        }
> > +
> > +        /* it is needed to avoid asserts in below calls */
> > +        set_bit(_IRQ_DISABLED, &desc->status);
>
> Assert in the call below isn't just because. You need to either call
> desc->handler->disable() to properly disable the IRQ or justify why it
> is fine to just set the bit.

Got it. I’ll call the disable handler here instead.

>
> > +
> > +        gic_route_irq_to_xen(desc, GIC_PRI_IRQ);
> > +
> > +        /* _IRQ_DISABLED is cleared by below call */
> > +        desc->handler->startup(desc);
> > +
> > +        spin_unlock(&desc->lock);
> > +    }
> > +
> > +    spin_unlock(&local_irqs_type_lock);
> > +}
> > +
> >  static int cpu_callback(struct notifier_block *nfb, unsigned long action,
> >                          void *hcpu)
> >  {
> > @@ -131,6 +172,9 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
> >              printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
> >                     cpu);
> >          break;
> > +    case CPU_STARTING:
> > +        restore_local_irqs_on_resume();
> > +        break;
> >      }
> >
> >      return notifier_from_errno(rc);
>
> --
> WBR, Volodymyr

Best regards,
Mykola


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 07/12] xen/arm: Add support for system suspend triggered by hardware domain
  2025-08-23  1:00   ` Volodymyr Babchuk
@ 2025-08-26 13:42     ` Mykola Kvach
  0 siblings, 0 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-26 13:42 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Mirela Simonovic,
	Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Andrew Cooper, Anthony PERARD, Jan Beulich, Roger Pau Monné,
	Saeed Nowshadi, Mykyta Poturai, Mykola Kvach

Hi Volodymyr,

On Sat, Aug 23, 2025 at 4:00 AM Volodymyr Babchuk
<Volodymyr_Babchuk@epam.com> wrote:
>
> Hi Mykola,
>
> Mykola Kvach <xakep.amatop@gmail.com> writes:
>
> > From: Mirela Simonovic <mirela.simonovic@aggios.com>
> >
> > Trigger Xen suspend when the hardware domain initiates suspend via
> > SHUTDOWN_suspend. Redirect system suspend to CPU#0 to ensure the
> > suspend logic runs on the boot CPU, as required.
> >
> > Introduce full suspend/resume infrastructure gated by CONFIG_SYSTEM_SUSPEND,
> > including logic to:
> >  - disable and enable non-boot physical CPUs
> >  - freeze and thaw domains
> >  - suspend and resume the GIC, timer and console
> >  - maintain system state before and after suspend
> >
> > Remove the restriction in the PSCI interface preventing suspend from the
> > hardware domain.
> >
> > Select HAS_SYSTEM_SUSPEND for ARM_64.
> >
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> > Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> > Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Changes in v5:
> > - select HAS_SYSTEM_SUSPEND in ARM_64 instead of ARM
> > - check llc_coloring_enabled instead of LLC_COLORING during the selection
> >   of HAS_SYSTEM_SUSPEND config
> > - call host_system_suspend from guest PSCI system suspend instead of
> >   arch_domain_shutdown, reducing the complexity of the new code
> > - update some comments
> >
> > Changes introduced in V4:
> >   - drop code for saving and restoring VCPU context (for more information
> >     refer part 1 of patch series)
> >   - remove IOMMU suspend and resume calls until they will be implemented
> >   - move system suspend logic to arch_domain_shutdown, invoked from
> >     domain_shutdown
> >   - apply minor fixes such as renaming functions, updating comments, and
> >     modifying the commit message to reflect these changes
> >   - add console_end_sync to resume path after system suspend
> >
> > Changes introduced in V3:
> >   - merge changes from other commits into this patch (previously stashed):
> >     1) disable/enable non-boot physical CPUs and freeze/thaw domains during
> >        suspend/resume
> >     2) suspend/resume the timer, GIC, console, IOMMU, and hardware domain
> > ---
> >  xen/arch/arm/Kconfig               |   1 +
> >  xen/arch/arm/Makefile              |   1 +
> >  xen/arch/arm/include/asm/suspend.h |  22 +++++
> >  xen/arch/arm/suspend.c             | 151 +++++++++++++++++++++++++++++
> >  xen/arch/arm/vpsci.c               |  12 ++-
> >  xen/common/domain.c                |   4 +
> >  6 files changed, 190 insertions(+), 1 deletion(-)
> >  create mode 100644 xen/arch/arm/include/asm/suspend.h
> >  create mode 100644 xen/arch/arm/suspend.c
> >
> > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > index a0c8160474..ccdbaa5bc3 100644
> > --- a/xen/arch/arm/Kconfig
> > +++ b/xen/arch/arm/Kconfig
> > @@ -8,6 +8,7 @@ config ARM_64
> >       depends on !ARM_32
> >       select 64BIT
> >       select HAS_FAST_MULTIPLY
> > +     select HAS_SYSTEM_SUSPEND if UNSUPPORTED
> >       select HAS_VPCI_GUEST_SUPPORT if PCI_PASSTHROUGH
> >
> >  config ARM
> > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> > index f833cdf207..3f6247adee 100644
> > --- a/xen/arch/arm/Makefile
> > +++ b/xen/arch/arm/Makefile
> > @@ -51,6 +51,7 @@ obj-y += setup.o
> >  obj-y += shutdown.o
> >  obj-y += smp.o
> >  obj-y += smpboot.o
> > +obj-$(CONFIG_SYSTEM_SUSPEND) += suspend.o
> >  obj-$(CONFIG_SYSCTL) += sysctl.o
> >  obj-y += time.o
> >  obj-y += traps.o
> > diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
> > new file mode 100644
> > index 0000000000..78d0e2383b
> > --- /dev/null
> > +++ b/xen/arch/arm/include/asm/suspend.h
> > @@ -0,0 +1,22 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#ifndef __ASM_ARM_SUSPEND_H__
> > +#define __ASM_ARM_SUSPEND_H__
> > +
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +
> > +int host_system_suspend(void);
> > +
> > +#endif /* CONFIG_SYSTEM_SUSPEND */
> > +
> > +#endif /* __ASM_ARM_SUSPEND_H__ */
> > +
> > + /*
> > +  * Local variables:
> > +  * mode: C
> > +  * c-file-style: "BSD"
> > +  * c-basic-offset: 4
> > +  * tab-width: 4
> > +  * indent-tabs-mode: nil
> > +  * End:
> > +  */
> > diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> > new file mode 100644
> > index 0000000000..abf4b928ce
> > --- /dev/null
> > +++ b/xen/arch/arm/suspend.c
> > @@ -0,0 +1,151 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <xen/console.h>
> > +#include <xen/cpu.h>
> > +#include <xen/llc-coloring.h>
> > +#include <xen/sched.h>
> > +
> > +/*
> > + * TODO list:
> > + *  - Test system suspend with LLC_COLORING enabled and verify functionality
> > + *  - Implement IOMMU suspend/resume handlers and integrate them
> > + *    into the suspend/resume path (IPMMU and SMMU)
> > + *  - Enable "xl suspend" support on ARM architecture
> > + *  - Properly disable Xen timer watchdog from relevant services
> > + *  - Add suspend/resume CI test for ARM (QEMU if feasible)
> > + *  - Investigate feasibility and need for implementing system suspend on ARM32
> > + */
> > +
> > +/* Xen suspend. Note: data is not used (suspend is the suspend to RAM) */
> > +static long system_suspend(void *data)
> > +{
> > +    int status;
> > +    unsigned long flags;
> > +
> > +    BUG_ON(system_state != SYS_STATE_active);
> > +
> > +    system_state = SYS_STATE_suspend;
> > +    freeze_domains();
> > +    scheduler_disable();
> > +
> > +    /*
> > +     * Non-boot CPUs have to be disabled on suspend and enabled on resume
> > +     * (hotplug-based mechanism). Disabling non-boot CPUs will lead to PSCI
> > +     * CPU_OFF to be called by each non-boot CPU. Depending on the underlying
> > +     * platform capabilities, this may lead to the physical powering down of
> > +     * CPUs. Tested on Xilinx Zynq Ultrascale+ MPSoC (including power down of
> > +     * each non-boot CPU).
>
> I don't think that the last part of the comment is relevant in upstream.

I’ll drop that part, as it’s not relevant for upstream.

>
> > +     */
> > +    status = disable_nonboot_cpus();
> > +    if ( status )
> > +    {
> > +        system_state = SYS_STATE_resume;
> > +        goto resume_nonboot_cpus;
> > +    }
> > +
> > +    time_suspend();
> > +
> > +    local_irq_save(flags);
> > +    status = gic_suspend();
> > +    if ( status )
> > +    {
> > +        system_state = SYS_STATE_resume;
> > +        goto resume_irqs;
> > +    }
> > +
> > +    printk("Xen suspending...\n");
> > +
> > +    console_start_sync();
> > +    status = console_suspend();
> > +    if ( status )
> > +    {
> > +        dprintk(XENLOG_ERR, "Failed to suspend the console, err=%d\n", status);
> > +        system_state = SYS_STATE_resume;
> > +        goto resume_console;
> > +    }
> > +
> > +    /*
> > +     * Enable identity mapping before entering suspend to simplify
> > +     * the resume path
> > +     */
> > +    update_boot_mapping(true);
> > +
>
> This puzzles me. I expected actually PSCI suspend call here.
>
> > +    system_state = SYS_STATE_resume;
> > +    update_boot_mapping(false);
> > +
> > + resume_console:
> > +    console_resume();
> > +    console_end_sync();
> > +
> > +    gic_resume();
> > +
> > + resume_irqs:
> > +    local_irq_restore(flags);
> > +    time_resume();
> > +
> > + resume_nonboot_cpus:
> > +    /*
> > +     * The rcu_barrier() has to be added to ensure that the per cpu area is
> > +     * freed before a non-boot CPU tries to initialize it (_free_percpu_area()
> > +     * has to be called before the init_percpu_area()). This scenario occurs
> > +     * when non-boot CPUs are hot-unplugged on suspend and hotplugged on resume.
> > +     */
> > +    rcu_barrier();
> > +    enable_nonboot_cpus();
> > +    scheduler_enable();
> > +    thaw_domains();
> > +
> > +    system_state = SYS_STATE_active;
> > +
> > +    /* The resume of hardware domain should always follow Xen's resume. */
> > +    domain_resume(hardware_domain);
> > +
> > +    printk("Resume (status %d)\n", status);
> > +    return status;
> > +}
> > +
> > +int host_system_suspend(void)
> > +{
> > +    int status;
> > +
> > +    /* TODO: drop check after verification that features can work together */
> > +    if ( llc_coloring_enabled )
> > +    {
> > +        dprintk(XENLOG_ERR,
> > +                "System suspend is not supported with LLC_COLORING enabled\n");
> > +        return -ENOSYS;
> > +    }
> > +
> > +    /*
> > +     * system_suspend should be called when Dom0 finalizes the suspend
> > +     * procedure from its boot core (VCPU#0). However, Dom0's VCPU#0 could
> > +     * be mapped to any PCPU (this function could be executed by any PCPU).
> > +     * The suspend procedure has to be finalized by the PCPU#0 (non-boot
> > +     * PCPUs will be disabled during the suspend).
> > +     */
> > +    status = continue_hypercall_on_cpu(0, system_suspend, NULL);
> > +
> > +    /*
> > +     * If an error happened, there is nothing that needs to be done here
> > +     * because the system_suspend always returns in fully functional state
> > +     * no matter what the outcome of suspend procedure is. If the system
> > +     * suspended successfully the function will return 0 after the resume.
> > +     * Otherwise, if an error is returned it means Xen did not suspended,
> > +     * but it is still in the same state as if the system_suspend was never
> > +     * called. We dump a debug message in case of an error for debugging/
> > +     * logging purpose.
> > +     */
> > +    if ( status )
> > +        dprintk(XENLOG_ERR, "Failed to suspend, errno=%d\n", status);
> > +
> > +    return status;
> > +}
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
> > index 67d369a8a2..757e719ea7 100644
> > --- a/xen/arch/arm/vpsci.c
> > +++ b/xen/arch/arm/vpsci.c
> > @@ -4,6 +4,7 @@
> >  #include <xen/types.h>
> >
> >  #include <asm/current.h>
> > +#include <asm/suspend.h>
> >  #include <asm/vgic.h>
> >  #include <asm/vpsci.h>
> >  #include <asm/event.h>
> > @@ -214,9 +215,10 @@ static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
> >      struct vcpu *v;
> >      struct domain *d = current->domain;
> >
> > -    /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
> > +#ifndef CONFIG_SYSTEM_SUSPEND
> >      if ( is_hardware_domain(d) )
> >          return PSCI_NOT_SUPPORTED;
> > +#endif
> >
> >      /* Ensure that all CPUs other than the calling one are offline */
> >      domain_lock(d);
> > @@ -234,6 +236,14 @@ static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
> >      if ( rc )
> >          return PSCI_DENIED;
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +    if ( is_hardware_domain(d) && host_system_suspend() )
> > +    {
> > +        domain_resume_nopause(d);
> > +        return PSCI_DENIED;
> > +    }
> > +#endif
> > +
> >      rc = do_setup_vcpu_ctx(current, epoint, cid);
> >      if ( rc != PSCI_SUCCESS )
> >      {
> > diff --git a/xen/common/domain.c b/xen/common/domain.c
> > index c3609b0cb0..414a691242 100644
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -1311,7 +1311,11 @@ int domain_shutdown(struct domain *d, u8 reason)
> >          d->shutdown_code = reason;
> >      reason = d->shutdown_code;
> >
> > +#if defined(CONFIG_SYSTEM_SUSPEND) && defined(CONFIG_ARM)
> > +    if ( reason != SHUTDOWN_suspend && is_hardware_domain(d) )
> > +#else
> >      if ( is_hardware_domain(d) )
> > +#endif
> >          hwdom_shutdown(reason);
> >
> >      if ( d->is_shutting_down )
>
> --
> WBR, Volodymyr

Best regards,
Mykola


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 08/12] xen/arm: Implement PSCI SYSTEM_SUSPEND call (host interface)
  2025-08-23  1:06   ` Volodymyr Babchuk
@ 2025-08-26 13:42     ` Mykola Kvach
  0 siblings, 0 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-26 13:42 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Mirela Simonovic,
	Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Saeed Nowshadi, Mykyta Poturai, Mykola Kvach

Hi Volodymyr,

On Sat, Aug 23, 2025 at 4:06 AM Volodymyr Babchuk
<Volodymyr_Babchuk@epam.com> wrote:
>
>
> Hi Mykola,
>
> Sequence of next 3 patches (and previous one) really puzzles me. Can you
> first implement hyp_resume() function, then add PSCI_SYSTEM_SUSPEND call
> and only then implement system_suspend() function? Why do this backwards?

This order comes from the first version of the patch series.
I'll reorder the commits in the next version.

>
> Mykola Kvach <xakep.amatop@gmail.com> writes:
>
> > From: Mirela Simonovic <mirela.simonovic@aggios.com>
> >
> > Invoke PSCI SYSTEM_SUSPEND to finalize Xen's suspend sequence on ARM64 platforms.
> > Pass the resume entry point (hyp_resume) as the first argument to EL3. The resume
> > handler is currently a stub and will be implemented later in assembly. Ignore the
> > context ID argument, as is done in Linux.
> >
> > Only enable this path when CONFIG_SYSTEM_SUSPEND is set and
> > PSCI version is >= 1.0.
> >
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> > Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> > Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Changes in v4:
> >   - select the appropriate PSCI SYSTEM_SUSPEND function ID based on platform
> >   - update comments and commit message to reflect recent changes
> >
> > Changes in v3:
> >   - return PSCI_NOT_SUPPORTED instead of a hardcoded 1 on ARM32
> >   - check PSCI version before invoking SYSTEM_SUSPEND in call_psci_system_suspend
> > ---
> >  xen/arch/arm/arm64/head.S          |  8 ++++++++
> >  xen/arch/arm/include/asm/psci.h    |  1 +
> >  xen/arch/arm/include/asm/suspend.h |  2 ++
> >  xen/arch/arm/psci.c                | 23 ++++++++++++++++++++++-
> >  xen/arch/arm/suspend.c             |  5 +++++
> >  5 files changed, 38 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> > index 72c7b24498..3522c497c5 100644
> > --- a/xen/arch/arm/arm64/head.S
> > +++ b/xen/arch/arm/arm64/head.S
> > @@ -561,6 +561,14 @@ END(efi_xen_start)
> >
> >  #endif /* CONFIG_ARM_EFI */
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +
> > +FUNC(hyp_resume)
> > +        b .
> > +END(hyp_resume)
> > +
> > +#endif /* CONFIG_SYSTEM_SUSPEND */
> > +
> >  /*
> >   * Local variables:
> >   * mode: ASM
> > diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
> > index 48a93e6b79..bb3c73496e 100644
> > --- a/xen/arch/arm/include/asm/psci.h
> > +++ b/xen/arch/arm/include/asm/psci.h
> > @@ -23,6 +23,7 @@ int call_psci_cpu_on(int cpu);
> >  void call_psci_cpu_off(void);
> >  void call_psci_system_off(void);
> >  void call_psci_system_reset(void);
> > +int call_psci_system_suspend(void);
> >
> >  /* Range of allocated PSCI function numbers */
> >  #define      PSCI_FNUM_MIN_VALUE                 _AC(0,U)
> > diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
> > index 78d0e2383b..55041a5d06 100644
> > --- a/xen/arch/arm/include/asm/suspend.h
> > +++ b/xen/arch/arm/include/asm/suspend.h
> > @@ -7,6 +7,8 @@
> >
> >  int host_system_suspend(void);
> >
> > +void hyp_resume(void);
> > +
> >  #endif /* CONFIG_SYSTEM_SUSPEND */
> >
> >  #endif /* __ASM_ARM_SUSPEND_H__ */
> > diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
> > index b6860a7760..c9d126b195 100644
> > --- a/xen/arch/arm/psci.c
> > +++ b/xen/arch/arm/psci.c
> > @@ -17,17 +17,20 @@
> >  #include <asm/cpufeature.h>
> >  #include <asm/psci.h>
> >  #include <asm/acpi.h>
> > +#include <asm/suspend.h>
> >
> >  /*
> >   * While a 64-bit OS can make calls with SMC32 calling conventions, for
> >   * some calls it is necessary to use SMC64 to pass or return 64-bit values.
> > - * For such calls PSCI_0_2_FN_NATIVE(x) will choose the appropriate
> > + * For such calls PSCI_*_FN_NATIVE(x) will choose the appropriate
> >   * (native-width) function ID.
> >   */
> >  #ifdef CONFIG_ARM_64
> >  #define PSCI_0_2_FN_NATIVE(name)    PSCI_0_2_FN64_##name
> > +#define PSCI_1_0_FN_NATIVE(name)    PSCI_1_0_FN64_##name
> >  #else
> >  #define PSCI_0_2_FN_NATIVE(name)    PSCI_0_2_FN32_##name
> > +#define PSCI_1_0_FN_NATIVE(name)    PSCI_1_0_FN32_##name
> >  #endif
> >
> >  uint32_t psci_ver;
> > @@ -60,6 +63,24 @@ void call_psci_cpu_off(void)
> >      }
> >  }
> >
> > +int call_psci_system_suspend(void)
> > +{
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +    struct arm_smccc_res res;
> > +
> > +    if ( psci_ver < PSCI_VERSION(1, 0) )
> > +        return PSCI_NOT_SUPPORTED;
> > +
> > +    /* 2nd argument (context ID) is not used */
> > +    arm_smccc_smc(PSCI_1_0_FN_NATIVE(SYSTEM_SUSPEND), __pa(hyp_resume), &res);
> > +    return PSCI_RET(res);
> > +#else
> > +    dprintk(XENLOG_WARNING,
> > +            "SYSTEM_SUSPEND not supported (CONFIG_SYSTEM_SUSPEND disabled)\n");
> > +    return PSCI_NOT_SUPPORTED;
> > +#endif
> > +}
> > +
> >  void call_psci_system_off(void)
> >  {
> >      if ( psci_ver > PSCI_VERSION(0, 1) )
> > diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> > index abf4b928ce..11e86b7f51 100644
> > --- a/xen/arch/arm/suspend.c
> > +++ b/xen/arch/arm/suspend.c
> > @@ -1,5 +1,6 @@
> >  /* SPDX-License-Identifier: GPL-2.0-only */
> >
> > +#include <asm/psci.h>
> >  #include <xen/console.h>
> >  #include <xen/cpu.h>
> >  #include <xen/llc-coloring.h>
> > @@ -70,6 +71,10 @@ static long system_suspend(void *data)
> >       */
> >      update_boot_mapping(true);
> >
> > +    status = call_psci_system_suspend();
> > +    if ( status )
> > +        dprintk(XENLOG_WARNING, "PSCI system suspend failed, err=%d\n", status);
>
> So this is where missing call to PSCI_SYSTEM_SUSPEND is...
>
> > +
> >      system_state = SYS_STATE_resume;
> >      update_boot_mapping(false);
>
> --
> WBR, Volodymyr

Best regards,
Mykola


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 10/12] xen/arm: Save/restore context on suspend/resume
  2025-08-23 17:34   ` Volodymyr Babchuk
@ 2025-08-26 13:42     ` Mykola Kvach
  0 siblings, 0 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-26 13:42 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Mirela Simonovic,
	Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Saeed Nowshadi, Mykyta Poturai, Mykola Kvach

Hi Volodymyr,

On Sat, Aug 23, 2025 at 8:34 PM Volodymyr Babchuk
<Volodymyr_Babchuk@epam.com> wrote:
>
>
> Hi Mykola,
>
> Mykola Kvach <xakep.amatop@gmail.com> writes:
>
> > From: Mirela Simonovic <mirela.simonovic@aggios.com>
> >
> > The context of CPU general purpose and system control registers
> > has to be saved on suspend and restored on resume. This is
> > implemented in hyp_suspend and before the return from hyp_resume
> > function. The hyp_suspend is invoked just before the PSCI system
> > suspend call is issued to the ATF. The hyp_suspend has to return a
> > non-zero value so that the calling 'if' statement evaluates to true,
> > causing the system suspend to be invoked. Upon the resume, context
> > saved on suspend will be restored, including the link register.
> > Therefore, after restoring the context the control flow will
> > return to the address pointed by the saved link register, which
> > is the place from which the hyp_suspend was called. To ensure
> > that the calling 'if' statement doesn't again evaluate to true
> > and initiate system suspend, hyp_resume has to return a zero value
> > after restoring the context.
> >
> > Note that the order of saving register context into cpu_context
> > structure has to match the order of restoring.
> >
> > Support for ARM32 is not implemented. Instead, compilation fails with a
> > build-time error if suspend is enabled for ARM32.
> >
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> > Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> > Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Changes in v4:
> > - produce build-time error for ARM32 when CONFIG_SYSTEM_SUSPEND is enabled
> > - use register_t instead of uint64_t in cpu_context structure
> > ---
> >  xen/arch/arm/arm64/head.S          | 91 +++++++++++++++++++++++++++++-
> >  xen/arch/arm/include/asm/suspend.h | 20 +++++++
> >  xen/arch/arm/suspend.c             | 23 +++++++-
> >  3 files changed, 130 insertions(+), 4 deletions(-)
> >
> > diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> > index 596e960152..ad8b48de3a 100644
> > --- a/xen/arch/arm/arm64/head.S
> > +++ b/xen/arch/arm/arm64/head.S
> > @@ -562,6 +562,52 @@ END(efi_xen_start)
> >  #endif /* CONFIG_ARM_EFI */
> >
> >  #ifdef CONFIG_SYSTEM_SUSPEND
> > +/*
> > + * int hyp_suspend(struct cpu_context *ptr)
> > + *
> > + * x0 - pointer to the storage where callee's context will be saved
> > + *
> > + * CPU context saved here will be restored on resume in hyp_resume function.
> > + * hyp_suspend shall return a non-zero value. Upon restoring context
> > + * hyp_resume shall return value zero instead. From C code that invokes
> > + * hyp_suspend, the return value is interpreted to determine whether the context
> > + * is saved (hyp_suspend) or restored (hyp_resume).
> > + */
> > +FUNC(hyp_suspend)
>
> I don't think that hyp_suspend is the correct name, as this function in
> fact suspend_nothing. Maybe "prepare_resume_ctx" will be better?

Not a problem to rename it.

>
> > +        /* Store callee-saved registers */
> > +        stp     x19, x20, [x0], #16
>
> If you have struct cpu_context defined, then you probably should use
> define provided by <asm-offsets.h> to access struct fields. Otherwise,
> it will be really easy to get desync between struct definition and this
> asm code.
>
> > +        stp     x21, x22, [x0], #16
> > +        stp     x23, x24, [x0], #16
> > +        stp     x25, x26, [x0], #16
> > +        stp     x27, x28, [x0], #16
> > +        stp     x29, lr, [x0], #16
> > +
> > +        /* Store stack-pointer */
> > +        mov     x2, sp
> > +        str     x2, [x0], #8
> > +
> > +        /* Store system control registers */
> > +        mrs     x2, VBAR_EL2
> > +        str     x2, [x0], #8
> > +        mrs     x2, VTCR_EL2
> > +        str     x2, [x0], #8
> > +        mrs     x2, VTTBR_EL2
> > +        str     x2, [x0], #8
> > +        mrs     x2, TPIDR_EL2
> > +        str     x2, [x0], #8
> > +        mrs     x2, MDCR_EL2
> > +        str     x2, [x0], #8
> > +        mrs     x2, HSTR_EL2
> > +        str     x2, [x0], #8
> > +        mrs     x2, CPTR_EL2
> > +        str     x2, [x0], #8
> > +        mrs     x2, HCR_EL2
> > +        str     x2, [x0], #8
> > +
> > +        /* hyp_suspend must return a non-zero value */
> > +        mov     x0, #1
> > +        ret
> > +END(hyp_suspend)
> >
> >  FUNC(hyp_resume)
> >          /* Initialize the UART if earlyprintk has been enabled. */
> > @@ -580,7 +626,50 @@ FUNC(hyp_resume)
> >          b     enable_secondary_cpu_mm
> >
> >  mmu_resumed:
> > -        b .
> > +        /*
> > +         * Now we can access the cpu_context, so restore the context here
> > +         * TODO: can we reuse __context_switch and saved_context struct here ?
> > +         */
>
> This is a great idea and I like it very much, but sadly saved_context
> struct has no fields for system _EL2 registers.
>
> > +        ldr     x0, =cpu_context
> > +
> > +        /* Restore callee-saved registers */
> > +        ldp     x19, x20, [x0], #16
> > +        ldp     x21, x22, [x0], #16
> > +        ldp     x23, x24, [x0], #16
> > +        ldp     x25, x26, [x0], #16
> > +        ldp     x27, x28, [x0], #16
> > +        ldp     x29, lr, [x0], #16
> > +
> > +        /* Restore stack pointer */
> > +        ldr     x2, [x0], #8
> > +        mov     sp, x2
> > +
> > +        /* Restore system control registers */
> > +        ldr     x2, [x0], #8
> > +        msr     VBAR_EL2, x2
> > +        ldr     x2, [x0], #8
> > +        msr     VTCR_EL2, x2
> > +        ldr     x2, [x0], #8
> > +        msr     VTTBR_EL2, x2
> > +        ldr     x2, [x0], #8
> > +        msr     TPIDR_EL2, x2
> > +        ldr     x2, [x0], #8
> > +        msr     MDCR_EL2, x2
> > +        ldr     x2, [x0], #8
> > +        msr     HSTR_EL2, x2
> > +        ldr     x2, [x0], #8
> > +        msr     CPTR_EL2, x2
> > +        ldr     x2, [x0], #8
> > +        msr     HCR_EL2, x2
> > +        isb
> > +
> > +        /* Since context is restored return from this function will appear as
> > +         * return from hyp_suspend. To distinguish a return from hyp_suspend
> > +         * which is called upon finalizing the suspend, as opposed to return
> > +         * from this function which executes on resume, we need to return zero
> > +         * value here. */
> > +        mov x0, #0
> > +        ret
> >  END(hyp_resume)
> >
> >  #endif /* CONFIG_SYSTEM_SUSPEND */
> > diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
> > index 55041a5d06..ae71ccb87b 100644
> > --- a/xen/arch/arm/include/asm/suspend.h
> > +++ b/xen/arch/arm/include/asm/suspend.h
> > @@ -5,9 +5,29 @@
> >
> >  #ifdef CONFIG_SYSTEM_SUSPEND
> >
> > +#ifdef CONFIG_ARM_64
> > +struct cpu_context {
> > +    register_t callee_regs[12];
> > +    register_t sp;
> > +    register_t vbar_el2;
> > +    register_t vtcr_el2;
> > +    register_t vttbr_el2;
> > +    register_t tpidr_el2;
> > +    register_t mdcr_el2;
> > +    register_t hstr_el2;
> > +    register_t cptr_el2;
> > +    register_t hcr_el2;
> > +} __aligned(16);
> > +#else
> > +#error "Define cpu_context structure for arm32"
> > +#endif
> > +
> > +extern struct cpu_context cpu_context;
> > +
> >  int host_system_suspend(void);
> >
> >  void hyp_resume(void);
> > +int hyp_suspend(struct cpu_context *ptr);
> >
> >  #endif /* CONFIG_SYSTEM_SUSPEND */
> >
> > diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> > index 08b6acaede..b5398e5ca6 100644
> > --- a/xen/arch/arm/suspend.c
> > +++ b/xen/arch/arm/suspend.c
> > @@ -1,6 +1,7 @@
> >  /* SPDX-License-Identifier: GPL-2.0-only */
> >
> >  #include <asm/psci.h>
> > +#include <asm/suspend.h>
> >  #include <xen/console.h>
> >  #include <xen/cpu.h>
> >  #include <xen/llc-coloring.h>
> > @@ -17,6 +18,8 @@
> >   *  - Investigate feasibility and need for implementing system suspend on ARM32
> >   */
> >
> > +struct cpu_context cpu_context;
> > +
> >  /* Xen suspend. Note: data is not used (suspend is the suspend to RAM) */
> >  static long system_suspend(void *data)
> >  {
> > @@ -73,9 +76,23 @@ static long system_suspend(void *data)
> >       */
> >      update_boot_mapping(true);
> >
> > -    status = call_psci_system_suspend();
> > -    if ( status )
> > -        dprintk(XENLOG_WARNING, "PSCI system suspend failed, err=%d\n", status);
> > +    if ( hyp_suspend(&cpu_context) )
> > +    {
> > +        status = call_psci_system_suspend();
> > +        /*
> > +         * If suspend is finalized properly by above system suspend PSCI call,
> > +         * the code below in this 'if' branch will never execute. Execution
> > +         * will continue from hyp_resume which is the hypervisor's resume point.
> > +         * In hyp_resume CPU context will be restored and since link-register is
> > +         * restored as well, it will appear to return from hyp_suspend. The
> > +         * difference in returning from hyp_suspend on system suspend versus
> > +         * resume is in function's return value: on suspend, the return value is
> > +         * a non-zero value, on resume it is zero. That is why the control flow
> > +         * will not re-enter this 'if' branch on resume.
> > +         */
>
> Looks like this comment is misplaced. It should be before "if (
> hyp_suspend() )", right?

Actually, the comment is correctly placed here and does not need to be moved.

>
> > +        if ( status )
> > +            dprintk(XENLOG_WARNING, "PSCI system suspend failed, err=%d\n", status);
> > +    }
> >
> >      system_state = SYS_STATE_resume;
> >      update_boot_mapping(false);
>
> --
> WBR, Volodymyr

Best regards,
Mykola


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 11/12] iommu/ipmmu-vmsa: Implement suspend/resume callbacks
  2025-08-23 17:48   ` Volodymyr Babchuk
@ 2025-08-26 13:42     ` Mykola Kvach
  0 siblings, 0 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-26 13:42 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Oleksandr Tyshchenko,
	Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Mykola Kvach

Hi Volodymyr,

On Sat, Aug 23, 2025 at 8:48 PM Volodymyr Babchuk
<Volodymyr_Babchuk@epam.com> wrote:
>
>
> Hi Mykola,
>
> Mykola Kvach <xakep.amatop@gmail.com> writes:
>
> > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> >
> > Store and restore active context and micro-TLB registers.
> >
> > Tested on R-Car H3 Starter Kit.
> >
> > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> >  xen/drivers/passthrough/arm/ipmmu-vmsa.c | 269 +++++++++++++++++++++++
> >  1 file changed, 269 insertions(+)
> >
> > diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> > index dac0dd6d46..ced762657a 100644
> > --- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> > +++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> > @@ -58,6 +58,8 @@
> >  #define dev_print(dev, lvl, fmt, ...)    \
> >      printk(lvl "ipmmu: %s: " fmt, dev_name(dev), ## __VA_ARGS__)
> >
> > +#define dev_dbg(dev, fmt, ...)    \
> > +    dev_print(dev, XENLOG_DEBUG, fmt, ## __VA_ARGS__)
> >  #define dev_info(dev, fmt, ...)    \
> >      dev_print(dev, XENLOG_INFO, fmt, ## __VA_ARGS__)
> >  #define dev_warn(dev, fmt, ...)    \
> > @@ -117,6 +119,23 @@ struct ipmmu_features {
> >      unsigned int imuctr_ttsel_mask;
> >  };
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +
> > +struct hw_register {
> > +    const char *reg_name;
>
> Do you really need to store register name? It is not used anywhere.

No, it’s not needed. We can remove it.
Thank you for catching that.

>
> > +    unsigned int reg_offset;
> > +    unsigned int reg_data;
> > +};
> > +
> > +struct ipmmu_vmsa_backup {
> > +    struct device *dev;
> > +    unsigned int *utlbs_val;
> > +    unsigned int *asids_val;
> > +    struct list_head list;
> > +};
> > +
> > +#endif
> > +
> >  /* Root/Cache IPMMU device's information */
> >  struct ipmmu_vmsa_device {
> >      struct device *dev;
> > @@ -129,6 +148,9 @@ struct ipmmu_vmsa_device {
> >      struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
> >      unsigned int utlb_refcount[IPMMU_UTLB_MAX];
> >      const struct ipmmu_features *features;
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +    struct hw_register *reg_backup[IPMMU_CTX_MAX];
> > +#endif
> >  };
> >
> >  /*
> > @@ -534,6 +556,235 @@ static void ipmmu_domain_free_context(struct ipmmu_vmsa_device *mmu,
> >      spin_unlock_irqrestore(&mmu->lock, flags);
> >  }
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +
> > +static DEFINE_SPINLOCK(ipmmu_devices_backup_lock);
> > +static LIST_HEAD(ipmmu_devices_backup);
> > +
> > +#define HW_REGISTER_BACKUP_SIZE 4
> > +
> > +static struct hw_register root_pgtable[IPMMU_CTX_MAX][HW_REGISTER_BACKUP_SIZE] = {
> > +    [0 ... (IPMMU_CTX_MAX - 1)] = {
> > +        {"IMTTLBR0", IMTTLBR0, 0},
> > +        {"IMTTUBR0", IMTTUBR0, 0},
> > +        {"IMTTBCR",  IMTTBCR,  0},
> > +        {"IMCTR",    IMCTR,    0},
>
> Taking into account that only 4 registers needs to be saved, will it be
> easier and more efficient to have a hardcoded struct like this?
>
> struct ipmmu_reg_ctx {
>     unsigned int imttlbr0;
>     unsigned int imttubr0;
>     unsigned int imttbcr;
>     unsigned int imctr;
> }
>
> instead of hw_register[] ?
>
> Especially taking into account that struct ipmmu_vmsa_backup{} does
> exactly this.

I see no problem with doing that.

>
>
> > +    }
> > +};
> > +
> > +static uint32_t ipmmu_imuasid_read(struct ipmmu_vmsa_device *mmu,
> > +                                   unsigned int utlb)
> > +{
> > +    return ipmmu_read(mmu, ipmmu_utlb_reg(mmu, IMUASID(utlb)));
> > +}
> > +
> > +static void ipmmu_utlbs_backup(struct ipmmu_vmsa_device *mmu)
> > +{
> > +    struct ipmmu_vmsa_backup *backup_data;
> > +
> > +    dev_dbg(mmu->dev, "Handle micro-TLBs backup\n");
> > +
> > +    spin_lock(&ipmmu_devices_backup_lock);
> > +
> > +    list_for_each_entry( backup_data, &ipmmu_devices_backup, list )
> > +    {
> > +        struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(backup_data->dev);
> > +        unsigned int i;
> > +
> > +        if ( to_ipmmu(backup_data->dev) != mmu )
> > +            continue;
> > +
> > +        for ( i = 0; i < fwspec->num_ids; i++ )
> > +        {
> > +            unsigned int utlb = fwspec->ids[i];
> > +
> > +            backup_data->asids_val[i] = ipmmu_imuasid_read(mmu, utlb);
> > +            backup_data->utlbs_val[i] = ipmmu_imuctr_read(mmu, utlb);
> > +        }
> > +    }
> > +
> > +    spin_unlock(&ipmmu_devices_backup_lock);
> > +}
> > +
> > +static void ipmmu_utlbs_restore(struct ipmmu_vmsa_device *mmu)
> > +{
> > +    struct ipmmu_vmsa_backup *backup_data;
> > +
> > +    dev_dbg(mmu->dev, "Handle micro-TLBs restore\n");
> > +
> > +    spin_lock(&ipmmu_devices_backup_lock);
> > +
> > +    list_for_each_entry( backup_data, &ipmmu_devices_backup, list )
> > +    {
> > +        struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(backup_data->dev);
> > +        unsigned int i;
> > +
> > +        if ( to_ipmmu(backup_data->dev) != mmu )
> > +            continue;
> > +
> > +        for ( i = 0; i < fwspec->num_ids; i++ )
> > +        {
> > +            unsigned int utlb = fwspec->ids[i];
> > +
> > +            ipmmu_imuasid_write(mmu, utlb, backup_data->asids_val[i]);
> > +            ipmmu_imuctr_write(mmu, utlb, backup_data->utlbs_val[i]);
> > +        }
> > +    }
> > +
> > +    spin_unlock(&ipmmu_devices_backup_lock);
> > +}
> > +
> > +static void ipmmu_domain_backup_context(struct ipmmu_vmsa_domain *domain)
> > +{
> > +    struct ipmmu_vmsa_device *mmu = domain->mmu->root;
> > +    struct hw_register *reg = mmu->reg_backup[domain->context_id];
> > +    unsigned int i;
> > +
> > +    dev_dbg(mmu->dev, "Handle domain context %u backup\n", domain->context_id);
> > +
> > +    for ( i = 0; i < HW_REGISTER_BACKUP_SIZE; i++ )
> > +        reg[i].reg_data = ipmmu_ctx_read_root(domain, reg[i].reg_offset);
> > +}
> > +
> > +static void ipmmu_domain_restore_context(struct ipmmu_vmsa_domain *domain)
> > +{
> > +    struct ipmmu_vmsa_device *mmu = domain->mmu->root;
> > +    struct hw_register *reg = mmu->reg_backup[domain->context_id];
> > +    unsigned int i;
> > +
> > +    dev_dbg(mmu->dev, "Handle domain context %u restore\n", domain->context_id);
> > +
> > +    for ( i = 0; i < HW_REGISTER_BACKUP_SIZE; i++ )
> > +    {
> > +        if ( reg[i].reg_offset != IMCTR )
> > +            ipmmu_ctx_write_root(domain, reg[i].reg_offset, reg[i].reg_data);
> > +        else
> > +            ipmmu_ctx_write_all(domain, reg[i].reg_offset,
> > +                                reg[i].reg_data | IMCTR_FLUSH);
> > +    }
> > +}
> > +
> > +/*
> > + * Xen: Unlike Linux implementation, Xen uses a single driver instance
> > + * for handling all IPMMUs. There is no framework for ipmmu_suspend/resume
> > + * callbacks to be invoked for each IPMMU device. So, we need to iterate
> > + * through all registered IPMMUs performing required actions.
> > + *
> > + * Also take care of restoring special settings, such as translation
> > + * table format, etc.
> > + */
> > +static int __must_check ipmmu_suspend(void)
> > +{
> > +    struct ipmmu_vmsa_device *mmu;
> > +
> > +    if ( !iommu_enabled )
> > +        return 0;
> > +
> > +    printk(XENLOG_DEBUG "ipmmu: Suspending ...\n");
> > +
> > +    spin_lock(&ipmmu_devices_lock);
> > +
> > +    list_for_each_entry( mmu, &ipmmu_devices, list )
> > +    {
> > +        if ( ipmmu_is_root(mmu) )
> > +        {
> > +            unsigned int i;
> > +
> > +            for ( i = 0; i < mmu->num_ctx; i++ )
> > +            {
> > +                if ( !mmu->domains[i] )
> > +                    continue;
> > +                ipmmu_domain_backup_context(mmu->domains[i]);
> > +            }
> > +        }
> > +        else
> > +            ipmmu_utlbs_backup(mmu);
> > +    }
> > +
> > +    spin_unlock(&ipmmu_devices_lock);
> > +
> > +    return 0;
> > +}
> > +
> > +static void ipmmu_resume(void)
> > +{
> > +    struct ipmmu_vmsa_device *mmu;
> > +
> > +    if ( !iommu_enabled )
> > +        return;
> > +
> > +    printk(XENLOG_DEBUG "ipmmu: Resuming ...\n");
> > +
> > +    spin_lock(&ipmmu_devices_lock);
> > +
> > +    list_for_each_entry( mmu, &ipmmu_devices, list )
> > +    {
> > +        uint32_t reg;
> > +
> > +        /* Do not use security group function */
> > +        reg = IMSCTLR + mmu->features->control_offset_base;
> > +        ipmmu_write(mmu, reg, ipmmu_read(mmu, reg) & ~IMSCTLR_USE_SECGRP);
> > +
> > +        if ( ipmmu_is_root(mmu) )
> > +        {
> > +            unsigned int i;
> > +
> > +            /* Use stage 2 translation table format */
> > +            reg = IMSAUXCTLR + mmu->features->control_offset_base;
> > +            ipmmu_write(mmu, reg, ipmmu_read(mmu, reg) | IMSAUXCTLR_S2PTE);
> > +
> > +            for ( i = 0; i < mmu->num_ctx; i++ )
> > +            {
> > +                if ( !mmu->domains[i] )
> > +                    continue;
> > +                ipmmu_domain_restore_context(mmu->domains[i]);
> > +            }
> > +        }
> > +        else
> > +            ipmmu_utlbs_restore(mmu);
> > +    }
> > +
> > +    spin_unlock(&ipmmu_devices_lock);
> > +}
> > +
> > +static int ipmmu_alloc_ctx_suspend(struct device *dev)
> > +{
> > +    struct ipmmu_vmsa_backup *backup_data;
> > +    unsigned int *utlbs_val, *asids_val;
> > +    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > +
> > +    utlbs_val = xzalloc_array(unsigned int, fwspec->num_ids);
> > +    if ( !utlbs_val )
> > +        return -ENOMEM;
> > +
> > +    asids_val = xzalloc_array(unsigned int, fwspec->num_ids);
> > +    if ( !asids_val )
> > +    {
> > +        xfree(utlbs_val);
> > +        return -ENOMEM;
> > +    }
> > +
> > +    backup_data = xzalloc(struct ipmmu_vmsa_backup);
> > +    if ( !backup_data )
> > +    {
> > +        xfree(utlbs_val);
> > +        xfree(asids_val);
> > +        return -ENOMEM;
> > +    }
> > +
> > +    backup_data->dev = dev;
> > +    backup_data->utlbs_val = utlbs_val;
> > +    backup_data->asids_val = asids_val;
> > +
> > +    spin_lock(&ipmmu_devices_backup_lock);
> > +    list_add(&backup_data->list, &ipmmu_devices_backup);
> > +    spin_unlock(&ipmmu_devices_backup_lock);
> > +
> > +    return 0;
> > +}
> > +
> > +#endif /* CONFIG_SYSTEM_SUSPEND */
> > +
> >  static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
> >  {
> >      uint64_t ttbr;
> > @@ -546,6 +797,9 @@ static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
> >          return ret;
> >
> >      domain->context_id = ret;
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +    domain->mmu->root->reg_backup[ret] = root_pgtable[ret];
> > +#endif
> >
> >      /*
> >       * TTBR0
> > @@ -602,6 +856,9 @@ static void ipmmu_domain_destroy_context(struct ipmmu_vmsa_domain *domain)
> >      ipmmu_ctx_write_root(domain, IMCTR, IMCTR_FLUSH);
> >      ipmmu_tlb_sync(domain);
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +    domain->mmu->root->reg_backup[domain->context_id] = NULL;
> > +#endif
> >      ipmmu_domain_free_context(domain->mmu->root, domain->context_id);
> >  }
> >
> > @@ -1307,6 +1564,14 @@ static int ipmmu_add_device(u8 devfn, struct device *dev)
> >      /* Let Xen know that the master device is protected by an IOMMU. */
> >      dt_device_set_protected(dev_to_dt(dev));
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +    if ( ipmmu_alloc_ctx_suspend(dev) )
> > +    {
> > +        dev_err(dev, "Failed to allocate context for suspend\n");
> > +        return -ENOMEM;
> > +    }
> > +#endif
> > +
> >      dev_info(dev, "Added master device (IPMMU %s micro-TLBs %u)\n",
> >               dev_name(fwspec->iommu_dev), fwspec->num_ids);
> >
> > @@ -1372,6 +1637,10 @@ static const struct iommu_ops ipmmu_iommu_ops =
> >      .unmap_page      = arm_iommu_unmap_page,
> >      .dt_xlate        = ipmmu_dt_xlate,
> >      .add_device      = ipmmu_add_device,
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +    .suspend         = ipmmu_suspend,
> > +    .resume          = ipmmu_resume,
> > +#endif
> >  };
> >
> >  static __init int ipmmu_init(struct dt_device_node *node, const void *data)
>
> --
> WBR, Volodymyr

Best regards,
Mykola


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 12/12] xen/arm: Suspend/resume IOMMU on Xen suspend/resume
  2025-08-23 17:54   ` Volodymyr Babchuk
@ 2025-08-26 13:42     ` Mykola Kvach
  2025-08-26 15:01       ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 38+ messages in thread
From: Mykola Kvach @ 2025-08-26 13:42 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Oleksandr Tyshchenko,
	Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Mykola Kvach, Oleksandr Andrushchenko

Hi Volodymyr,

On Sat, Aug 23, 2025 at 8:55 PM Volodymyr Babchuk
<Volodymyr_Babchuk@epam.com> wrote:
>
> Hi,
>
> Mykola Kvach <xakep.amatop@gmail.com> writes:
>
> > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> >
> > This is done using generic iommu_suspend/resume functions that cause
> > IOMMU driver specific suspend/resume handlers to be called for enabled
> > IOMMU (if one has suspend/resume driver handlers implemented).
> >
> > These handlers work only when IPMMU-VMSA is used; otherwise,
> > we return an error during system suspend requests.
> >
> > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > Tested-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > ---
> >  xen/arch/arm/suspend.c | 21 +++++++++++++++++++++
> >  1 file changed, 21 insertions(+)
> >
> > diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> > index b5398e5ca6..cb86426ebd 100644
> > --- a/xen/arch/arm/suspend.c
> > +++ b/xen/arch/arm/suspend.c
> > @@ -4,6 +4,7 @@
> >  #include <asm/suspend.h>
> >  #include <xen/console.h>
> >  #include <xen/cpu.h>
> > +#include <xen/iommu.h>
> >  #include <xen/llc-coloring.h>
> >  #include <xen/sched.h>
> >
> > @@ -49,6 +50,13 @@ static long system_suspend(void *data)
> >
> >      time_suspend();
> >
> > +    status = iommu_suspend();
> > +    if ( status )
> > +    {
> > +        system_state = SYS_STATE_resume;
> > +        goto resume_time;
> > +    }
> > +
> >      local_irq_save(flags);
> >      status = gic_suspend();
> >      if ( status )
> > @@ -105,6 +113,10 @@ static long system_suspend(void *data)
> >
> >   resume_irqs:
> >      local_irq_restore(flags);
> > +
> > +    iommu_resume();
> > +
> > + resume_time:
> >      time_resume();
> >
> >   resume_nonboot_cpus:
> > @@ -140,6 +152,15 @@ int host_system_suspend(void)
> >          return -ENOSYS;
> >      }
> >
> > +    /* TODO: drop check once suspend/resume support for SMMU is implemented */
> > +#ifndef CONFIG_IPMMU_VMSA
>
> This check is meaningless, as you can have CONFIG_IPMMU_VMSA enabled
> along with CONFIG_ARM_SMMU on a system with SMMU. I think it is enough
> that iommu_suspend() will fail during system_suspend(), isn't it?

Not exactly. In the case of SMMU, we don’t have valid iommu_suspend/iommu_resume
handlers. So it’s possible to have CONFIG_ARM_SMMU enabled and iommu_enabled
set, but without the appropriate suspend handlers. This could lead to
a crash during
system_suspend().


>
>
> > +    if ( iommu_enabled )
> > +    {
> > +        dprintk(XENLOG_ERR, "IOMMU is enabled, suspend not supported yet\n");
> > +        return -ENOSYS;
> > +    }
> > +#endif
> > +
> >      /*
> >       * system_suspend should be called when Dom0 finalizes the suspend
> >       * procedure from its boot core (VCPU#0). However, Dom0's VCPU#0 could
>
> --
> WBR, Volodymyr

Best regards,
Mykola


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 12/12] xen/arm: Suspend/resume IOMMU on Xen suspend/resume
  2025-08-26 13:42     ` Mykola Kvach
@ 2025-08-26 15:01       ` Oleksandr Tyshchenko
  2025-08-26 16:17         ` Mykola Kvach
  0 siblings, 1 reply; 38+ messages in thread
From: Oleksandr Tyshchenko @ 2025-08-26 15:01 UTC (permalink / raw)
  To: Mykola Kvach, Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Oleksandr Tyshchenko,
	Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Mykola Kvach, Oleksandr Andrushchenko



On 26.08.25 16:42, Mykola Kvach wrote:

Hello Mykola, Volodymyr


> Hi Volodymyr,
> 
> On Sat, Aug 23, 2025 at 8:55 PM Volodymyr Babchuk
> <Volodymyr_Babchuk@epam.com> wrote:
>>
>> Hi,
>>
>> Mykola Kvach <xakep.amatop@gmail.com> writes:
>>
>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>
>>> This is done using generic iommu_suspend/resume functions that cause
>>> IOMMU driver specific suspend/resume handlers to be called for enabled
>>> IOMMU (if one has suspend/resume driver handlers implemented).
>>>
>>> These handlers work only when IPMMU-VMSA is used; otherwise,
>>> we return an error during system suspend requests.
>>>
>>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
>>> Tested-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

I think, the Tested-by here should be dropped. A lot of time has passed 
since Oleksandr provided the tag, and the code has changed a bit (I am 
afraid, the tag is now stale).


>>> ---
>>>   xen/arch/arm/suspend.c | 21 +++++++++++++++++++++
>>>   1 file changed, 21 insertions(+)
>>>
>>> diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
>>> index b5398e5ca6..cb86426ebd 100644
>>> --- a/xen/arch/arm/suspend.c
>>> +++ b/xen/arch/arm/suspend.c
>>> @@ -4,6 +4,7 @@
>>>   #include <asm/suspend.h>
>>>   #include <xen/console.h>
>>>   #include <xen/cpu.h>
>>> +#include <xen/iommu.h>
>>>   #include <xen/llc-coloring.h>
>>>   #include <xen/sched.h>
>>>
>>> @@ -49,6 +50,13 @@ static long system_suspend(void *data)
>>>
>>>       time_suspend();
>>>
>>> +    status = iommu_suspend();
>>> +    if ( status )
>>> +    {
>>> +        system_state = SYS_STATE_resume;
>>> +        goto resume_time;
>>> +    }
>>> +
>>>       local_irq_save(flags);
>>>       status = gic_suspend();
>>>       if ( status )
>>> @@ -105,6 +113,10 @@ static long system_suspend(void *data)
>>>
>>>    resume_irqs:
>>>       local_irq_restore(flags);
>>> +
>>> +    iommu_resume();
>>> +
>>> + resume_time:
>>>       time_resume();
>>>
>>>    resume_nonboot_cpus:
>>> @@ -140,6 +152,15 @@ int host_system_suspend(void)
>>>           return -ENOSYS;
>>>       }
>>>
>>> +    /* TODO: drop check once suspend/resume support for SMMU is implemented */
>>> +#ifndef CONFIG_IPMMU_VMSA

The original patch did not seem to have this check...

>>
>> This check is meaningless, as you can have CONFIG_IPMMU_VMSA enabled
>> along with CONFIG_ARM_SMMU on a system with SMMU. I think it is enough
>> that iommu_suspend() will fail during system_suspend(), isn't it?
> 
> Not exactly. In the case of SMMU, we don’t have valid iommu_suspend/iommu_resume
> handlers. So it’s possible to have CONFIG_ARM_SMMU enabled and iommu_enabled
> set, but without the appropriate suspend handlers. This could lead to
> a crash during
> system_suspend().

  ... I also have doubts whether this check is needed (at least in its 
current shape). Xen has 2 SMMU drivers at the moment. Lets imagine that 
S2R for SMMUv3 is implemented, so we will need to extend #ifndef above 
to cover CONFIG_ARM_SMMU_V3 as well, right (otherwise an attempt to 
suspend SMMUv2-powered system will lead to crash)? Or there is a plan to 
implement S2R for both SMMU implementations?

***

If we care for possible crash because iommu_suspend is NULL for 
SMMUv2/SMMUv3, maybe we should consider adding stub callbacks to the 
both SMMU drivers, just returning -ENOSYS?

Let's see what other people's opinions are.

> 
> 
>>
>>
>>> +    if ( iommu_enabled )
>>> +    {
>>> +        dprintk(XENLOG_ERR, "IOMMU is enabled, suspend not supported yet\n");
>>> +        return -ENOSYS;
>>> +    }
>>> +#endif
>>> +
>>>       /*
>>>        * system_suspend should be called when Dom0 finalizes the suspend
>>>        * procedure from its boot core (VCPU#0). However, Dom0's VCPU#0 could
>>
>> --
>> WBR, Volodymyr
> 
> Best regards,
> Mykola
> 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v5 12/12] xen/arm: Suspend/resume IOMMU on Xen suspend/resume
  2025-08-26 15:01       ` Oleksandr Tyshchenko
@ 2025-08-26 16:17         ` Mykola Kvach
  0 siblings, 0 replies; 38+ messages in thread
From: Mykola Kvach @ 2025-08-26 16:17 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Volodymyr Babchuk, xen-devel@lists.xenproject.org,
	Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Mykola Kvach,
	Oleksandr Andrushchenko

Hi Oleksandr,

On Tue, Aug 26, 2025 at 6:01 PM Oleksandr Tyshchenko
<olekstysh@gmail.com> wrote:
>
>
>
> On 26.08.25 16:42, Mykola Kvach wrote:
>
> Hello Mykola, Volodymyr
>
>
> > Hi Volodymyr,
> >
> > On Sat, Aug 23, 2025 at 8:55 PM Volodymyr Babchuk
> > <Volodymyr_Babchuk@epam.com> wrote:
> >>
> >> Hi,
> >>
> >> Mykola Kvach <xakep.amatop@gmail.com> writes:
> >>
> >>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> >>>
> >>> This is done using generic iommu_suspend/resume functions that cause
> >>> IOMMU driver specific suspend/resume handlers to be called for enabled
> >>> IOMMU (if one has suspend/resume driver handlers implemented).
> >>>
> >>> These handlers work only when IPMMU-VMSA is used; otherwise,
> >>> we return an error during system suspend requests.
> >>>
> >>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> >>> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> >>> Tested-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> I think, the Tested-by here should be dropped. A lot of time has passed
> since Oleksandr provided the tag, and the code has changed a bit (I am
> afraid, the tag is now stale).

Got it, I’ll drop the Tested-by tag in the next version.

>
>
> >>> ---
> >>>   xen/arch/arm/suspend.c | 21 +++++++++++++++++++++
> >>>   1 file changed, 21 insertions(+)
> >>>
> >>> diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> >>> index b5398e5ca6..cb86426ebd 100644
> >>> --- a/xen/arch/arm/suspend.c
> >>> +++ b/xen/arch/arm/suspend.c
> >>> @@ -4,6 +4,7 @@
> >>>   #include <asm/suspend.h>
> >>>   #include <xen/console.h>
> >>>   #include <xen/cpu.h>
> >>> +#include <xen/iommu.h>
> >>>   #include <xen/llc-coloring.h>
> >>>   #include <xen/sched.h>
> >>>
> >>> @@ -49,6 +50,13 @@ static long system_suspend(void *data)
> >>>
> >>>       time_suspend();
> >>>
> >>> +    status = iommu_suspend();
> >>> +    if ( status )
> >>> +    {
> >>> +        system_state = SYS_STATE_resume;
> >>> +        goto resume_time;
> >>> +    }
> >>> +
> >>>       local_irq_save(flags);
> >>>       status = gic_suspend();
> >>>       if ( status )
> >>> @@ -105,6 +113,10 @@ static long system_suspend(void *data)
> >>>
> >>>    resume_irqs:
> >>>       local_irq_restore(flags);
> >>> +
> >>> +    iommu_resume();
> >>> +
> >>> + resume_time:
> >>>       time_resume();
> >>>
> >>>    resume_nonboot_cpus:
> >>> @@ -140,6 +152,15 @@ int host_system_suspend(void)
> >>>           return -ENOSYS;
> >>>       }
> >>>
> >>> +    /* TODO: drop check once suspend/resume support for SMMU is implemented */
> >>> +#ifndef CONFIG_IPMMU_VMSA
>
> The original patch did not seem to have this check...
>
> >>
> >> This check is meaningless, as you can have CONFIG_IPMMU_VMSA enabled
> >> along with CONFIG_ARM_SMMU on a system with SMMU. I think it is enough
> >> that iommu_suspend() will fail during system_suspend(), isn't it?
> >
> > Not exactly. In the case of SMMU, we don’t have valid iommu_suspend/iommu_resume
> > handlers. So it’s possible to have CONFIG_ARM_SMMU enabled and iommu_enabled
> > set, but without the appropriate suspend handlers. This could lead to
> > a crash during
> > system_suspend().
>
>   ... I also have doubts whether this check is needed (at least in its
> current shape). Xen has 2 SMMU drivers at the moment. Lets imagine that
> S2R for SMMUv3 is implemented, so we will need to extend #ifndef above
> to cover CONFIG_ARM_SMMU_V3 as well, right (otherwise an attempt to
> suspend SMMUv2-powered system will lead to crash)? Or there is a plan to
> implement S2R for both SMMU implementations?
>
> ***
>
> If we care for possible crash because iommu_suspend is NULL for
> SMMUv2/SMMUv3, maybe we should consider adding stub callbacks to the
> both SMMU drivers, just returning -ENOSYS?

I’m fine with that proposal, adding stub callbacks sounds like a clean
solution.

>
> Let's see what other people's opinions are.
>
> >
> >
> >>
> >>
> >>> +    if ( iommu_enabled )
> >>> +    {
> >>> +        dprintk(XENLOG_ERR, "IOMMU is enabled, suspend not supported yet\n");
> >>> +        return -ENOSYS;
> >>> +    }
> >>> +#endif
> >>> +
> >>>       /*
> >>>        * system_suspend should be called when Dom0 finalizes the suspend
> >>>        * procedure from its boot core (VCPU#0). However, Dom0's VCPU#0 could
> >>
> >> --
> >> WBR, Volodymyr
> >
> > Best regards,
> > Mykola
> >
>

Best regards,
Mykola


^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2025-08-26 16:18 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-11 20:47 [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Mykola Kvach
2025-08-11 20:47 ` [PATCH v5 01/12] xen/arm: Add suspend and resume timer helpers Mykola Kvach
2025-08-11 20:47 ` [PATCH v5 02/12] xen/arm: gic-v2: Implement GIC suspend/resume functions Mykola Kvach
2025-08-23  0:01   ` Volodymyr Babchuk
2025-08-26 13:41     ` Mykola Kvach
2025-08-11 20:47 ` [PATCH v5 03/12] xen/arm: gic-v3: Implement GICv3 " Mykola Kvach
2025-08-23  0:20   ` Volodymyr Babchuk
2025-08-26 13:41     ` Mykola Kvach
2025-08-11 20:48 ` [PATCH v5 04/12] xen/arm: Prevent crash during disable_nonboot_cpus on suspend Mykola Kvach
2025-08-23  0:36   ` Volodymyr Babchuk
2025-08-26 13:42     ` Mykola Kvach
2025-08-11 20:48 ` [PATCH v5 05/12] xen/arm: irq: avoid local IRQ descriptors reinit on system resume Mykola Kvach
2025-08-23  0:37   ` Volodymyr Babchuk
2025-08-11 20:48 ` [PATCH v5 06/12] xen/arm: irq: Restore state of local IRQs during " Mykola Kvach
2025-08-23  0:45   ` Volodymyr Babchuk
2025-08-26 13:42     ` Mykola Kvach
2025-08-11 20:48 ` [PATCH v5 07/12] xen/arm: Add support for system suspend triggered by hardware domain Mykola Kvach
2025-08-12  7:18   ` Jan Beulich
2025-08-23  0:53     ` Volodymyr Babchuk
2025-08-23  1:00   ` Volodymyr Babchuk
2025-08-26 13:42     ` Mykola Kvach
2025-08-11 20:48 ` [PATCH v5 08/12] xen/arm: Implement PSCI SYSTEM_SUSPEND call (host interface) Mykola Kvach
2025-08-23  1:06   ` Volodymyr Babchuk
2025-08-26 13:42     ` Mykola Kvach
2025-08-11 20:48 ` [PATCH v5 09/12] xen/arm: Resume memory management on Xen resume Mykola Kvach
2025-08-11 20:48 ` [PATCH v5 10/12] xen/arm: Save/restore context on suspend/resume Mykola Kvach
2025-08-23 17:34   ` Volodymyr Babchuk
2025-08-26 13:42     ` Mykola Kvach
2025-08-11 20:48 ` [PATCH v5 11/12] iommu/ipmmu-vmsa: Implement suspend/resume callbacks Mykola Kvach
2025-08-23 17:48   ` Volodymyr Babchuk
2025-08-26 13:42     ` Mykola Kvach
2025-08-11 20:48 ` [PATCH v5 12/12] xen/arm: Suspend/resume IOMMU on Xen suspend/resume Mykola Kvach
2025-08-23 17:54   ` Volodymyr Babchuk
2025-08-26 13:42     ` Mykola Kvach
2025-08-26 15:01       ` Oleksandr Tyshchenko
2025-08-26 16:17         ` Mykola Kvach
2025-08-11 21:53 ` [PATCH v5 00/12] Add initial Xen Suspend-to-RAM support on ARM64 Julien Grall
2025-08-12  4:58   ` Mykola Kvach

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.