* [PATCH 01/16] iommu: Add checks before calling iommu suspend/resume
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-05 16:45 ` Jan Beulich
2025-03-11 20:27 ` Julien Grall
2025-03-05 9:11 ` [PATCH 02/16] xen/x86: Move freeze/thaw_domains into common files Mykola Kvach
` (16 subsequent siblings)
17 siblings, 2 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel; +Cc: Mykyta Poturai, Jan Beulich, Roger Pau Monné, Mykola Kvach
From: Mykyta Poturai <mykyta_poturai@epam.com>
These functions may be unimplemented, so check that they exist before
calling to prevent crashes.
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Introduced in patch series V3.
---
xen/drivers/passthrough/iommu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 16aad86973..55b33c9719 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -613,7 +613,7 @@ int __init iommu_setup(void)
int iommu_suspend(void)
{
- if ( iommu_enabled )
+ if ( iommu_enabled && iommu_get_ops() && iommu_get_ops()->suspend )
return iommu_call(iommu_get_ops(), suspend);
return 0;
@@ -621,7 +621,7 @@ int iommu_suspend(void)
void iommu_resume(void)
{
- if ( iommu_enabled )
+ if ( iommu_enabled && iommu_get_ops() && iommu_get_ops()->resume )
iommu_vcall(iommu_get_ops(), resume);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [PATCH 01/16] iommu: Add checks before calling iommu suspend/resume
2025-03-05 9:11 ` [PATCH 01/16] iommu: Add checks before calling iommu suspend/resume Mykola Kvach
@ 2025-03-05 16:45 ` Jan Beulich
2025-03-19 12:01 ` Mykola Kvach
2025-03-11 20:27 ` Julien Grall
1 sibling, 1 reply; 69+ messages in thread
From: Jan Beulich @ 2025-03-05 16:45 UTC (permalink / raw)
To: Mykola Kvach
Cc: Mykyta Poturai, Roger Pau Monné, Mykola Kvach, xen-devel
On 05.03.2025 10:11, Mykola Kvach wrote:
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -613,7 +613,7 @@ int __init iommu_setup(void)
>
> int iommu_suspend(void)
> {
> - if ( iommu_enabled )
> + if ( iommu_enabled && iommu_get_ops() && iommu_get_ops()->suspend )
> return iommu_call(iommu_get_ops(), suspend);
>
> return 0;
> @@ -621,7 +621,7 @@ int iommu_suspend(void)
>
> void iommu_resume(void)
> {
> - if ( iommu_enabled )
> + if ( iommu_enabled && iommu_get_ops() && iommu_get_ops()->resume )
> iommu_vcall(iommu_get_ops(), resume);
> }
When iommu_enabled is true, surely iommu_get_ops() is required to return
non-NULL?
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 01/16] iommu: Add checks before calling iommu suspend/resume
2025-03-05 16:45 ` Jan Beulich
@ 2025-03-19 12:01 ` Mykola Kvach
2025-03-19 12:45 ` Jan Beulich
0 siblings, 1 reply; 69+ messages in thread
From: Mykola Kvach @ 2025-03-19 12:01 UTC (permalink / raw)
To: Jan Beulich; +Cc: Mykyta Poturai, Roger Pau Monné, Mykola Kvach, xen-devel
Hi,
On Wed, Mar 5, 2025 at 6:45 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 05.03.2025 10:11, Mykola Kvach wrote:
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -613,7 +613,7 @@ int __init iommu_setup(void)
> >
> > int iommu_suspend(void)
> > {
> > - if ( iommu_enabled )
> > + if ( iommu_enabled && iommu_get_ops() && iommu_get_ops()->suspend )
> > return iommu_call(iommu_get_ops(), suspend);
> >
> > return 0;
> > @@ -621,7 +621,7 @@ int iommu_suspend(void)
> >
> > void iommu_resume(void)
> > {
> > - if ( iommu_enabled )
> > + if ( iommu_enabled && iommu_get_ops() && iommu_get_ops()->resume )
> > iommu_vcall(iommu_get_ops(), resume);
> > }
>
> When iommu_enabled is true, surely iommu_get_ops() is required to return
> non-NULL?
As far as I can see, in some cases, the handler is still checked even if
iommu_enabled is true, such as in the case of the iommu_quiesce call.
However, it might be better to drop this patch from the current patch
series or add a patch that introduces the handlers.
>
> Jan
Best regards,
Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 01/16] iommu: Add checks before calling iommu suspend/resume
2025-03-19 12:01 ` Mykola Kvach
@ 2025-03-19 12:45 ` Jan Beulich
0 siblings, 0 replies; 69+ messages in thread
From: Jan Beulich @ 2025-03-19 12:45 UTC (permalink / raw)
To: Mykola Kvach
Cc: Mykyta Poturai, Roger Pau Monné, Mykola Kvach, xen-devel
On 19.03.2025 13:01, Mykola Kvach wrote:
> Hi,
>
> On Wed, Mar 5, 2025 at 6:45 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 05.03.2025 10:11, Mykola Kvach wrote:
>>> --- a/xen/drivers/passthrough/iommu.c
>>> +++ b/xen/drivers/passthrough/iommu.c
>>> @@ -613,7 +613,7 @@ int __init iommu_setup(void)
>>>
>>> int iommu_suspend(void)
>>> {
>>> - if ( iommu_enabled )
>>> + if ( iommu_enabled && iommu_get_ops() && iommu_get_ops()->suspend )
>>> return iommu_call(iommu_get_ops(), suspend);
>>>
>>> return 0;
>>> @@ -621,7 +621,7 @@ int iommu_suspend(void)
>>>
>>> void iommu_resume(void)
>>> {
>>> - if ( iommu_enabled )
>>> + if ( iommu_enabled && iommu_get_ops() && iommu_get_ops()->resume )
>>> iommu_vcall(iommu_get_ops(), resume);
>>> }
>>
>> When iommu_enabled is true, surely iommu_get_ops() is required to return
>> non-NULL?
>
> As far as I can see, in some cases, the handler is still checked even
> if iommu_enabled
> is true, such as in the case of the iommu_quiesce call.
You say "handler" and also refer to a case where the handler is checked.
My comment was about the bare iommu_get_ops() though.
> However, it
> might be better to drop
> this patch from the current patch series or add a patch that
> introduces the handlers.
Only if they're not merely stubs.
Jan
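
The distinction drawn in this sub-thread (the ops table pointer versus an
individual, optional handler inside it) can be illustrated with a minimal
standalone sketch. This is not Xen code: struct demo_iommu_ops, demo_ops,
demo_enabled and the demo_* functions are hypothetical stand-ins, and the
sketch assumes, as the review suggests, that the ops table is non-NULL
whenever the enabled flag is set, so only the optional hook is tested.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an IOMMU ops table; not the real Xen struct. */
struct demo_iommu_ops {
    int  (*suspend)(void);   /* optional hook: may be NULL */
    void (*resume)(void);    /* optional hook: may be NULL */
};

static int demo_resume_count;

static int  demo_drv_suspend(void) { return -16; /* pretend the driver is busy */ }
static void demo_drv_resume(void)  { demo_resume_count++; }

/* One driver that implements both hooks, one that implements neither. */
static const struct demo_iommu_ops demo_full_ops = {
    .suspend = demo_drv_suspend,
    .resume  = demo_drv_resume,
};
static const struct demo_iommu_ops demo_empty_ops = { NULL, NULL };

static bool demo_enabled;
static const struct demo_iommu_ops *demo_ops;

int demo_iommu_suspend(void)
{
    /* Assumption: demo_enabled implies demo_ops != NULL, so only the hook is checked. */
    if ( demo_enabled && demo_ops->suspend )
        return demo_ops->suspend();

    return 0;
}

void demo_iommu_resume(void)
{
    if ( demo_enabled && demo_ops->resume )
        demo_ops->resume();
}
```

With a driver that provides no hooks, both wrappers fall through silently,
which is the behaviour the patch is after; the extra test of the table
pointer itself is what the review questions.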
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 01/16] iommu: Add checks before calling iommu suspend/resume
2025-03-05 9:11 ` [PATCH 01/16] iommu: Add checks before calling iommu suspend/resume Mykola Kvach
2025-03-05 16:45 ` Jan Beulich
@ 2025-03-11 20:27 ` Julien Grall
2025-03-19 12:02 ` Mykola Kvach
1 sibling, 1 reply; 69+ messages in thread
From: Julien Grall @ 2025-03-11 20:27 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Mykyta Poturai, Jan Beulich, Roger Pau Monné, Mykola Kvach
Hi,
On 05/03/2025 09:11, Mykola Kvach wrote:
> From: Mykyta Poturai <mykyta_poturai@epam.com>
>
> These functions may be unimplemented, so check that they exist before
> calling to prevent crashes.
Looking at the cover letter, I see you wrote the following:
"Add suspend/resume handlers to IOMMU drivers (there aren’t any
problems with the current implementation because the domains used for
test are thin, and this patch series implements only the very basic
logic)"
which I read as this patch is a temporary hack until we implement IOMMU.
Is that correct? If so, can you tag it as HACK and move it to the end of the
series so that we don't end up merging it?
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 01/16] iommu: Add checks before calling iommu suspend/resume
2025-03-11 20:27 ` Julien Grall
@ 2025-03-19 12:02 ` Mykola Kvach
0 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-19 12:02 UTC (permalink / raw)
To: Julien Grall
Cc: xen-devel, Mykyta Poturai, Jan Beulich, Roger Pau Monné,
Mykola Kvach
Hi,
On Tue, Mar 11, 2025 at 10:28 PM Julien Grall <julien@xen.org> wrote:
>
> Hi,
>
> On 05/03/2025 09:11, Mykola Kvach wrote:
> > From: Mykyta Poturai <mykyta_poturai@epam.com>
> >
> > These functions may be unimplemented, so check that they exist before
> > calling to prevent crashes.
>
> Looking at the cover letter, I see you wrote the following:
>
> "Add suspend/resume handlers to IOMMU drivers (there aren’t any
> problems with the current implementation because the domains used for
> test are thin, and this patch series implements only the very basic
> logic)"
>
> which I read as this patch is a temporary hack until we implement IOMMU.
> Is that correct? If so, can you tag it as HACK and move to the end to
> end up to merge it?
Yes, you're right—if we have handlers for suspend/resume in the IOMMU driver,
we don't need this patch at all. However, if we drop the iommu_suspend/resume
calls from the system suspend, this commit becomes unnecessary, even within
this patch series.
>
> Cheers,
>
> --
> Julien Grall
>
Best regards,
Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* [PATCH 02/16] xen/x86: Move freeze/thaw_domains into common files
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
2025-03-05 9:11 ` [PATCH 01/16] iommu: Add checks before calling iommu suspend/resume Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-05 16:48 ` Jan Beulich
2025-03-05 9:11 ` [PATCH 03/16] xen/arm: introduce a separate struct for watchdog timers Mykola Kvach
` (15 subsequent siblings)
17 siblings, 1 reply; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Mirela Simonovic, Jan Beulich, Andrew Cooper,
Roger Pau Monné, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini, Saeed Nowshadi, Mykyta Poturai, Mykola Kvach
From: Mirela Simonovic <mirela.simonovic@aggios.com>
These functions will be reused by suspend/resume support for ARM.
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
xen/arch/x86/acpi/power.c | 29 -----------------------------
xen/common/domain.c | 30 ++++++++++++++++++++++++++++++
xen/include/xen/sched.h | 3 +++
3 files changed, 33 insertions(+), 29 deletions(-)
diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index d0b67614d5..f38398827e 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -137,35 +137,6 @@ static void device_power_up(enum dev_power_saved saved)
}
}
-static void freeze_domains(void)
-{
- struct domain *d;
-
- rcu_read_lock(&domlist_read_lock);
- /*
- * Note that we iterate in order of domain-id. Hence we will pause dom0
- * first which is required for correctness (as only dom0 can add domains to
- * the domain list). Otherwise we could miss concurrently-created domains.
- */
- for_each_domain ( d )
- domain_pause(d);
- rcu_read_unlock(&domlist_read_lock);
-
- scheduler_disable();
-}
-
-static void thaw_domains(void)
-{
- struct domain *d;
-
- scheduler_enable();
-
- rcu_read_lock(&domlist_read_lock);
- for_each_domain ( d )
- domain_unpause(d);
- rcu_read_unlock(&domlist_read_lock);
-}
-
static void acpi_sleep_prepare(u32 state)
{
void *wakeup_vector_va;
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 0c4cc77111..49ff84d2f5 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -2259,6 +2259,36 @@ int continue_hypercall_on_cpu(
return 0;
}
+
+void freeze_domains(void)
+{
+ struct domain *d;
+
+ rcu_read_lock(&domlist_read_lock);
+ /*
+ * Note that we iterate in order of domain-id. Hence we will pause dom0
+ * first which is required for correctness (as only dom0 can add domains to
+ * the domain list). Otherwise we could miss concurrently-created domains.
+ */
+ for_each_domain ( d )
+ domain_pause(d);
+ rcu_read_unlock(&domlist_read_lock);
+
+ scheduler_disable();
+}
+
+void thaw_domains(void)
+{
+ struct domain *d;
+
+ scheduler_enable();
+
+ rcu_read_lock(&domlist_read_lock);
+ for_each_domain ( d )
+ domain_unpause(d);
+ rcu_read_unlock(&domlist_read_lock);
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 037c83fda2..177784e6da 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -1059,6 +1059,9 @@ static inline struct vcpu *domain_vcpu(const struct domain *d,
return vcpu_id >= d->max_vcpus ? NULL : d->vcpu[idx];
}
+void freeze_domains(void);
+void thaw_domains(void);
+
void cpu_init(void);
/*
--
2.43.0
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [PATCH 02/16] xen/x86: Move freeze/thaw_domains into common files
2025-03-05 9:11 ` [PATCH 02/16] xen/x86: Move freeze/thaw_domains into common files Mykola Kvach
@ 2025-03-05 16:48 ` Jan Beulich
2025-03-19 12:03 ` Mykola Kvach
0 siblings, 1 reply; 69+ messages in thread
From: Jan Beulich @ 2025-03-05 16:48 UTC (permalink / raw)
To: Mykola Kvach
Cc: Mirela Simonovic, Andrew Cooper, Roger Pau Monné,
Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini,
Saeed Nowshadi, Mykyta Poturai, Mykola Kvach, xen-devel
On 05.03.2025 10:11, Mykola Kvach wrote:
> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>
> These functions will be reused by suspend/resume support for ARM.
And until then they are going to violate the Misra rule requiring there
to not be unreachable code.
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -2259,6 +2259,36 @@ int continue_hypercall_on_cpu(
> return 0;
> }
>
> +
> +void freeze_domains(void)
Nit: No double blank lines please.
> +{
> + struct domain *d;
> +
> + rcu_read_lock(&domlist_read_lock);
> + /*
> + * Note that we iterate in order of domain-id. Hence we will pause dom0
> + * first which is required for correctness (as only dom0 can add domains to
> + * the domain list). Otherwise we could miss concurrently-created domains.
> + */
> + for_each_domain ( d )
> + domain_pause(d);
> + rcu_read_unlock(&domlist_read_lock);
> +
> + scheduler_disable();
When made generally available I'm unsure having this and ...
> +}
> +
> +void thaw_domains(void)
> +{
> + struct domain *d;
> +
> + scheduler_enable();
... this here is a good idea. Both scheduler operations aren't related
to what the function names say is being done here.
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 02/16] xen/x86: Move freeze/thaw_domains into common files
2025-03-05 16:48 ` Jan Beulich
@ 2025-03-19 12:03 ` Mykola Kvach
2025-03-19 12:47 ` Jan Beulich
0 siblings, 1 reply; 69+ messages in thread
From: Mykola Kvach @ 2025-03-19 12:03 UTC (permalink / raw)
To: Jan Beulich
Cc: Mirela Simonovic, Andrew Cooper, Roger Pau Monné,
Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini,
Saeed Nowshadi, Mykyta Poturai, Mykola Kvach, xen-devel
On Wed, Mar 5, 2025 at 6:48 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 05.03.2025 10:11, Mykola Kvach wrote:
> > From: Mirela Simonovic <mirela.simonovic@aggios.com>
> >
> > These functions will be reused by suspend/resume support for ARM.
>
> And until then they are going to violate the Misra rule requiring there
> to not be unreachable code.
>
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -2259,6 +2259,36 @@ int continue_hypercall_on_cpu(
> > return 0;
> > }
> >
> > +
> > +void freeze_domains(void)
>
> Nit: No double blank lines please.
Thanks for pointing that out! I'll fix it in the next version of the
patch series.
>
> > +{
> > + struct domain *d;
> > +
> > + rcu_read_lock(&domlist_read_lock);
> > + /*
> > + * Note that we iterate in order of domain-id. Hence we will pause dom0
> > + * first which is required for correctness (as only dom0 can add domains to
> > + * the domain list). Otherwise we could miss concurrently-created domains.
> > + */
> > + for_each_domain ( d )
> > + domain_pause(d);
> > + rcu_read_unlock(&domlist_read_lock);
> > +
> > + scheduler_disable();
>
> When made generally available I'm unsure having this and ...
>
> > +}
> > +
> > +void thaw_domains(void)
> > +{
> > + struct domain *d;
> > +
> > + scheduler_enable();
>
> ... this here is a good idea. Both scheduler operations aren't related
> to what the function names say is being done here.
I have just moved these functions from x86-specific headers to a common one,
but they are still used only for suspend/resume purposes.
It's not a problem for me to adjust the names slightly in the next version
of the patch series.
>
> Jan
Best regards,
~Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 02/16] xen/x86: Move freeze/thaw_domains into common files
2025-03-19 12:03 ` Mykola Kvach
@ 2025-03-19 12:47 ` Jan Beulich
2025-03-20 9:02 ` Mykola Kvach
0 siblings, 1 reply; 69+ messages in thread
From: Jan Beulich @ 2025-03-19 12:47 UTC (permalink / raw)
To: Mykola Kvach
Cc: Mirela Simonovic, Andrew Cooper, Roger Pau Monné,
Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini,
Saeed Nowshadi, Mykyta Poturai, Mykola Kvach, xen-devel
On 19.03.2025 13:03, Mykola Kvach wrote:
> On Wed, Mar 5, 2025 at 6:48 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 05.03.2025 10:11, Mykola Kvach wrote:
>>> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>>>
>>> These functions will be reused by suspend/resume support for ARM.
>>
>> And until then they are going to violate the Misra rule requiring there
>> to not be unreachable code.
>>
>>> --- a/xen/common/domain.c
>>> +++ b/xen/common/domain.c
>>> @@ -2259,6 +2259,36 @@ int continue_hypercall_on_cpu(
>>> return 0;
>>> }
>>>
>>> +
>>> +void freeze_domains(void)
>>
>> Nit: No double blank lines please.
>
> Thanks for pointing that out! I'll fix it in the next version of the
> patch series.
>
>>
>>> +{
>>> + struct domain *d;
>>> +
>>> + rcu_read_lock(&domlist_read_lock);
>>> + /*
>>> + * Note that we iterate in order of domain-id. Hence we will pause dom0
>>> + * first which is required for correctness (as only dom0 can add domains to
>>> + * the domain list). Otherwise we could miss concurrently-created domains.
>>> + */
>>> + for_each_domain ( d )
>>> + domain_pause(d);
>>> + rcu_read_unlock(&domlist_read_lock);
>>> +
>>> + scheduler_disable();
>>
>> When made generally available I'm unsure having this and ...
>>
>>> +}
>>> +
>>> +void thaw_domains(void)
>>> +{
>>> + struct domain *d;
>>> +
>>> + scheduler_enable();
>>
>> ... this here is a good idea. Both scheduler operations aren't related
>> to what the function names say is being done here.
>
> I have just moved these functions from x86-specific headers to a common one,
> but they are still used only for suspend/resume purposes.
> It's not a problem for me to adjust the names slightly in the next
> version of the
> patch series.
I wasn't after a rename really; my suggestion was to leave the scheduler
calls at the original call sites, and remove them from here.
Jan
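
The suggestion above (keep the common helpers limited to pausing and
unpausing, and leave the scheduler calls with the caller) can be sketched
as follows. This is a standalone illustration, not Xen code: the demo_*
names are hypothetical, and a plain array of pause counters stands in for
the RCU-protected domain list.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_DOMAINS 3

/* Hypothetical stand-ins; the real code walks Xen's RCU-protected domain list. */
static int  demo_pause_count[DEMO_NR_DOMAINS];
static bool demo_scheduler_on = true;

static void demo_scheduler_disable(void) { demo_scheduler_on = false; }
static void demo_scheduler_enable(void)  { demo_scheduler_on = true; }

/* Common helpers: they only pause/unpause, in ascending domain-id order. */
void demo_freeze_domains(void)
{
    for ( size_t i = 0; i < DEMO_NR_DOMAINS; i++ )
        demo_pause_count[i]++;
}

void demo_thaw_domains(void)
{
    for ( size_t i = 0; i < DEMO_NR_DOMAINS; i++ )
        demo_pause_count[i]--;
}

/* Arch-specific caller: the scheduler calls stay at the call site. */
void demo_arch_enter_suspend(void)
{
    demo_freeze_domains();
    demo_scheduler_disable();
}

void demo_arch_leave_suspend(void)
{
    demo_scheduler_enable();
    demo_thaw_domains();
}
```

With this split the helper names describe exactly what they do, and each
architecture decides for itself whether and when to stop the scheduler.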
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 02/16] xen/x86: Move freeze/thaw_domains into common files
2025-03-19 12:47 ` Jan Beulich
@ 2025-03-20 9:02 ` Mykola Kvach
0 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-20 9:02 UTC (permalink / raw)
To: Jan Beulich
Cc: Mirela Simonovic, Andrew Cooper, Roger Pau Monné,
Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini,
Saeed Nowshadi, Mykyta Poturai, Mykola Kvach, xen-devel
On Wed, Mar 19, 2025 at 2:47 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 19.03.2025 13:03, Mykola Kvach wrote:
> > On Wed, Mar 5, 2025 at 6:48 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 05.03.2025 10:11, Mykola Kvach wrote:
> >>> From: Mirela Simonovic <mirela.simonovic@aggios.com>
> >>>
> >>> These functions will be reused by suspend/resume support for ARM.
> >>
> >> And until then they are going to violate the Misra rule requiring there
> >> to not be unreachable code.
> >>
> >>> --- a/xen/common/domain.c
> >>> +++ b/xen/common/domain.c
> >>> @@ -2259,6 +2259,36 @@ int continue_hypercall_on_cpu(
> >>> return 0;
> >>> }
> >>>
> >>> +
> >>> +void freeze_domains(void)
> >>
> >> Nit: No double blank lines please.
> >
> > Thanks for pointing that out! I'll fix it in the next version of the
> > patch series.
> >
> >>
> >>> +{
> >>> + struct domain *d;
> >>> +
> >>> + rcu_read_lock(&domlist_read_lock);
> >>> + /*
> >>> + * Note that we iterate in order of domain-id. Hence we will pause dom0
> >>> + * first which is required for correctness (as only dom0 can add domains to
> >>> + * the domain list). Otherwise we could miss concurrently-created domains.
> >>> + */
> >>> + for_each_domain ( d )
> >>> + domain_pause(d);
> >>> + rcu_read_unlock(&domlist_read_lock);
> >>> +
> >>> + scheduler_disable();
> >>
> >> When made generally available I'm unsure having this and ...
> >>
> >>> +}
> >>> +
> >>> +void thaw_domains(void)
> >>> +{
> >>> + struct domain *d;
> >>> +
> >>> + scheduler_enable();
> >>
> >> ... this here is a good idea. Both scheduler operations aren't related
> >> to what the function names say is being done here.
> >
> > I have just moved these functions from x86-specific headers to a common one,
> > but they are still used only for suspend/resume purposes.
> > It's not a problem for me to adjust the names slightly in the next
> > version of the
> > patch series.
>
> I wasn't after a rename really; my suggestion was to leave the scheduler
> calls at the original call sites, and remove them from here.
Got it, thank you.
>
> Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* [PATCH 03/16] xen/arm: introduce a separate struct for watchdog timers
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
2025-03-05 9:11 ` [PATCH 01/16] iommu: Add checks before calling iommu suspend/resume Mykola Kvach
2025-03-05 9:11 ` [PATCH 02/16] xen/x86: Move freeze/thaw_domains into common files Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-13 15:27 ` Jan Beulich
2025-03-19 16:13 ` Grygorii Strashko
2025-03-05 9:11 ` [PATCH 04/16] xen/cpu: prevent disable_nonboot_cpus crash on ARM64 Mykola Kvach
` (14 subsequent siblings)
17 siblings, 2 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Mirela Simonovic, Andrew Cooper, Anthony PERARD, Michal Orzel,
Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Dario Faggioli, Juergen Gross, George Dunlap,
Mykyta Poturai, Mykola Kvach
From: Mirela Simonovic <mirela.simonovic@aggios.com>
Introduce a separate struct for watchdog timers. It is needed to properly
implement the suspend/resume actions for the watchdog timers. To be able
to restart the watchdog timers after suspend we need to remember their
timeout somewhere. To not bloat struct timer, a new struct
watchdog_timer is introduced, containing the original timer and the last
set timeout.
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
This commit was introduced in patch series V2.
---
xen/common/keyhandler.c | 2 +-
xen/common/sched/core.c | 11 ++++++-----
xen/include/xen/sched.h | 3 ++-
xen/include/xen/watchdog.h | 6 ++++++
4 files changed, 15 insertions(+), 7 deletions(-)
diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
index 0bb842ec00..caf614c0c2 100644
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -305,7 +305,7 @@ static void cf_check dump_domains(unsigned char key)
for ( i = 0 ; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
if ( test_bit(i, &d->watchdog_inuse_map) )
printk(" watchdog %d expires in %d seconds\n",
- i, (u32)((d->watchdog_timer[i].expires - NOW()) >> 30));
+ i, (u32)((d->watchdog_timer[i].timer.expires - NOW()) >> 30));
arch_dump_domain_info(d);
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index d6296d99fd..b1c6b6b9fa 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -1556,7 +1556,8 @@ static long domain_watchdog(struct domain *d, uint32_t id, uint32_t timeout)
{
if ( test_and_set_bit(id, &d->watchdog_inuse_map) )
continue;
- set_timer(&d->watchdog_timer[id], NOW() + SECONDS(timeout));
+ d->watchdog_timer[id].timeout = timeout;
+ set_timer(&d->watchdog_timer[id].timer, NOW() + SECONDS(timeout));
break;
}
spin_unlock(&d->watchdog_lock);
@@ -1572,12 +1573,12 @@ static long domain_watchdog(struct domain *d, uint32_t id, uint32_t timeout)
if ( timeout == 0 )
{
- stop_timer(&d->watchdog_timer[id]);
+ stop_timer(&d->watchdog_timer[id].timer);
clear_bit(id, &d->watchdog_inuse_map);
}
else
{
- set_timer(&d->watchdog_timer[id], NOW() + SECONDS(timeout));
+ set_timer(&d->watchdog_timer[id].timer, NOW() + SECONDS(timeout));
}
spin_unlock(&d->watchdog_lock);
@@ -1593,7 +1594,7 @@ void watchdog_domain_init(struct domain *d)
d->watchdog_inuse_map = 0;
for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
- init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, d, 0);
+ init_timer(&d->watchdog_timer[i].timer, domain_watchdog_timeout, d, 0);
}
void watchdog_domain_destroy(struct domain *d)
@@ -1601,7 +1602,7 @@ void watchdog_domain_destroy(struct domain *d)
unsigned int i;
for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
- kill_timer(&d->watchdog_timer[i]);
+ kill_timer(&d->watchdog_timer[i].timer);
}
/*
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 177784e6da..d0d10612ce 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -24,6 +24,7 @@
#include <asm/current.h>
#include <xen/vpci.h>
#include <xen/wait.h>
+#include <xen/watchdog.h>
#include <public/xen.h>
#include <public/domctl.h>
#include <public/sysctl.h>
@@ -569,7 +570,7 @@ struct domain
#define NR_DOMAIN_WATCHDOG_TIMERS 2
spinlock_t watchdog_lock;
uint32_t watchdog_inuse_map;
- struct timer watchdog_timer[NR_DOMAIN_WATCHDOG_TIMERS];
+ struct watchdog_timer watchdog_timer[NR_DOMAIN_WATCHDOG_TIMERS];
struct rcu_head rcu;
diff --git a/xen/include/xen/watchdog.h b/xen/include/xen/watchdog.h
index 4c2840bd91..2b7169632d 100644
--- a/xen/include/xen/watchdog.h
+++ b/xen/include/xen/watchdog.h
@@ -8,6 +8,12 @@
#define __XEN_WATCHDOG_H__
#include <xen/types.h>
+#include <xen/timer.h>
+
+struct watchdog_timer {
+ struct timer timer;
+ uint32_t timeout;
+};
#ifdef CONFIG_WATCHDOG
--
2.43.0
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [PATCH 03/16] xen/arm: introduce a separate struct for watchdog timers
2025-03-05 9:11 ` [PATCH 03/16] xen/arm: introduce a separate struct for watchdog timers Mykola Kvach
@ 2025-03-13 15:27 ` Jan Beulich
2025-03-20 10:25 ` Mykola Kvach
2025-03-19 16:13 ` Grygorii Strashko
1 sibling, 1 reply; 69+ messages in thread
From: Jan Beulich @ 2025-03-13 15:27 UTC (permalink / raw)
To: Mykola Kvach
Cc: Mirela Simonovic, Andrew Cooper, Anthony PERARD, Michal Orzel,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Dario Faggioli, Juergen Gross, George Dunlap, Mykyta Poturai,
Mykola Kvach, xen-devel
On 05.03.2025 10:11, Mykola Kvach wrote:
> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>
> Introduce a separate struct for watchdog timers. It is needed to properly
> implement the suspend/resume actions for the watchdog timers. To be able
> to restart watchdog timer after suspend we need to remember their
> frequency somewhere. To not bloat the struct timer a new struct
> watchdog_timer is introduced, containing the original timer and the last
> set timeout.
>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
A From: with no corresponding S-o-b: is potentially problematic. You also
can't simply add one without her agreement, though.
> ---
> This commit was introduced in patch series V2.
Yet, btw, the whole series isn't tagged with a version.
> --- a/xen/common/keyhandler.c
> +++ b/xen/common/keyhandler.c
> @@ -305,7 +305,7 @@ static void cf_check dump_domains(unsigned char key)
> for ( i = 0 ; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> if ( test_bit(i, &d->watchdog_inuse_map) )
> printk(" watchdog %d expires in %d seconds\n",
> - i, (u32)((d->watchdog_timer[i].expires - NOW()) >> 30));
> + i, (u32)((d->watchdog_timer[i].timer.expires - NOW()) >> 30));
I realize you mean to just do a mechanical replacement here, yet the use of
u32 is not only against our style (should be uint32_t then), but it's also
not clear to me that this subtraction can't ever yield a negative result.
Hence the use of %d looks more correct to me than the cast to an unsigned
type.
In any event the already long line now grows too long and hence needs
wrapping.
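
The signedness concern can be made concrete with a small standalone sketch.
It assumes that times are signed 64-bit nanosecond counts (as s_time_t is
understood to be in Xen); the demo_* helpers are hypothetical. Note that
right-shifting a negative value is implementation-defined in C; GCC and
Clang perform an arithmetic shift, which is what the sketch relies on.

```c
#include <stdint.h>

/* Hypothetical helpers; times are signed nanosecond counts. */
static inline int32_t demo_secs_left_signed(int64_t expires_ns, int64_t now_ns)
{
    /* >> 30 divides by 2^30 (about 1.07e9), a cheap approximation of ns -> s. */
    return (int32_t)((expires_ns - now_ns) >> 30);
}

static inline uint32_t demo_secs_left_unsigned(int64_t expires_ns, int64_t now_ns)
{
    /* Same arithmetic, but the cast turns an overdue timer into a huge value. */
    return (uint32_t)((expires_ns - now_ns) >> 30);
}
```

For a timer that expired five shift-units ago, the signed variant yields -5
while the unsigned one yields 4294967291, so keeping the value signed
preserves the "already overdue" information.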
> @@ -569,7 +570,7 @@ struct domain
> #define NR_DOMAIN_WATCHDOG_TIMERS 2
> spinlock_t watchdog_lock;
> uint32_t watchdog_inuse_map;
> - struct timer watchdog_timer[NR_DOMAIN_WATCHDOG_TIMERS];
> + struct watchdog_timer watchdog_timer[NR_DOMAIN_WATCHDOG_TIMERS];
An alternative would be to have a separate array for the timeout values.
This would also save some space, seeing that on 64-bit arches you
introduce 32 bits of tail padding in the struct.
If we go the struct watchdog_timer route, may I at least suggest to rename
the field to just "watchdog", so things like &d->watchdog_timer[i].timer
don't say "timer" twice?
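
The space argument can be checked with a standalone sketch comparing the
two layouts. struct demo_timer is a hypothetical stand-in for struct timer
(only its pointer-sized alignment matters here), and the field is already
named "watchdog" as suggested; the exact sizes depend on the ABI.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_WATCHDOGS 2

/* Hypothetical stand-in for struct timer; not the real Xen definition. */
struct demo_timer {
    int64_t expires;
    void  (*function)(void *);
    void   *data;
};

/* Layout A: wrapper struct, as in the patch. */
struct demo_watchdog {
    struct demo_timer timer;
    uint32_t timeout;   /* followed by tail padding on 64-bit ABIs */
};

struct demo_layout_wrapped {
    struct demo_watchdog watchdog[DEMO_NR_WATCHDOGS];
};

/* Layout B: the timeouts live in a separate array next to the timers. */
struct demo_layout_split {
    struct demo_timer timer[DEMO_NR_WATCHDOGS];
    uint32_t timeout[DEMO_NR_WATCHDOGS];
};
```

On a typical 64-bit ABI the wrapped layout pays 4 bytes of tail padding per
element, whereas the split layout packs the two 32-bit timeouts together.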
> --- a/xen/include/xen/watchdog.h
> +++ b/xen/include/xen/watchdog.h
> @@ -8,6 +8,12 @@
> #define __XEN_WATCHDOG_H__
>
> #include <xen/types.h>
> +#include <xen/timer.h>
> +
> +struct watchdog_timer {
> + struct timer timer;
> + uint32_t timeout;
This wants a brief comment mentioning the granularity.
Jan
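
As an illustration of why the stored value and its unit matter, here is a
minimal sketch of how a resume path might re-arm the watchdogs. The resume
code of the series is not part of this excerpt, so everything below is
hypothetical (demo_domain, demo_watchdog_resume, DEMO_SECONDS); the only
thing taken from the patch is that the timeout is in seconds, since it is
passed to SECONDS() when the timer is set.

```c
#include <stdint.h>

#define DEMO_NR_WATCHDOGS 2
#define DEMO_SECONDS(s) ((int64_t)(s) * 1000000000LL)

struct demo_watchdog {
    int64_t  expires_ns;   /* stand-in for the embedded struct timer */
    uint32_t timeout;      /* last timeout set by the guest, in seconds */
};

struct demo_domain {
    uint32_t inuse_map;    /* bit i set: watchdog i is in use */
    struct demo_watchdog watchdog[DEMO_NR_WATCHDOGS];
};

/* Hypothetical resume hook: re-arm every in-use watchdog relative to "now". */
void demo_watchdog_resume(struct demo_domain *d, int64_t now_ns)
{
    for ( unsigned int i = 0; i < DEMO_NR_WATCHDOGS; i++ )
        if ( d->inuse_map & (1u << i) )
            d->watchdog[i].expires_ns =
                now_ns + DEMO_SECONDS(d->watchdog[i].timeout);
}
```

Unused watchdogs are left untouched; in-use ones get a full timeout period
again, counted from the moment of resume.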
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 03/16] xen/arm: introduce a separate struct for watchdog timers
2025-03-13 15:27 ` Jan Beulich
@ 2025-03-20 10:25 ` Mykola Kvach
2025-03-20 10:31 ` Jan Beulich
0 siblings, 1 reply; 69+ messages in thread
From: Mykola Kvach @ 2025-03-20 10:25 UTC (permalink / raw)
To: Jan Beulich
Cc: Mirela Simonovic, Andrew Cooper, Anthony PERARD, Michal Orzel,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Dario Faggioli, Juergen Gross, George Dunlap, Mykyta Poturai,
Mykola Kvach, xen-devel
Hi,
On Thu, Mar 13, 2025 at 5:27 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 05.03.2025 10:11, Mykola Kvach wrote:
> > From: Mirela Simonovic <mirela.simonovic@aggios.com>
> >
> > Introduce a separate struct for watchdog timers. It is needed to properly
> > implement the suspend/resume actions for the watchdog timers. To be able
> > to restart watchdog timer after suspend we need to remember their
> > frequency somewhere. To not bloat the struct timer a new struct
> > watchdog_timer is introduced, containing the original timer and the last
> > set timeout.
> >
> > Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
>
> A From: with no corresponding S-o-b: is potentially problematic. You also
> can't simply add one with her agreement, though.
Thank you for pointing that out! I'll revisit all commits and add the missing
Signed-off-by tags in the next version of patch series.
>
> > ---
> > This commit was introduced in patch series V2.
>
> Yet, btw, the whole series isn't tagged with a version.
Yes, I added a description of the versions in the cover letter and followed
the style used in version 2, meaning I avoided using tags. Since years have
passed between the versions of the patch series, I thought including tags
might confuse reviewers. If you want, I'll add a correct tag in the next
version of this patch series, i.e. V4 instead of V2.
>
> > --- a/xen/common/keyhandler.c
> > +++ b/xen/common/keyhandler.c
> > @@ -305,7 +305,7 @@ static void cf_check dump_domains(unsigned char key)
> > for ( i = 0 ; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> > if ( test_bit(i, &d->watchdog_inuse_map) )
> > printk(" watchdog %d expires in %d seconds\n",
> > - i, (u32)((d->watchdog_timer[i].expires - NOW()) >> 30));
> > + i, (u32)((d->watchdog_timer[i].timer.expires - NOW()) >> 30));
>
> I realize you mean to just do a mechanical replacement here, yet the use of
> u32 is not only against our style (should be uint32_t then), but it's also
> not clear to me that this subtraction can't ever yield a negative result.
> Hence the use of %d looks more correct to me than the cast to an unsigned
> type.
>
> In any event the already long line now grows too long and hence needs
> wrapping.
Maybe it would be better to send a separate patch for this. I'm not sure if
such changes are needed within the scope of this patch series.
>
> > @@ -569,7 +570,7 @@ struct domain
> > #define NR_DOMAIN_WATCHDOG_TIMERS 2
> > spinlock_t watchdog_lock;
> > uint32_t watchdog_inuse_map;
> > - struct timer watchdog_timer[NR_DOMAIN_WATCHDOG_TIMERS];
> > + struct watchdog_timer watchdog_timer[NR_DOMAIN_WATCHDOG_TIMERS];
>
> An alternative would be to have a separate array for the timeout values.
> This would also save some space, seeing that on 64-bit arches you
> introduce 32 bits of tail padding in the struct.
Maybe it will be enough to leave it as is and only change the order of the
timeout value and the timer. This way, we will avoid potential padding issues
and still get the benefits of using a single struct.
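As a rough check of the layout question, here is a minimal sketch. Everything in it is an assumption for illustration: struct fake_timer is a stand-in for Xen's struct timer (only its pointer-sized alignment matters), and the three wd_* structs are hypothetical names for the layouts being compared.

```c
#include <stdint.h>

/* Stand-in for Xen's struct timer; only its size and alignment matter here. */
struct fake_timer {
    int64_t expires;
    void *data;
};

/* Layout from the patch: timer first, timeout last (tail padding). */
struct wd_timer_first {
    struct fake_timer timer;
    uint32_t timeout;
};

/* Reordered layout: timeout first (padding moves in front of the timer). */
struct wd_timeout_first {
    uint32_t timeout;
    struct fake_timer timer;
};

/* Alternative: two watchdogs kept as separate timer and timeout arrays. */
struct wd_separate {
    struct fake_timer timer[2];
    uint32_t timeout[2];
};
```

On a typical LP64 target both combined structs should come out at 24 bytes, because the 4 bytes of padding only move from the tail to the gap after timeout. The separate arrays take 40 bytes for two watchdogs instead of 48, so reordering alone may not recover the space.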
>
> If we go the struct watchdog_timer route, may I at least suggest to rename
> the field to just "watchdog", so things like &d->watchdog_timer[i].timer
> don't say "timer" twice?
I agree, I'll change the name of the fields to avoid duplication.
>
> > --- a/xen/include/xen/watchdog.h
> > +++ b/xen/include/xen/watchdog.h
> > @@ -8,6 +8,12 @@
> > #define __XEN_WATCHDOG_H__
> >
> > #include <xen/types.h>
> > +#include <xen/timer.h>
> > +
> > +struct watchdog_timer {
> > + struct timer timer;
> > + uint32_t timeout;
>
> This wants a brief comment mentioning the granularity.
Thanks for pointing that out, I'll add a comment.
>
> Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 03/16] xen/arm: introduce a separate struct for watchdog timers
2025-03-20 10:25 ` Mykola Kvach
@ 2025-03-20 10:31 ` Jan Beulich
2025-03-20 10:33 ` Jan Beulich
0 siblings, 1 reply; 69+ messages in thread
From: Jan Beulich @ 2025-03-20 10:31 UTC (permalink / raw)
To: Mykola Kvach
Cc: Mirela Simonovic, Andrew Cooper, Anthony PERARD, Michal Orzel,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Dario Faggioli, Juergen Gross, George Dunlap, Mykyta Poturai,
Mykola Kvach, xen-devel
On 20.03.2025 11:25, Mykola Kvach wrote:
> On Thu, Mar 13, 2025 at 5:27 PM Jan Beulich <jbeulich@suse.com> wrote:
>> On 05.03.2025 10:11, Mykola Kvach wrote:
>>> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>>>
>>> Introduce a separate struct for watchdog timers. It is needed to properly
>>> implement the suspend/resume actions for the watchdog timers. To be able
>>> to restart watchdog timer after suspend we need to remember their
>>> frequency somewhere. To not bloat the struct timer a new struct
>>> watchdog_timer is introduced, containing the original timer and the last
>>> set timeout.
>>>
>>> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
>>> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
>>
>> A From: with no corresponding S-o-b: is potentially problematic. You also
>> can't simply add one without her agreement, though.
>
> Thank you for pointing that out! I'll revisit all commits and add the missing
> Signed-off-by tags in the next version of patch series.
Ftaod - you may not add anyone's S-o-b without their agreement.
>>> ---
>>> This commit was introduced in patch series V2.
>>
>> Yet, btw, the whole series isn't tagged with a version.
>
> Yes, I added a description of the versions in the cover letter and followed the
> style used in version 2, meaning I avoided using tags. Since years have passed
> between the patch series, I thought including tags might confuse reviewers.
> If you want, I'll add a correct tag in the next version of this patch series,
> i.e. V4 instead of V2.
Yes, no matter how much time has passed, versioning is helpful and
meaningful.
>>> --- a/xen/common/keyhandler.c
>>> +++ b/xen/common/keyhandler.c
>>> @@ -305,7 +305,7 @@ static void cf_check dump_domains(unsigned char key)
>>> for ( i = 0 ; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
>>> if ( test_bit(i, &d->watchdog_inuse_map) )
>>> printk(" watchdog %d expires in %d seconds\n",
>>> - i, (u32)((d->watchdog_timer[i].expires - NOW()) >> 30));
>>> + i, (u32)((d->watchdog_timer[i].timer.expires - NOW()) >> 30));
>>
>> I realize you mean to just do a mechanical replacement here, yet the use of
>> u32 is not only against our style (should be uint32_t then), but it's also
>> not clear to me that this subtraction can't ever yield a negative result.
>> Hence the use of %d looks more correct to me than the cast to an unsigned
>> type.
>>
>> In any event the already long line now grows too long and hence needs
>> wrapping.
>
> Maybe it would be better to send a separate patch for this. I'm not sure such
> changes are needed within the scope of this patch series.
Simple style adjustments on lines touched anyway are generally fine. That's
better than having individual (huge) patches adjusting only style, at the
very least from a "git blame" perspective. And when avoiding such, moving
towards more modern style can only be achieved if code being touched anyway
is getting modernized at such occasions.
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 03/16] xen/arm: introduce a separate struct for watchdog timers
2025-03-20 10:31 ` Jan Beulich
@ 2025-03-20 10:33 ` Jan Beulich
0 siblings, 0 replies; 69+ messages in thread
From: Jan Beulich @ 2025-03-20 10:33 UTC (permalink / raw)
To: Mykola Kvach
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Dario Faggioli,
Juergen Gross, George Dunlap, Mykyta Poturai, Mykola Kvach,
xen-devel
On 20.03.2025 11:31, Jan Beulich wrote:
> On 20.03.2025 11:25, Mykola Kvach wrote:
>> On Thu, Mar 13, 2025 at 5:27 PM Jan Beulich <jbeulich@suse.com> wrote:
>>> On 05.03.2025 10:11, Mykola Kvach wrote:
>>>> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>>>>
>>>> Introduce a separate struct for watchdog timers. It is needed to properly
>>>> implement the suspend/resume actions for the watchdog timers. To be able
>>>> to restart watchdog timer after suspend we need to remember their
>>>> frequency somewhere. To not bloat the struct timer a new struct
>>>> watchdog_timer is introduced, containing the original timer and the last
>>>> set timeout.
>>>>
>>>> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
>>>> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
>>>
>>> A From: with no corresponding S-o-b: is potentially problematic. You also
>>> can't simply add one without her agreement, though.
>>
>> Thank you for pointing that out! I'll revisit all commits and add the missing
>> Signed-off-by tags in the next version of patch series.
>
> Ftaod - you may not add anyone's S-o-b without their agreement.
Oh, and it would help if you could avoid submitting patches with invalid
email addresses in Cc:. Everyone replying will then experience delivery
failures.
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 03/16] xen/arm: introduce a separate struct for watchdog timers
2025-03-05 9:11 ` [PATCH 03/16] xen/arm: introduce a separate struct for watchdog timers Mykola Kvach
2025-03-13 15:27 ` Jan Beulich
@ 2025-03-19 16:13 ` Grygorii Strashko
2025-03-20 10:33 ` Mykola Kvach
1 sibling, 1 reply; 69+ messages in thread
From: Grygorii Strashko @ 2025-03-19 16:13 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Mirela Simonovic, Andrew Cooper, Anthony PERARD, Michal Orzel,
Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Dario Faggioli, Juergen Gross, George Dunlap,
Mykyta Poturai, Mykola Kvach
On 05.03.25 11:11, Mykola Kvach wrote:
> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>
> Introduce a separate struct for watchdog timers. It is needed to properly
> implement the suspend/resume actions for the watchdog timers. To be able
> to restart watchdog timer after suspend we need to remember their
> frequency somewhere. To not bloat the struct timer a new struct
> watchdog_timer is introduced, containing the original timer and the last
> set timeout.
>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> This commit was introduced in patch series V2.
> ---
> xen/common/keyhandler.c | 2 +-
> xen/common/sched/core.c | 11 ++++++-----
> xen/include/xen/sched.h | 3 ++-
> xen/include/xen/watchdog.h | 6 ++++++
> 4 files changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
> index 0bb842ec00..caf614c0c2 100644
> --- a/xen/common/keyhandler.c
> +++ b/xen/common/keyhandler.c
> @@ -305,7 +305,7 @@ static void cf_check dump_domains(unsigned char key)
> for ( i = 0 ; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> if ( test_bit(i, &d->watchdog_inuse_map) )
> printk(" watchdog %d expires in %d seconds\n",
> - i, (u32)((d->watchdog_timer[i].expires - NOW()) >> 30));
> + i, (u32)((d->watchdog_timer[i].timer.expires - NOW()) >> 30));
I'd like to propose adding a watchdog API wrapper here, like
watchdog_domain_expires_sec(d,id)
or
watchdog_domain_dump(d)
and so hide implementation internals.
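A minimal sketch of what such a wrapper might look like, under stated assumptions: the struct definitions below are simplified stand-ins for the Xen ones, and the extra now_ns parameter is a hypothetical addition that keeps the example self-contained (in Xen the value would come from NOW()).

```c
#include <stdint.h>

#define NR_DOMAIN_WATCHDOG_TIMERS 2

/* Simplified stand-ins for the Xen structures, for illustration only. */
struct timer {
    int64_t expires;                /* expiry time in nanoseconds */
};

struct watchdog_timer {
    struct timer timer;
    uint32_t timeout;               /* last requested timeout in seconds */
};

struct domain {
    struct watchdog_timer watchdog_timer[NR_DOMAIN_WATCHDOG_TIMERS];
};

/*
 * Hypothetical accessor: approximate seconds until watchdog @id of domain @d
 * expires. Callers such as the key handler no longer need to know how the
 * timer is embedded in struct domain.
 */
static inline int32_t watchdog_domain_expires_sec(const struct domain *d,
                                                  unsigned int id,
                                                  int64_t now_ns)
{
    return (int32_t)((d->watchdog_timer[id].timer.expires - now_ns) /
                     (INT64_C(1) << 30));
}
```

With an accessor like this, a later change to the layout of the watchdog fields would only touch the accessor and not the dump code.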
>
> arch_dump_domain_info(d);
>
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index d6296d99fd..b1c6b6b9fa 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -1556,7 +1556,8 @@ static long domain_watchdog(struct domain *d, uint32_t id, uint32_t timeout)
> {
> if ( test_and_set_bit(id, &d->watchdog_inuse_map) )
> continue;
> - set_timer(&d->watchdog_timer[id], NOW() + SECONDS(timeout));
> + d->watchdog_timer[id].timeout = timeout;
> + set_timer(&d->watchdog_timer[id].timer, NOW() + SECONDS(timeout));
> break;
> }
> spin_unlock(&d->watchdog_lock);
> @@ -1572,12 +1573,12 @@ static long domain_watchdog(struct domain *d, uint32_t id, uint32_t timeout)
>
> if ( timeout == 0 )
> {
> - stop_timer(&d->watchdog_timer[id]);
> + stop_timer(&d->watchdog_timer[id].timer);
> clear_bit(id, &d->watchdog_inuse_map);
> }
> else
> {
> - set_timer(&d->watchdog_timer[id], NOW() + SECONDS(timeout));
> + set_timer(&d->watchdog_timer[id].timer, NOW() + SECONDS(timeout));
> }
>
> spin_unlock(&d->watchdog_lock);
> @@ -1593,7 +1594,7 @@ void watchdog_domain_init(struct domain *d)
> d->watchdog_inuse_map = 0;
>
> for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> - init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, d, 0);
> + init_timer(&d->watchdog_timer[i].timer, domain_watchdog_timeout, d, 0);
> }
>
> void watchdog_domain_destroy(struct domain *d)
> @@ -1601,7 +1602,7 @@ void watchdog_domain_destroy(struct domain *d)
> unsigned int i;
>
> for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> - kill_timer(&d->watchdog_timer[i]);
> + kill_timer(&d->watchdog_timer[i].timer);
> }
>
> /*
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 177784e6da..d0d10612ce 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -24,6 +24,7 @@
> #include <asm/current.h>
> #include <xen/vpci.h>
> #include <xen/wait.h>
> +#include <xen/watchdog.h>
> #include <public/xen.h>
> #include <public/domctl.h>
> #include <public/sysctl.h>
I think struct watchdog_timer (or whatever you are going to add) needs to be moved into sched.h
because...
> @@ -569,7 +570,7 @@ struct domain
> #define NR_DOMAIN_WATCHDOG_TIMERS 2
> spinlock_t watchdog_lock;
> uint32_t watchdog_inuse_map;
> - struct timer watchdog_timer[NR_DOMAIN_WATCHDOG_TIMERS];
> + struct watchdog_timer watchdog_timer[NR_DOMAIN_WATCHDOG_TIMERS];
>
> struct rcu_head rcu;
>
> diff --git a/xen/include/xen/watchdog.h b/xen/include/xen/watchdog.h
> index 4c2840bd91..2b7169632d 100644
> --- a/xen/include/xen/watchdog.h
> +++ b/xen/include/xen/watchdog.h
> @@ -8,6 +8,12 @@
> #define __XEN_WATCHDOG_H__
>
> #include <xen/types.h>
> +#include <xen/timer.h>
...this interface is not related to the domain's watchdogs.
From the x86 code, it seems to be some sort of HW watchdog used to check pCPU state,
not domains/vCPUs. And it's not enabled for Arm now.
> +
> +struct watchdog_timer {
> + struct timer timer;
> + uint32_t timeout;
> +};
>
> #ifdef CONFIG_WATCHDOG
>
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 03/16] xen/arm: introduce a separate struct for watchdog timers
2025-03-19 16:13 ` Grygorii Strashko
@ 2025-03-20 10:33 ` Mykola Kvach
0 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-20 10:33 UTC (permalink / raw)
To: Grygorii Strashko
Cc: xen-devel, Mirela Simonovic, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Dario Faggioli, Juergen Gross, George Dunlap,
Mykyta Poturai, Mykola Kvach
Hi,
On Wed, Mar 19, 2025 at 6:14 PM Grygorii Strashko
<grygorii_strashko@epam.com> wrote:
>
>
>
> On 05.03.25 11:11, Mykola Kvach wrote:
> > From: Mirela Simonovic <mirela.simonovic@aggios.com>
> >
> > Introduce a separate struct for watchdog timers. It is needed to properly
> > implement the suspend/resume actions for the watchdog timers. To be able
> > to restart watchdog timer after suspend we need to remember their
> > frequency somewhere. To not bloat the struct timer a new struct
> > watchdog_timer is introduced, containing the original timer and the last
> > set timeout.
> >
> > Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > This commit was introduced in patch series V2.
> > ---
> > xen/common/keyhandler.c | 2 +-
> > xen/common/sched/core.c | 11 ++++++-----
> > xen/include/xen/sched.h | 3 ++-
> > xen/include/xen/watchdog.h | 6 ++++++
> > 4 files changed, 15 insertions(+), 7 deletions(-)
> >
> > diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
> > index 0bb842ec00..caf614c0c2 100644
> > --- a/xen/common/keyhandler.c
> > +++ b/xen/common/keyhandler.c
> > @@ -305,7 +305,7 @@ static void cf_check dump_domains(unsigned char key)
> > for ( i = 0 ; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> > if ( test_bit(i, &d->watchdog_inuse_map) )
> > printk(" watchdog %d expires in %d seconds\n",
> > - i, (u32)((d->watchdog_timer[i].expires - NOW()) >> 30));
> > + i, (u32)((d->watchdog_timer[i].timer.expires - NOW()) >> 30));
>
> I'd like to propose adding a watchdog API wrapper here, like
>
> watchdog_domain_expires_sec(d,id)
>
> or
>
> watchdog_domain_dump(d)
>
> and so hide implementation internals.
It was already proposed by Jan Beulich. I'll do it.
>
> >
> > arch_dump_domain_info(d);
> >
> > diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> > index d6296d99fd..b1c6b6b9fa 100644
> > --- a/xen/common/sched/core.c
> > +++ b/xen/common/sched/core.c
> > @@ -1556,7 +1556,8 @@ static long domain_watchdog(struct domain *d, uint32_t id, uint32_t timeout)
> > {
> > if ( test_and_set_bit(id, &d->watchdog_inuse_map) )
> > continue;
> > - set_timer(&d->watchdog_timer[id], NOW() + SECONDS(timeout));
> > + d->watchdog_timer[id].timeout = timeout;
> > + set_timer(&d->watchdog_timer[id].timer, NOW() + SECONDS(timeout));
> > break;
> > }
> > spin_unlock(&d->watchdog_lock);
> > @@ -1572,12 +1573,12 @@ static long domain_watchdog(struct domain *d, uint32_t id, uint32_t timeout)
> >
> > if ( timeout == 0 )
> > {
> > - stop_timer(&d->watchdog_timer[id]);
> > + stop_timer(&d->watchdog_timer[id].timer);
> > clear_bit(id, &d->watchdog_inuse_map);
> > }
> > else
> > {
> > - set_timer(&d->watchdog_timer[id], NOW() + SECONDS(timeout));
> > + set_timer(&d->watchdog_timer[id].timer, NOW() + SECONDS(timeout));
> > }
> >
> > spin_unlock(&d->watchdog_lock);
> > @@ -1593,7 +1594,7 @@ void watchdog_domain_init(struct domain *d)
> > d->watchdog_inuse_map = 0;
> >
> > for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> > - init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, d, 0);
> > + init_timer(&d->watchdog_timer[i].timer, domain_watchdog_timeout, d, 0);
> > }
> >
> > void watchdog_domain_destroy(struct domain *d)
> > @@ -1601,7 +1602,7 @@ void watchdog_domain_destroy(struct domain *d)
> > unsigned int i;
> >
> > for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> > - kill_timer(&d->watchdog_timer[i]);
> > + kill_timer(&d->watchdog_timer[i].timer);
> > }
> >
> > /*
> > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> > index 177784e6da..d0d10612ce 100644
> > --- a/xen/include/xen/sched.h
> > +++ b/xen/include/xen/sched.h
> > @@ -24,6 +24,7 @@
> > #include <asm/current.h>
> > #include <xen/vpci.h>
> > #include <xen/wait.h>
> > +#include <xen/watchdog.h>
> > #include <public/xen.h>
> > #include <public/domctl.h>
> > #include <public/sysctl.h>
>
> I think struct watchdog_timer (or whatever you are going to add) needs to be moved into sched.h
> because...
>
> > @@ -569,7 +570,7 @@ struct domain
> > #define NR_DOMAIN_WATCHDOG_TIMERS 2
> > spinlock_t watchdog_lock;
> > uint32_t watchdog_inuse_map;
> > - struct timer watchdog_timer[NR_DOMAIN_WATCHDOG_TIMERS];
> > + struct watchdog_timer watchdog_timer[NR_DOMAIN_WATCHDOG_TIMERS];
> >
> > struct rcu_head rcu;
> >
> > diff --git a/xen/include/xen/watchdog.h b/xen/include/xen/watchdog.h
> > index 4c2840bd91..2b7169632d 100644
> > --- a/xen/include/xen/watchdog.h
> > +++ b/xen/include/xen/watchdog.h
> > @@ -8,6 +8,12 @@
> > #define __XEN_WATCHDOG_H__
> >
> > #include <xen/types.h>
> > +#include <xen/timer.h>
>
> ...this interface is not related to the domain's watchdogs.
> From the x86 code, it seems to be some sort of HW watchdog used to check pCPU state,
> not domains/vCPUs. And it's not enabled for Arm now.
Sorry, but maybe I missed something. However, this struct and the previous
watchdog timer are used as fields of the domain struct and correspond to a
particular domain. Also, take a look at some functions where the watchdog timer
field is used: domain_watchdog, watchdog_domain_init, and
watchdog_domain_destroy. I see a direct connection with a domain.
>
> > +
> > +struct watchdog_timer {
> > + struct timer timer;
> > + uint32_t timeout;
> > +};
> >
> > #ifdef CONFIG_WATCHDOG
> >
Best regards,
Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* [PATCH 04/16] xen/cpu: prevent disable_nonboot_cpus crash on ARM64
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (2 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 03/16] xen/arm: introduce a separate struct for watchdog timers Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-11 20:47 ` Julien Grall
2025-03-05 9:11 ` [PATCH 05/16] xen/percpu: don't initialize percpu on resume Mykola Kvach
` (13 subsequent siblings)
17 siblings, 1 reply; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Mykola Kvach
If we call disable_nonboot_cpus on ARM64 with system_state set
to SYS_STATE_suspend, the following assertion will be triggered:
```
(XEN) [ 25.582712] Disabling non-boot CPUs ...
(XEN) [ 25.587032] Assertion '!in_irq() && (local_irq_is_enabled() || num_online_cpus() <= 1)' failed at common/xmalloc_tlsf.c:714
[...]
(XEN) [ 25.975069] Xen call trace:
(XEN) [ 25.978353] [<00000a000022e098>] xfree+0x130/0x1a4 (PC)
(XEN) [ 25.984314] [<00000a000022e08c>] xfree+0x124/0x1a4 (LR)
(XEN) [ 25.990276] [<00000a00002747d4>] release_irq+0xe4/0xe8
(XEN) [ 25.996152] [<00000a0000278588>] time.c#cpu_time_callback+0x44/0x60
(XEN) [ 26.003150] [<00000a000021d678>] notifier_call_chain+0x7c/0xa0
(XEN) [ 26.009717] [<00000a00002018e0>] cpu.c#cpu_notifier_call_chain+0x24/0x48
(XEN) [ 26.017148] [<00000a000020192c>] cpu.c#_take_cpu_down+0x28/0x34
(XEN) [ 26.023801] [<00000a0000201944>] cpu.c#take_cpu_down+0xc/0x18
(XEN) [ 26.030281] [<00000a0000225c5c>] stop_machine.c#stopmachine_action+0xbc/0xe4
(XEN) [ 26.038057] [<00000a00002264bc>] tasklet.c#do_tasklet_work+0xb8/0x100
(XEN) [ 26.045229] [<00000a00002268a4>] do_tasklet+0x68/0xb0
(XEN) [ 26.051018] [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
(XEN) [ 26.057585] [<00000a0000277e30>] start_secondary+0x21c/0x220
(XEN) [ 26.063978] [<00000a0000361258>] 00000a0000361258
```
This happens because before invoking take_cpu_down via the stop_machine_run
function on the target CPU, stop_machine_run requests
the STOPMACHINE_DISABLE_IRQ state on that CPU. Releasing memory in
the release_irq function then triggers the assertion:
/*
* Heap allocations may need TLB flushes which may require IRQs to be
* enabled (except when only 1 PCPU is online).
*/
#define ASSERT_ALLOC_CONTEXT()
This patch introduces a new tasklet to perform the CPU_DYING call chain for
a particular CPU. However, we cannot call take_cpu_down from the tasklet
because the __cpu_disable function disables local IRQs, causing the system
to crash inside spin_lock_irq, which is called after the tasklet function
invocation inside do_tasklet_work:
void _spin_lock_irq(spinlock_t *lock)
{
ASSERT(local_irq_is_enabled());
To resolve this, take_cpu_down is split into two parts. The first part triggers
the CPU_DYING call chain, while the second part, __cpu_disable, is invoked from
stop_machine_run.
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
This patch was introduced in patch series V3.
---
xen/common/cpu.c | 43 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 43 insertions(+)
diff --git a/xen/common/cpu.c b/xen/common/cpu.c
index f09af0444b..99d4c0c579 100644
--- a/xen/common/cpu.c
+++ b/xen/common/cpu.c
@@ -48,6 +48,10 @@ const unsigned long cpu_bit_bitmap[BITS_PER_LONG+1][BITS_TO_LONGS(NR_CPUS)] = {
static DEFINE_RWLOCK(cpu_add_remove_lock);
+#ifdef CONFIG_ARM_64
+static DEFINE_PER_CPU(struct tasklet, cpu_down_tasklet);
+#endif
+
bool get_cpu_maps(void)
{
return read_trylock(&cpu_add_remove_lock);
@@ -101,6 +105,14 @@ static void cf_check _take_cpu_down(void *unused)
__cpu_disable();
}
+#ifdef CONFIG_ARM_64
+static int cf_check cpu_disable_stop_machine(void *unused)
+{
+ __cpu_disable();
+ return 0;
+}
+#endif
+
static int cf_check take_cpu_down(void *arg)
{
_take_cpu_down(arg);
@@ -128,6 +140,14 @@ int cpu_down(unsigned int cpu)
if ( system_state < SYS_STATE_active || system_state == SYS_STATE_resume )
on_selected_cpus(cpumask_of(cpu), _take_cpu_down, NULL, true);
+#ifdef CONFIG_ARM_64
+ else if ( system_state == SYS_STATE_suspend )
+ {
+ tasklet_schedule_on_cpu(&per_cpu(cpu_down_tasklet, cpu), cpu);
+ if ( (err = stop_machine_run(cpu_disable_stop_machine, NULL, cpu)) < 0 )
+ goto fail;
+ }
+#endif
else if ( (err = stop_machine_run(take_cpu_down, NULL, cpu)) < 0 )
goto fail;
@@ -247,3 +267,26 @@ void enable_nonboot_cpus(void)
cpumask_clear(&frozen_cpus);
}
+
+#ifdef CONFIG_ARM_64
+
+static void cf_check cpu_down_t_action(void *unused)
+{
+ cpu_notifier_call_chain(smp_processor_id(), CPU_DYING, NULL, true);
+}
+
+static int __init init_cpu_down_tasklet(void)
+{
+ unsigned int cpu;
+
+ for_each_possible_cpu(cpu) {
+ struct tasklet *t = &per_cpu(cpu_down_tasklet, cpu);
+
+ tasklet_init(t, cpu_down_t_action, NULL);
+ }
+
+ return 0;
+}
+__initcall(init_cpu_down_tasklet);
+
+#endif /* CONFIG_ARM_64 */
--
2.43.0
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [PATCH 04/16] xen/cpu: prevent disable_nonboot_cpus crash on ARM64
2025-03-05 9:11 ` [PATCH 04/16] xen/cpu: prevent disable_nonboot_cpus crash on ARM64 Mykola Kvach
@ 2025-03-11 20:47 ` Julien Grall
2025-03-13 15:42 ` Jan Beulich
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2025-03-11 20:47 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Roger Pau Monné, Stefano Stabellini, Mykola Kvach
Hi Mykola,
On 05/03/2025 09:11, Mykola Kvach wrote:
> If we call disable_nonboot_cpus on ARM64 with system_state set
> to SYS_STATE_suspend, the following assertion will be triggered:
>
> ```
> (XEN) [ 25.582712] Disabling non-boot CPUs ...
> (XEN) [ 25.587032] Assertion '!in_irq() && (local_irq_is_enabled() || num_online_cpus() <= 1)' failed at common/xmalloc_tlsf.c:714
> [...]
> (XEN) [ 25.975069] Xen call trace:
> (XEN) [ 25.978353] [<00000a000022e098>] xfree+0x130/0x1a4 (PC)
> (XEN) [ 25.984314] [<00000a000022e08c>] xfree+0x124/0x1a4 (LR)
> (XEN) [ 25.990276] [<00000a00002747d4>] release_irq+0xe4/0xe8
> (XEN) [ 25.996152] [<00000a0000278588>] time.c#cpu_time_callback+0x44/0x60
> (XEN) [ 26.003150] [<00000a000021d678>] notifier_call_chain+0x7c/0xa0
> (XEN) [ 26.009717] [<00000a00002018e0>] cpu.c#cpu_notifier_call_chain+0x24/0x48
> (XEN) [ 26.017148] [<00000a000020192c>] cpu.c#_take_cpu_down+0x28/0x34
> (XEN) [ 26.023801] [<00000a0000201944>] cpu.c#take_cpu_down+0xc/0x18
> (XEN) [ 26.030281] [<00000a0000225c5c>] stop_machine.c#stopmachine_action+0xbc/0xe4
> (XEN) [ 26.038057] [<00000a00002264bc>] tasklet.c#do_tasklet_work+0xb8/0x100
> (XEN) [ 26.045229] [<00000a00002268a4>] do_tasklet+0x68/0xb0
> (XEN) [ 26.051018] [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
> (XEN) [ 26.057585] [<00000a0000277e30>] start_secondary+0x21c/0x220
> (XEN) [ 26.063978] [<00000a0000361258>] 00000a0000361258
> ```
>
> This happens because before invoking take_cpu_down via the stop_machine_run
> function on the target CPU, stop_machine_run requests
> the STOPMACHINE_DISABLE_IRQ state on that CPU. Releasing memory in
> the release_irq function then triggers the assertion:
>
> /*
> * Heap allocations may need TLB flushes which may require IRQs to be
> * enabled (except when only 1 PCPU is online).
> */
> #define ASSERT_ALLOC_CONTEXT()
>
> This patch introduces a new tasklet to perform the CPU_DYING call chain for
> a particular CPU. However, we cannot call take_cpu_down from the tasklet
> because the __cpu_disable function disables local IRQs, causing the system
> to crash inside spin_lock_irq, which is called after the tasklet function
> invocation inside do_tasklet_work:
>
> void _spin_lock_irq(spinlock_t *lock)
> {
> ASSERT(local_irq_is_enabled());
>
> To resolve this, take_cpu_down is split into two parts. The first part triggers
> the CPU_DYING call chain, while the second part, __cpu_disable, is invoked from
> stop_machine_run.
Rather than modifying common code, have you considered allocating the
IRQ action from the percpu area? This would also reduce the number of
possible failures when bringing up a pCPU.
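To illustrate the idea, a minimal sketch under stated assumptions: struct irqaction below is a simplified stand-in, a plain array replaces DEFINE_PER_CPU(), and timer_action_claim()/timer_action_release() are hypothetical names. The point is only that teardown no longer needs xfree(), so it would not trip the allocator's context assertion.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4 /* arbitrary value for the sketch */

/* Simplified stand-in for Xen's struct irqaction. */
struct irqaction {
    void (*handler)(int irq, void *dev_id);
    const char *name;
    bool in_use;
};

/* One statically reserved slot per CPU (DEFINE_PER_CPU() in Xen proper). */
static struct irqaction timer_action[NR_CPUS];

/* Claim the slot for @cpu; fails only for an invalid CPU or a busy slot. */
static struct irqaction *timer_action_claim(unsigned int cpu,
                                            void (*handler)(int, void *),
                                            const char *name)
{
    struct irqaction *a;

    if ( cpu >= NR_CPUS || timer_action[cpu].in_use )
        return NULL;

    a = &timer_action[cpu];
    a->handler = handler;
    a->name = name;
    a->in_use = true;

    return a;
}

/* Release the slot; no heap free happens, so the IRQ state does not matter. */
static void timer_action_release(unsigned int cpu)
{
    if ( cpu < NR_CPUS )
        timer_action[cpu].in_use = false;
}
```

Since no allocation happens at claim time either, the out-of-memory failure path on CPU bring-up goes away as well.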
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 04/16] xen/cpu: prevent disable_nonboot_cpus crash on ARM64
2025-03-11 20:47 ` Julien Grall
@ 2025-03-13 15:42 ` Jan Beulich
2025-03-21 9:48 ` Mykola Kvach
0 siblings, 1 reply; 69+ messages in thread
From: Jan Beulich @ 2025-03-13 15:42 UTC (permalink / raw)
To: Julien Grall, Mykola Kvach
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Roger Pau Monné,
Stefano Stabellini, Mykola Kvach, xen-devel
On 11.03.2025 21:47, Julien Grall wrote:
> Hi Mykola,
>
> On 05/03/2025 09:11, Mykola Kvach wrote:
>> If we call disable_nonboot_cpus on ARM64 with system_state set
>> to SYS_STATE_suspend, the following assertion will be triggered:
>>
>> ```
>> (XEN) [ 25.582712] Disabling non-boot CPUs ...
>> (XEN) [ 25.587032] Assertion '!in_irq() && (local_irq_is_enabled() || num_online_cpus() <= 1)' failed at common/xmalloc_tlsf.c:714
>> [...]
>> (XEN) [ 25.975069] Xen call trace:
>> (XEN) [ 25.978353] [<00000a000022e098>] xfree+0x130/0x1a4 (PC)
>> (XEN) [ 25.984314] [<00000a000022e08c>] xfree+0x124/0x1a4 (LR)
>> (XEN) [ 25.990276] [<00000a00002747d4>] release_irq+0xe4/0xe8
>> (XEN) [ 25.996152] [<00000a0000278588>] time.c#cpu_time_callback+0x44/0x60
>> (XEN) [ 26.003150] [<00000a000021d678>] notifier_call_chain+0x7c/0xa0
>> (XEN) [ 26.009717] [<00000a00002018e0>] cpu.c#cpu_notifier_call_chain+0x24/0x48
>> (XEN) [ 26.017148] [<00000a000020192c>] cpu.c#_take_cpu_down+0x28/0x34
>> (XEN) [ 26.023801] [<00000a0000201944>] cpu.c#take_cpu_down+0xc/0x18
>> (XEN) [ 26.030281] [<00000a0000225c5c>] stop_machine.c#stopmachine_action+0xbc/0xe4
>> (XEN) [ 26.038057] [<00000a00002264bc>] tasklet.c#do_tasklet_work+0xb8/0x100
>> (XEN) [ 26.045229] [<00000a00002268a4>] do_tasklet+0x68/0xb0
>> (XEN) [ 26.051018] [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
>> (XEN) [ 26.057585] [<00000a0000277e30>] start_secondary+0x21c/0x220
>> (XEN) [ 26.063978] [<00000a0000361258>] 00000a0000361258
>> ```
>>
>> This happens because before invoking take_cpu_down via the stop_machine_run
>> function on the target CPU, stop_machine_run requests
>> the STOPMACHINE_DISABLE_IRQ state on that CPU. Releasing memory in
>> the release_irq function then triggers the assertion:
>>
>> /*
>> * Heap allocations may need TLB flushes which may require IRQs to be
>> * enabled (except when only 1 PCPU is online).
>> */
>> #define ASSERT_ALLOC_CONTEXT()
>>
>> This patch introduces a new tasklet to perform the CPU_DYING call chain for
>> a particular CPU. However, we cannot call take_cpu_down from the tasklet
>> because the __cpu_disable function disables local IRQs, causing the system
>> to crash inside spin_lock_irq, which is called after the tasklet function
>> invocation inside do_tasklet_work:
>>
>> void _spin_lock_irq(spinlock_t *lock)
>> {
>> ASSERT(local_irq_is_enabled());
>>
>> To resolve this, take_cpu_down is split into two parts. The first part triggers
>> the CPU_DYING call chain, while the second part, __cpu_disable, is invoked from
>> stop_machine_run.
>
> Rather than modifying common code, have you considered allocating the
> IRQ action from the percpu area? This would also reduce the number of
> possible failures when bringing up a pCPU.
I'd go further and question whether release_irq() really wants calling when
suspending. At least on x86, a requirement is that upon resume the same
number and kinds of CPUs will come back up. Hence the system will look the
same, including all the interrupts that are in use. Plus resume will be
faster if things are left set up during suspend.
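A minimal sketch of that alternative, with loudly stated assumptions: the enum is a stand-in subset of Xen's system states, cpu_time_dying() is a hypothetical name for the CPU_DYING handling, and a boolean stands in for the effect of release_irq(). Whether this is sufficient on Arm is not established here.

```c
#include <stdbool.h>

/* Subset of Xen's system states; stand-in definitions for the sketch. */
enum system_state { SYS_STATE_active, SYS_STATE_suspend, SYS_STATE_resume };

/* Tracks whether this CPU's timer IRQs are still set up. */
static bool timer_irqs_requested = true;

/*
 * Hypothetical CPU_DYING handling: leave the IRQs set up when the CPU goes
 * down as part of a system suspend, and release them only on a real unplug.
 */
static void cpu_time_dying(enum system_state state)
{
    if ( state == SYS_STATE_suspend )
        return;                    /* keep IRQs set up across suspend */

    timer_irqs_requested = false;  /* stands in for release_irq() */
}
```

The bring-up path would need the matching check, so that it does not request the same IRQs a second time on resume.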
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 04/16] xen/cpu: prevent disable_nonboot_cpus crash on ARM64
2025-03-13 15:42 ` Jan Beulich
@ 2025-03-21 9:48 ` Mykola Kvach
0 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-21 9:48 UTC (permalink / raw)
To: Jan Beulich
Cc: Julien Grall, Andrew Cooper, Anthony PERARD, Michal Orzel,
Roger Pau Monné, Stefano Stabellini, Mykola Kvach, xen-devel
On Thu, Mar 13, 2025 at 5:43 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 11.03.2025 21:47, Julien Grall wrote:
> > Hi Mykola,
> >
> > On 05/03/2025 09:11, Mykola Kvach wrote:
> >> If we call disable_nonboot_cpus on ARM64 with system_state set
> >> to SYS_STATE_suspend, the following assertion will be triggered:
> >>
> >> ```
> >> (XEN) [ 25.582712] Disabling non-boot CPUs ...
> >> (XEN) [ 25.587032] Assertion '!in_irq() && (local_irq_is_enabled() || num_online_cpus() <= 1)' failed at common/xmalloc_tlsf.c:714
> >> [...]
> >> (XEN) [ 25.975069] Xen call trace:
> >> (XEN) [ 25.978353] [<00000a000022e098>] xfree+0x130/0x1a4 (PC)
> >> (XEN) [ 25.984314] [<00000a000022e08c>] xfree+0x124/0x1a4 (LR)
> >> (XEN) [ 25.990276] [<00000a00002747d4>] release_irq+0xe4/0xe8
> >> (XEN) [ 25.996152] [<00000a0000278588>] time.c#cpu_time_callback+0x44/0x60
> >> (XEN) [ 26.003150] [<00000a000021d678>] notifier_call_chain+0x7c/0xa0
> >> (XEN) [ 26.009717] [<00000a00002018e0>] cpu.c#cpu_notifier_call_chain+0x24/0x48
> >> (XEN) [ 26.017148] [<00000a000020192c>] cpu.c#_take_cpu_down+0x28/0x34
> >> (XEN) [ 26.023801] [<00000a0000201944>] cpu.c#take_cpu_down+0xc/0x18
> >> (XEN) [ 26.030281] [<00000a0000225c5c>] stop_machine.c#stopmachine_action+0xbc/0xe4
> >> (XEN) [ 26.038057] [<00000a00002264bc>] tasklet.c#do_tasklet_work+0xb8/0x100
> >> (XEN) [ 26.045229] [<00000a00002268a4>] do_tasklet+0x68/0xb0
> >> (XEN) [ 26.051018] [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
> >> (XEN) [ 26.057585] [<00000a0000277e30>] start_secondary+0x21c/0x220
> >> (XEN) [ 26.063978] [<00000a0000361258>] 00000a0000361258
> >> ```
> >>
> >> This happens because before invoking take_cpu_down via the stop_machine_run
> >> function on the target CPU, stop_machine_run requests
> >> the STOPMACHINE_DISABLE_IRQ state on that CPU. Releasing memory in
> >> the release_irq function then triggers the assertion:
> >>
> >> /*
> >> * Heap allocations may need TLB flushes which may require IRQs to be
> >> * enabled (except when only 1 PCPU is online).
> >> */
> >> #define ASSERT_ALLOC_CONTEXT()
> >>
> >> This patch introduces a new tasklet to perform the CPU_DYING call chain for
> >> a particular CPU. However, we cannot call take_cpu_down from the tasklet
> >> because the __cpu_disable function disables local IRQs, causing the system
> >> to crash inside spin_lock_irq, which is called after the tasklet function
> >> invocation inside do_tasklet_work:
> >>
> >> void _spin_lock_irq(spinlock_t *lock)
> >> {
> >> ASSERT(local_irq_is_enabled());
> >>
> >> To resolve this, take_cpu_down is split into two parts. The first part triggers
> >> the CPU_DYING call chain, while the second part, __cpu_disable, is invoked from
> >> stop_machine_run.
> >
> > Rather than modifying common code, have you considered allocating
> > the IRQ action from the percpu area? This would also reduce the number
> > of possible failures when bringing up a pCPU.
>
> I'd go further and question whether release_irq() really wants calling when
> suspending. At least on x86, a requirement is that upon resume the same
> number and kinds of CPUs will come back up. Hence the system will look the
> same, including all the interrupts that are in use. Plus resume will be
> faster if things are left set up during suspend.
I tried that approach and encountered another issue.
- in the case of the hardware domain, it triggered a domain watchdog;
- in the case of domU, it caused a crash inside the Linux kernel due
to an RCU stall.
Both scenarios suggest that something is wrong with IRQ delivery to the guest.
It might be necessary to revisit the entire logic related to GIC
resume/suspend instead.
>
> Jan
Best regards,
Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* [PATCH 05/16] xen/percpu: don't initialize percpu on resume
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (3 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 04/16] xen/cpu: prevent disable_nonboot_cpus crash on ARM64 Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-11 20:59 ` Julien Grall
2025-03-13 15:57 ` Jan Beulich
2025-03-05 9:11 ` [PATCH 06/16] xen/arm: Introduce system suspend config option Mykola Kvach
` (12 subsequent siblings)
17 siblings, 2 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Mykyta Poturai, Mykola Kvach
Invocation of the CPU_UP_PREPARE notification
on ARM64 during resume causes a crash:
(XEN) [ 315.807606] Error bringing CPU1 up: -16
(XEN) [ 315.811926] Xen BUG at common/cpu.c:258
[...]
(XEN) [ 316.142765] Xen call trace:
(XEN) [ 316.146048] [<00000a0000202264>] enable_nonboot_cpus+0x128/0x1ac (PC)
(XEN) [ 316.153219] [<00000a000020225c>] enable_nonboot_cpus+0x120/0x1ac (LR)
(XEN) [ 316.160391] [<00000a0000278180>] suspend.c#system_suspend+0x4c/0x1a0
(XEN) [ 316.167476] [<00000a0000206b70>] domain.c#continue_hypercall_tasklet_handler+0x54/0xd0
(XEN) [ 316.176117] [<00000a0000226538>] tasklet.c#do_tasklet_work+0xb8/0x100
(XEN) [ 316.183288] [<00000a0000226920>] do_tasklet+0x68/0xb0
(XEN) [ 316.189077] [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
(XEN) [ 316.195644] [<00000a0000277638>] shutdown.c#halt_this_cpu+0/0x14
(XEN) [ 316.202383] [<0000000000000008>] 0000000000000008
Freeing per-CPU areas and setting __per_cpu_offset to INVALID_PERCPU_AREA
only occur when !park_offline_cpus and system_state is not SYS_STATE_suspend.
On ARM64, park_offline_cpus is always false, so setting __per_cpu_offset to
INVALID_PERCPU_AREA depends solely on the system state.
If the system is suspended, this area is not freed, and during resume, an error
occurs in init_percpu_area, causing a crash because INVALID_PERCPU_AREA is not
set and park_offline_cpus remains 0:
if ( __per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
return park_offline_cpus ? 0 : -EBUSY;
It appears that the same crash can occur on x86 if park_offline_cpus is set
to 0 during Xen suspend.
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
These changes were introduced in V2 inside
"xen: don't free percpu areas during suspend" patch.
---
xen/common/percpu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/xen/common/percpu.c b/xen/common/percpu.c
index e4e8b7bcab..83dca7edd6 100644
--- a/xen/common/percpu.c
+++ b/xen/common/percpu.c
@@ -74,7 +74,8 @@ static int cf_check cpu_percpu_callback(
switch ( action )
{
case CPU_UP_PREPARE:
- rc = init_percpu_area(cpu);
+ if ( system_state != SYS_STATE_resume )
+ rc = init_percpu_area(cpu);
break;
case CPU_UP_CANCELED:
--
2.43.0
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [PATCH 05/16] xen/percpu: don't initialize percpu on resume
2025-03-05 9:11 ` [PATCH 05/16] xen/percpu: don't initialize percpu on resume Mykola Kvach
@ 2025-03-11 20:59 ` Julien Grall
2025-03-13 15:54 ` Jan Beulich
2025-03-13 15:57 ` Jan Beulich
1 sibling, 1 reply; 69+ messages in thread
From: Julien Grall @ 2025-03-11 20:59 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Roger Pau Monné, Stefano Stabellini, Mykyta Poturai,
Mykola Kvach, Juergen Gross
(+ Juergen)
Hi Mykola,
On 05/03/2025 09:11, Mykola Kvach wrote:
> Invocation of the CPU_UP_PREPARE notification
> on ARM64 during resume causes a crash:
>
> (XEN) [ 315.807606] Error bringing CPU1 up: -16
> (XEN) [ 315.811926] Xen BUG at common/cpu.c:258
> [...]
> (XEN) [ 316.142765] Xen call trace:
> (XEN) [ 316.146048] [<00000a0000202264>] enable_nonboot_cpus+0x128/0x1ac (PC)
> (XEN) [ 316.153219] [<00000a000020225c>] enable_nonboot_cpus+0x120/0x1ac (LR)
> (XEN) [ 316.160391] [<00000a0000278180>] suspend.c#system_suspend+0x4c/0x1a0
> (XEN) [ 316.167476] [<00000a0000206b70>] domain.c#continue_hypercall_tasklet_handler+0x54/0xd0
> (XEN) [ 316.176117] [<00000a0000226538>] tasklet.c#do_tasklet_work+0xb8/0x100
> (XEN) [ 316.183288] [<00000a0000226920>] do_tasklet+0x68/0xb0
> (XEN) [ 316.189077] [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
> (XEN) [ 316.195644] [<00000a0000277638>] shutdown.c#halt_this_cpu+0/0x14
> (XEN) [ 316.202383] [<0000000000000008>] 0000000000000008
>
> Freeing per-CPU areas and setting __per_cpu_offset to INVALID_PERCPU_AREA
> only occur when !park_offline_cpus and system_state is not SYS_STATE_suspend.
> On ARM64, park_offline_cpus is always false, so setting __per_cpu_offset to
> INVALID_PERCPU_AREA depends solely on the system state.
>
> If the system is suspended, this area is not freed, and during resume, an error
> occurs in init_percpu_area, causing a crash because INVALID_PERCPU_AREA is not
> set and park_offline_cpus remains 0:
>
> if ( __per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
> return park_offline_cpus ? 0 : -EBUSY;
>
> It appears that the same crash can occur on x86 if park_offline_cpus is set
> to 0 during Xen suspend.
I am rather confused. Looking at the x86 code, it seems
park_offline_cpus is cleared for AMD platforms. So are you saying the
suspend/resume doesn't work on AMD?
I am also CCing Juergen because he originally added the system_state
check in 2019. Maybe he will remember why CPU_UP_PREPARE was not
changed.
>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> These changes were introduced in V2 inside
> "xen: don't free percpu areas during suspend" patch.
> ---
> xen/common/percpu.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/xen/common/percpu.c b/xen/common/percpu.c
> index e4e8b7bcab..83dca7edd6 100644
> --- a/xen/common/percpu.c
> +++ b/xen/common/percpu.c
> @@ -74,7 +74,8 @@ static int cf_check cpu_percpu_callback(
> switch ( action )
> {
> case CPU_UP_PREPARE:
> - rc = init_percpu_area(cpu);
> + if ( system_state != SYS_STATE_resume )
> + rc = init_percpu_area(cpu);
> break;
>
> case CPU_UP_CANCELED:
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 05/16] xen/percpu: don't initialize percpu on resume
2025-03-11 20:59 ` Julien Grall
@ 2025-03-13 15:54 ` Jan Beulich
2025-03-13 16:05 ` Jürgen Groß
0 siblings, 1 reply; 69+ messages in thread
From: Jan Beulich @ 2025-03-13 15:54 UTC (permalink / raw)
To: Julien Grall, Mykola Kvach
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Roger Pau Monné,
Stefano Stabellini, Mykyta Poturai, Mykola Kvach, Juergen Gross,
xen-devel
On 11.03.2025 21:59, Julien Grall wrote:
> On 05/03/2025 09:11, Mykola Kvach wrote:
>> Invocation of the CPU_UP_PREPARE notification
>> on ARM64 during resume causes a crash:
>>
>> (XEN) [ 315.807606] Error bringing CPU1 up: -16
>> (XEN) [ 315.811926] Xen BUG at common/cpu.c:258
>> [...]
>> (XEN) [ 316.142765] Xen call trace:
>> (XEN) [ 316.146048] [<00000a0000202264>] enable_nonboot_cpus+0x128/0x1ac (PC)
>> (XEN) [ 316.153219] [<00000a000020225c>] enable_nonboot_cpus+0x120/0x1ac (LR)
>> (XEN) [ 316.160391] [<00000a0000278180>] suspend.c#system_suspend+0x4c/0x1a0
>> (XEN) [ 316.167476] [<00000a0000206b70>] domain.c#continue_hypercall_tasklet_handler+0x54/0xd0
>> (XEN) [ 316.176117] [<00000a0000226538>] tasklet.c#do_tasklet_work+0xb8/0x100
>> (XEN) [ 316.183288] [<00000a0000226920>] do_tasklet+0x68/0xb0
>> (XEN) [ 316.189077] [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
>> (XEN) [ 316.195644] [<00000a0000277638>] shutdown.c#halt_this_cpu+0/0x14
>> (XEN) [ 316.202383] [<0000000000000008>] 0000000000000008
>>
>> Freeing per-CPU areas and setting __per_cpu_offset to INVALID_PERCPU_AREA
>> only occur when !park_offline_cpus and system_state is not SYS_STATE_suspend.
>> On ARM64, park_offline_cpus is always false, so setting __per_cpu_offset to
>> INVALID_PERCPU_AREA depends solely on the system state.
>>
>> If the system is suspended, this area is not freed, and during resume, an error
>> occurs in init_percpu_area, causing a crash because INVALID_PERCPU_AREA is not
>> set and park_offline_cpus remains 0:
>>
>> if ( __per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
>> return park_offline_cpus ? 0 : -EBUSY;
>>
>> It appears that the same crash can occur on x86 if park_offline_cpus is set
>> to 0 during Xen suspend.
>
> I am rather confused. Looking at the x86 code, it seems
> park_offline_cpus is cleared for AMD platforms. So are you saying the
> suspend/resume doesn't work on AMD?
Right now I can't see how it would work there. I've asked Marek for clarification
as to their users using S3 only on Intel hardware.
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 05/16] xen/percpu: don't initialize percpu on resume
2025-03-13 15:54 ` Jan Beulich
@ 2025-03-13 16:05 ` Jürgen Groß
2025-03-13 16:20 ` Jan Beulich
0 siblings, 1 reply; 69+ messages in thread
From: Jürgen Groß @ 2025-03-13 16:05 UTC (permalink / raw)
To: Jan Beulich, Julien Grall, Mykola Kvach
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Roger Pau Monné,
Stefano Stabellini, Mykyta Poturai, Mykola Kvach, xen-devel
On 13.03.25 16:54, Jan Beulich wrote:
> On 11.03.2025 21:59, Julien Grall wrote:
>> On 05/03/2025 09:11, Mykola Kvach wrote:
>>> Invocation of the CPU_UP_PREPARE notification
>>> on ARM64 during resume causes a crash:
>>>
>>> (XEN) [ 315.807606] Error bringing CPU1 up: -16
>>> (XEN) [ 315.811926] Xen BUG at common/cpu.c:258
>>> [...]
>>> (XEN) [ 316.142765] Xen call trace:
>>> (XEN) [ 316.146048] [<00000a0000202264>] enable_nonboot_cpus+0x128/0x1ac (PC)
>>> (XEN) [ 316.153219] [<00000a000020225c>] enable_nonboot_cpus+0x120/0x1ac (LR)
>>> (XEN) [ 316.160391] [<00000a0000278180>] suspend.c#system_suspend+0x4c/0x1a0
>>> (XEN) [ 316.167476] [<00000a0000206b70>] domain.c#continue_hypercall_tasklet_handler+0x54/0xd0
>>> (XEN) [ 316.176117] [<00000a0000226538>] tasklet.c#do_tasklet_work+0xb8/0x100
>>> (XEN) [ 316.183288] [<00000a0000226920>] do_tasklet+0x68/0xb0
>>> (XEN) [ 316.189077] [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
>>> (XEN) [ 316.195644] [<00000a0000277638>] shutdown.c#halt_this_cpu+0/0x14
>>> (XEN) [ 316.202383] [<0000000000000008>] 0000000000000008
>>>
>>> Freeing per-CPU areas and setting __per_cpu_offset to INVALID_PERCPU_AREA
>>> only occur when !park_offline_cpus and system_state is not SYS_STATE_suspend.
>>> On ARM64, park_offline_cpus is always false, so setting __per_cpu_offset to
>>> INVALID_PERCPU_AREA depends solely on the system state.
>>>
>>> If the system is suspended, this area is not freed, and during resume, an error
>>> occurs in init_percpu_area, causing a crash because INVALID_PERCPU_AREA is not
>>> set and park_offline_cpus remains 0:
>>>
>>> if ( __per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
>>> return park_offline_cpus ? 0 : -EBUSY;
>>>
>>> It appears that the same crash can occur on x86 if park_offline_cpus is set
>>> to 0 during Xen suspend.
>>
>> I am rather confused. Looking at the x86 code, it seems
>> park_offline_cpus is cleared for AMD platforms. So are you saying the
>> suspend/resume doesn't work on AMD?
>
> Right now I can't see how it would work there. I've asked Marek for clarification
> as to their users using S3 only on Intel hardware.
>
> Jan
Seems as if this issue has been introduced with commit f75780d26b2f
("xen: move per-cpu area management into common code"). Before that
on x86 there was just:
if ( __per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
return 0;
in init_percpu_area().
Juergen
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 05/16] xen/percpu: don't initialize percpu on resume
2025-03-13 16:05 ` Jürgen Groß
@ 2025-03-13 16:20 ` Jan Beulich
2025-03-21 9:48 ` Mykola Kvach
0 siblings, 1 reply; 69+ messages in thread
From: Jan Beulich @ 2025-03-13 16:20 UTC (permalink / raw)
To: Mykola Kvach
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Roger Pau Monné,
Stefano Stabellini, Mykyta Poturai, Mykola Kvach, xen-devel,
Jürgen Groß, Julien Grall
On 13.03.2025 17:05, Jürgen Groß wrote:
> On 13.03.25 16:54, Jan Beulich wrote:
>> On 11.03.2025 21:59, Julien Grall wrote:
>>> On 05/03/2025 09:11, Mykola Kvach wrote:
>>>> Invocation of the CPU_UP_PREPARE notification
>>>> on ARM64 during resume causes a crash:
>>>>
>>>> (XEN) [ 315.807606] Error bringing CPU1 up: -16
>>>> (XEN) [ 315.811926] Xen BUG at common/cpu.c:258
>>>> [...]
>>>> (XEN) [ 316.142765] Xen call trace:
>>>> (XEN) [ 316.146048] [<00000a0000202264>] enable_nonboot_cpus+0x128/0x1ac (PC)
>>>> (XEN) [ 316.153219] [<00000a000020225c>] enable_nonboot_cpus+0x120/0x1ac (LR)
>>>> (XEN) [ 316.160391] [<00000a0000278180>] suspend.c#system_suspend+0x4c/0x1a0
>>>> (XEN) [ 316.167476] [<00000a0000206b70>] domain.c#continue_hypercall_tasklet_handler+0x54/0xd0
>>>> (XEN) [ 316.176117] [<00000a0000226538>] tasklet.c#do_tasklet_work+0xb8/0x100
>>>> (XEN) [ 316.183288] [<00000a0000226920>] do_tasklet+0x68/0xb0
>>>> (XEN) [ 316.189077] [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
>>>> (XEN) [ 316.195644] [<00000a0000277638>] shutdown.c#halt_this_cpu+0/0x14
>>>> (XEN) [ 316.202383] [<0000000000000008>] 0000000000000008
>>>>
>>>> Freeing per-CPU areas and setting __per_cpu_offset to INVALID_PERCPU_AREA
>>>> only occur when !park_offline_cpus and system_state is not SYS_STATE_suspend.
>>>> On ARM64, park_offline_cpus is always false, so setting __per_cpu_offset to
>>>> INVALID_PERCPU_AREA depends solely on the system state.
>>>>
>>>> If the system is suspended, this area is not freed, and during resume, an error
>>>> occurs in init_percpu_area, causing a crash because INVALID_PERCPU_AREA is not
>>>> set and park_offline_cpus remains 0:
>>>>
>>>> if ( __per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
>>>> return park_offline_cpus ? 0 : -EBUSY;
>>>>
>>>> It appears that the same crash can occur on x86 if park_offline_cpus is set
>>>> to 0 during Xen suspend.
>>>
>>> I am rather confused. Looking at the x86 code, it seems
>>> park_offline_cpus is cleared for AMD platforms. So are you saying the
>>> suspend/resume doesn't work on AMD?
>>
>> Right now I can't see how it would work there. I've asked Marek for clarification
>> as to their users using S3 only on Intel hardware.
>
> Seems as if this issue has been introduced with commit f75780d26b2f
> ("xen: move per-cpu area management into common code"). Before that
> on x86 there was just:
>
> if ( __per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
> return 0;
>
> in init_percpu_area().
Ah yes. Mykola, can you then please address this by adjusting init_percpu_area(),
adding a Fixes: tag to reference the commit above?
Looking at the tags of the patch, please also make sure you clarify who's the
original author of the patch. Your S-o-b isn't first, but there's also no From:.
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 05/16] xen/percpu: don't initialize percpu on resume
2025-03-13 16:20 ` Jan Beulich
@ 2025-03-21 9:48 ` Mykola Kvach
2025-03-21 13:54 ` Jan Beulich
0 siblings, 1 reply; 69+ messages in thread
From: Mykola Kvach @ 2025-03-21 9:48 UTC (permalink / raw)
To: Jan Beulich
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Roger Pau Monné,
Stefano Stabellini, Mykyta Poturai, Mykola Kvach, xen-devel,
Jürgen Groß, Julien Grall
Hi,
On Thu, Mar 13, 2025 at 6:20 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 13.03.2025 17:05, Jürgen Groß wrote:
> > On 13.03.25 16:54, Jan Beulich wrote:
> >> On 11.03.2025 21:59, Julien Grall wrote:
> >>> On 05/03/2025 09:11, Mykola Kvach wrote:
> >>>> Invocation of the CPU_UP_PREPARE notification
> >>>> on ARM64 during resume causes a crash:
> >>>>
> >>>> (XEN) [ 315.807606] Error bringing CPU1 up: -16
> >>>> (XEN) [ 315.811926] Xen BUG at common/cpu.c:258
> >>>> [...]
> >>>> (XEN) [ 316.142765] Xen call trace:
> >>>> (XEN) [ 316.146048] [<00000a0000202264>] enable_nonboot_cpus+0x128/0x1ac (PC)
> >>>> (XEN) [ 316.153219] [<00000a000020225c>] enable_nonboot_cpus+0x120/0x1ac (LR)
> >>>> (XEN) [ 316.160391] [<00000a0000278180>] suspend.c#system_suspend+0x4c/0x1a0
> >>>> (XEN) [ 316.167476] [<00000a0000206b70>] domain.c#continue_hypercall_tasklet_handler+0x54/0xd0
> >>>> (XEN) [ 316.176117] [<00000a0000226538>] tasklet.c#do_tasklet_work+0xb8/0x100
> >>>> (XEN) [ 316.183288] [<00000a0000226920>] do_tasklet+0x68/0xb0
> >>>> (XEN) [ 316.189077] [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
> >>>> (XEN) [ 316.195644] [<00000a0000277638>] shutdown.c#halt_this_cpu+0/0x14
> >>>> (XEN) [ 316.202383] [<0000000000000008>] 0000000000000008
> >>>>
> >>>> Freeing per-CPU areas and setting __per_cpu_offset to INVALID_PERCPU_AREA
> >>>> only occur when !park_offline_cpus and system_state is not SYS_STATE_suspend.
> >>>> On ARM64, park_offline_cpus is always false, so setting __per_cpu_offset to
> >>>> INVALID_PERCPU_AREA depends solely on the system state.
> >>>>
> >>>> If the system is suspended, this area is not freed, and during resume, an error
> >>>> occurs in init_percpu_area, causing a crash because INVALID_PERCPU_AREA is not
> >>>> set and park_offline_cpus remains 0:
> >>>>
> >>>> if ( __per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
> >>>> return park_offline_cpus ? 0 : -EBUSY;
> >>>>
> >>>> It appears that the same crash can occur on x86 if park_offline_cpus is set
> >>>> to 0 during Xen suspend.
> >>>
> >>> I am rather confused. Looking at the x86 code, it seems
> >>> park_offline_cpus is cleared for AMD platforms. So are you saying the
> >>> suspend/resume doesn't work on AMD?
> >>
> >> Right now I can't see how it would work there. I've asked Marek for clarification
> >> as to their users using S3 only on Intel hardware.
> >
> > Seems as if this issue has been introduced with commit f75780d26b2f
> > ("xen: move per-cpu area management into common code"). Before that
> > on x86 there was just:
> >
> > if ( __per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
> > return 0;
> >
> > in init_percpu_area().
>
> Ah yes. Mykola, can you then please address this by adjusting init_percpu_area(),
Do I understand correctly that I should move the system_state check
inside init_percpu_area?
> adding a Fixes: tag to reference the commit above?
Sure! Should I send it as a separate patch to speed up its merging?
>
> Looking at the tags of the patch, please also make sure you clarify who's the
> original author of the patch. Your S-o-b isn't first, but there's also no From:.
ok
>
> Jan
Best regards,
Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 05/16] xen/percpu: don't initialize percpu on resume
2025-03-21 9:48 ` Mykola Kvach
@ 2025-03-21 13:54 ` Jan Beulich
0 siblings, 0 replies; 69+ messages in thread
From: Jan Beulich @ 2025-03-21 13:54 UTC (permalink / raw)
To: Mykola Kvach
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Roger Pau Monné,
Stefano Stabellini, Mykyta Poturai, Mykola Kvach, xen-devel,
Jürgen Groß, Julien Grall
On 21.03.2025 10:48, Mykola Kvach wrote:
> Hi,
>
> On Thu, Mar 13, 2025 at 6:20 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 13.03.2025 17:05, Jürgen Groß wrote:
>>> On 13.03.25 16:54, Jan Beulich wrote:
>>>> On 11.03.2025 21:59, Julien Grall wrote:
>>>>> On 05/03/2025 09:11, Mykola Kvach wrote:
>>>>>> Invocation of the CPU_UP_PREPARE notification
>>>>>> on ARM64 during resume causes a crash:
>>>>>>
>>>>>> (XEN) [ 315.807606] Error bringing CPU1 up: -16
>>>>>> (XEN) [ 315.811926] Xen BUG at common/cpu.c:258
>>>>>> [...]
>>>>>> (XEN) [ 316.142765] Xen call trace:
>>>>>> (XEN) [ 316.146048] [<00000a0000202264>] enable_nonboot_cpus+0x128/0x1ac (PC)
>>>>>> (XEN) [ 316.153219] [<00000a000020225c>] enable_nonboot_cpus+0x120/0x1ac (LR)
>>>>>> (XEN) [ 316.160391] [<00000a0000278180>] suspend.c#system_suspend+0x4c/0x1a0
>>>>>> (XEN) [ 316.167476] [<00000a0000206b70>] domain.c#continue_hypercall_tasklet_handler+0x54/0xd0
>>>>>> (XEN) [ 316.176117] [<00000a0000226538>] tasklet.c#do_tasklet_work+0xb8/0x100
>>>>>> (XEN) [ 316.183288] [<00000a0000226920>] do_tasklet+0x68/0xb0
>>>>>> (XEN) [ 316.189077] [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
>>>>>> (XEN) [ 316.195644] [<00000a0000277638>] shutdown.c#halt_this_cpu+0/0x14
>>>>>> (XEN) [ 316.202383] [<0000000000000008>] 0000000000000008
>>>>>>
>>>>>> Freeing per-CPU areas and setting __per_cpu_offset to INVALID_PERCPU_AREA
>>>>>> only occur when !park_offline_cpus and system_state is not SYS_STATE_suspend.
>>>>>> On ARM64, park_offline_cpus is always false, so setting __per_cpu_offset to
>>>>>> INVALID_PERCPU_AREA depends solely on the system state.
>>>>>>
>>>>>> If the system is suspended, this area is not freed, and during resume, an error
>>>>>> occurs in init_percpu_area, causing a crash because INVALID_PERCPU_AREA is not
>>>>>> set and park_offline_cpus remains 0:
>>>>>>
>>>>>> if ( __per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
>>>>>> return park_offline_cpus ? 0 : -EBUSY;
>>>>>>
>>>>>> It appears that the same crash can occur on x86 if park_offline_cpus is set
>>>>>> to 0 during Xen suspend.
>>>>>
>>>>> I am rather confused. Looking at the x86 code, it seems
>>>>> park_offline_cpus is cleared for AMD platforms. So are you saying the
>>>>> suspend/resume doesn't work on AMD?
>>>>
>>>> Right now I can't see how it would work there. I've asked Marek for clarification
>>>> as to their users using S3 only on Intel hardware.
>>>
>>> Seems as if this issue has been introduced with commit f75780d26b2f
>>> ("xen: move per-cpu area management into common code"). Before that
>>> on x86 there was just:
>>>
>>> if ( __per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
>>> return 0;
>>>
>>> in init_percpu_area().
>>
>> Ah yes. Mykola, can you then please address this by adjusting init_percpu_area(),
>
> Do I understand correctly that I should move the system_state check
> inside init_percpu_area?
Well, I can only say as much as: To me this looks like it's the best thing you
can do, given how the code is structured right now.
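To make the suggestion concrete, here is a minimal user-space sketch of what
keeping the park_offline_cpus and system_state checks together inside
init_percpu_area() might look like. It is not the Xen implementation: NR_CPUS,
the per_cpu_offset array, the enum values and the fake allocation are
stand-ins assumed for illustration only, and the real function also allocates
and zeroes the area.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-ins for Xen definitions (assumptions, not the real declarations). */
#define NR_CPUS 4
#define INVALID_PERCPU_AREA (-1L)

enum system_state { SYS_STATE_active, SYS_STATE_suspend, SYS_STATE_resume };

static enum system_state system_state = SYS_STATE_active;
static bool park_offline_cpus = false;   /* always false on ARM64 per the thread */
static long per_cpu_offset[NR_CPUS] = {
    INVALID_PERCPU_AREA, INVALID_PERCPU_AREA,
    INVALID_PERCPU_AREA, INVALID_PERCPU_AREA,
};

static int init_percpu_area(unsigned int cpu)
{
    assert(cpu < NR_CPUS);

    /*
     * An already populated area is only an error when nothing explains why
     * it was kept: neither CPU parking nor a suspend/resume cycle.
     */
    if ( per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
        return (park_offline_cpus || system_state == SYS_STATE_resume)
               ? 0 : -EBUSY;

    /* Pretend allocation; the hypervisor allocates real memory here. */
    per_cpu_offset[cpu] = 0x1000L * (cpu + 1);

    return 0;
}
```

With this shape the CPU_UP_PREPARE case in cpu_percpu_callback() would stay
unconditional, and the -16 (-EBUSY) from the log above would no longer be
returned on the resume path.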
>> adding a Fixes: tag to reference the commit above?
>
> Sure! Should I send it as a separate patch to speed up its merging?
Doing so may indeed help.
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 05/16] xen/percpu: don't initialize percpu on resume
2025-03-05 9:11 ` [PATCH 05/16] xen/percpu: don't initialize percpu on resume Mykola Kvach
2025-03-11 20:59 ` Julien Grall
@ 2025-03-13 15:57 ` Jan Beulich
1 sibling, 0 replies; 69+ messages in thread
From: Jan Beulich @ 2025-03-13 15:57 UTC (permalink / raw)
To: Mykola Kvach
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Mykyta Poturai,
Mykola Kvach, xen-devel
On 05.03.2025 10:11, Mykola Kvach wrote:
> --- a/xen/common/percpu.c
> +++ b/xen/common/percpu.c
> @@ -74,7 +74,8 @@ static int cf_check cpu_percpu_callback(
> switch ( action )
> {
> case CPU_UP_PREPARE:
> - rc = init_percpu_area(cpu);
> + if ( system_state != SYS_STATE_resume )
> + rc = init_percpu_area(cpu);
> break;
>
> case CPU_UP_CANCELED:
Right now I can't see why we wouldn't need such an adjustment also for S3 on
AMD x86 hardware. However, please let's not further split how things are
being checked for. I.e. can we please keep the park_offline_cpus and the
system_state checks together, either here or in init_percpu_area()? Just
like CPU_DEAD etc handling has it.
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* [PATCH 06/16] xen/arm: Introduce system suspend config option
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (4 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 05/16] xen/percpu: don't initialize percpu on resume Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-11 22:29 ` Julien Grall
2025-03-13 15:37 ` Jan Beulich
2025-03-05 9:11 ` [PATCH 07/16] xen/char: implement suspend/resume calls for SCIF driver Mykola Kvach
` (11 subsequent siblings)
17 siblings, 2 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Mykyta Poturai
From: Mykola Kvach <mykola_kvach@epam.com>
This option enables the system suspend support. This is the
mechanism that allows the system to be suspended to RAM and
later resumed.
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
xen/arch/arm/Kconfig | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index a26d3e1182..5834af16ab 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -475,6 +475,17 @@ config ARM64_HARDEN_BRANCH_PREDICTOR
config ARM32_HARDEN_BRANCH_PREDICTOR
def_bool y if ARM_32 && HARDEN_BRANCH_PREDICTOR
+config SYSTEM_SUSPEND
+ bool "System suspend support"
+ default y
+ depends on ARM_64
+ help
+ This option enables the system suspend support. This is the
+ mechanism that allows the system to be suspended to RAM and
+ later resumed.
+
+ If unsure, say Y.
+
source "arch/arm/platforms/Kconfig"
source "common/Kconfig"
--
2.43.0
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [PATCH 06/16] xen/arm: Introduce system suspend config option
2025-03-05 9:11 ` [PATCH 06/16] xen/arm: Introduce system suspend config option Mykola Kvach
@ 2025-03-11 22:29 ` Julien Grall
2025-03-21 9:48 ` Mykola Kvach
2025-03-13 15:37 ` Jan Beulich
1 sibling, 1 reply; 69+ messages in thread
From: Julien Grall @ 2025-03-11 22:29 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Mykyta Poturai
Hi,
On 05/03/2025 09:11, Mykola Kvach wrote:
> From: Mykola Kvach <mykola_kvach@epam.com>
>
> This option enables the system suspend support. This is the
> mechanism that allows the system to be suspended to RAM and
> later resumed.
>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> xen/arch/arm/Kconfig | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index a26d3e1182..5834af16ab 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -475,6 +475,17 @@ config ARM64_HARDEN_BRANCH_PREDICTOR
> config ARM32_HARDEN_BRANCH_PREDICTOR
> def_bool y if ARM_32 && HARDEN_BRANCH_PREDICTOR
>
> +config SYSTEM_SUSPEND
> + bool "System suspend support"
> + default y
The default should likely be no until everything is working.
> + depends on ARM_64
I think this also needs to depend on !LLC_COLORING (unless you
confirmed cache coloring is working) and UNSUPPORTED.
> + help
> + This option enables the system suspend support. This is the
> + mechanism that allows the system to be suspended to RAM and
> + later resumed.
You seem to also tie guest suspend/resume to this option. Is that intended?
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 06/16] xen/arm: Introduce system suspend config option
2025-03-11 22:29 ` Julien Grall
@ 2025-03-21 9:48 ` Mykola Kvach
2025-03-21 14:58 ` Grygorii Strashko
0 siblings, 1 reply; 69+ messages in thread
From: Mykola Kvach @ 2025-03-21 9:48 UTC (permalink / raw)
To: Julien Grall
Cc: xen-devel, Mykola Kvach, Stefano Stabellini, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Mykyta Poturai
Hi,
On Wed, Mar 12, 2025 at 12:29 AM Julien Grall <julien@xen.org> wrote:
>
> Hi,
>
> On 05/03/2025 09:11, Mykola Kvach wrote:
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > This option enables the system suspend support. This is the
> > mechanism that allows the system to be suspended to RAM and
> > later resumed.
> >
> > Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > xen/arch/arm/Kconfig | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > index a26d3e1182..5834af16ab 100644
> > --- a/xen/arch/arm/Kconfig
> > +++ b/xen/arch/arm/Kconfig
> > @@ -475,6 +475,17 @@ config ARM64_HARDEN_BRANCH_PREDICTOR
> > config ARM32_HARDEN_BRANCH_PREDICTOR
> > def_bool y if ARM_32 && HARDEN_BRANCH_PREDICTOR
> >
> > +config SYSTEM_SUSPEND
> > + bool "System suspend support"
> > + default y
>
> The default should likely be no until everything is working.
got it!
>
> > + depends on ARM_64
>
> I think this also needs to depend on !LLC_COLORING (unless you
> confirmed cache coloring is working) and UNSUPPORTED.
Sure! I'll add the dependency.
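A rough sketch of how the entry might look with the review comments folded in
(default off, extra dependencies). The exact form of the dependency line and
the help wording are assumptions, not the final patch:

```kconfig
config SYSTEM_SUSPEND
	bool "System suspend support"
	default n
	depends on ARM_64 && !LLC_COLORING && UNSUPPORTED
	help
	  This option enables the system suspend support. This is the
	  mechanism that allows the system to be suspended to RAM and
	  later resumed.

	  If unsure, say N.
```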
>
> > + help
> > + This option enables the system suspend support. This is the
> > + mechanism that allows the system to be suspended to RAM and
> > + later resumed.
>
> You seem to also tie guest suspend/resume to this option. Is that intended?
From the guest's perspective, it is a system suspend. However, it looks like the
description should be enhanced. Thank you for pointing that out.
>
> Cheers,
>
> --
> Julien Grall
>
Best regards,
Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 06/16] xen/arm: Introduce system suspend config option
2025-03-21 9:48 ` Mykola Kvach
@ 2025-03-21 14:58 ` Grygorii Strashko
2025-03-24 10:18 ` Mykola Kvach
0 siblings, 1 reply; 69+ messages in thread
From: Grygorii Strashko @ 2025-03-21 14:58 UTC (permalink / raw)
To: Mykola Kvach, Julien Grall
Cc: xen-devel, Mykola Kvach, Stefano Stabellini, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Mykyta Poturai
On 21.03.25 11:48, Mykola Kvach wrote:
> Hi,
>
> On Wed, Mar 12, 2025 at 12:29 AM Julien Grall <julien@xen.org> wrote:
>>
>> Hi,
>>
>> On 05/03/2025 09:11, Mykola Kvach wrote:
>>> From: Mykola Kvach <mykola_kvach@epam.com>
>>>
>>> This option enables the system suspend support. This is the
>>> mechanism that allows the system to be suspended to RAM and
>>> later resumed.
>>>
>>> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
>>> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
>>> ---
>>> xen/arch/arm/Kconfig | 11 +++++++++++
>>> 1 file changed, 11 insertions(+)
>>>
>>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>>> index a26d3e1182..5834af16ab 100644
>>> --- a/xen/arch/arm/Kconfig
>>> +++ b/xen/arch/arm/Kconfig
>>> @@ -475,6 +475,17 @@ config ARM64_HARDEN_BRANCH_PREDICTOR
>>> config ARM32_HARDEN_BRANCH_PREDICTOR
>>> def_bool y if ARM_32 && HARDEN_BRANCH_PREDICTOR
>>>
>>> +config SYSTEM_SUSPEND
>>> + bool "System suspend support"
>>> + default y
>>
>> The default should likely be no until everything is working.
>
> got it!
>
>>
>>> + depends on ARM_64
>>
>> I think this also needs to depend on !LLC_COLORING (unless you
>> confirmed cache coloring is working) and UNSUPPORTED.
>
> Sure! I'll add the dependency.
>
>>
>>> + help
>>> + This option enables the system suspend support. This is the
>>> + mechanism that allows the system to be suspended to RAM and
>>> + later resumed.
>>
>> You seem to also tie guest suspend/resume to this option. Is it intended?
>
> From the guest's perspective, it is a system suspend. However, it looks like the
> description should be enhanced. Thank you for pointing that out.
s2r = "suspend to RAM"
You definitely need to consider and clarify ARM64 guest system s2r versus
Xen system s2r. The first can be supported without the second, while Xen system s2r
depends on guest system s2r support and requires guests to be properly suspended
before Xen is allowed to enter system s2r.
You can't call freeze_domains() and blindly pause some domain, because if it's not
suspended and has passed-through HW which is in the middle of a transaction -> DEADBEEF.
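
To illustrate the ordering requirement, here is a minimal standalone sketch of
such a pre-check. It is not Xen code: struct dom, enum dom_state, the
is_control flag and both function names are hypothetical stand-ins, and the
-1 return value only stands in for something like -EBUSY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not Xen's struct domain. */
enum dom_state { DOM_RUNNING, DOM_SUSPENDED };

struct dom {
    unsigned int id;
    enum dom_state state;
    bool is_control;    /* the domain allowed to request system s2r */
};

/*
 * Return the first domain that blocks system s2r, or NULL if none does.
 * The requesting (control) domain is exempt, since it issues the request.
 */
static const struct dom *first_blocking_domain(const struct dom *doms,
                                               size_t nr)
{
    for ( size_t i = 0; i < nr; i++ )
    {
        if ( doms[i].is_control )
            continue;
        if ( doms[i].state != DOM_SUSPENDED )
            return &doms[i];
    }

    return NULL;
}

/* Return -1 while any guest is still running, 0 once all are suspended. */
static int system_suspend_allowed(const struct dom *doms, size_t nr)
{
    assert(doms != NULL || nr == 0);

    return first_blocking_domain(doms, nr) ? -1 : 0;
}
```

With a check of this kind the hypervisor would refuse system s2r instead of
pausing a guest whose passed-through device may still be mid-transaction.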
--
Best regards,
-grygorii
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 06/16] xen/arm: Introduce system suspend config option
2025-03-21 14:58 ` Grygorii Strashko
@ 2025-03-24 10:18 ` Mykola Kvach
0 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-24 10:18 UTC (permalink / raw)
To: Grygorii Strashko
Cc: Julien Grall, xen-devel, Mykola Kvach, Stefano Stabellini,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Mykyta Poturai
Hi,
On Fri, Mar 21, 2025 at 4:58 PM Grygorii Strashko
<grygorii_strashko@epam.com> wrote:
>
>
>
> On 21.03.25 11:48, Mykola Kvach wrote:
> > Hi,
> >
> > On Wed, Mar 12, 2025 at 12:29 AM Julien Grall <julien@xen.org> wrote:
> >>
> >> Hi,
> >>
> >> On 05/03/2025 09:11, Mykola Kvach wrote:
> >>> From: Mykola Kvach <mykola_kvach@epam.com>
> >>>
> >>> This option enables the system suspend support. This is the
> >>> mechanism that allows the system to be suspended to RAM and
> >>> later resumed.
> >>>
> >>> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> >>> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> >>> ---
> >>> xen/arch/arm/Kconfig | 11 +++++++++++
> >>> 1 file changed, 11 insertions(+)
> >>>
> >>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> >>> index a26d3e1182..5834af16ab 100644
> >>> --- a/xen/arch/arm/Kconfig
> >>> +++ b/xen/arch/arm/Kconfig
> >>> @@ -475,6 +475,17 @@ config ARM64_HARDEN_BRANCH_PREDICTOR
> >>> config ARM32_HARDEN_BRANCH_PREDICTOR
> >>> def_bool y if ARM_32 && HARDEN_BRANCH_PREDICTOR
> >>>
> >>> +config SYSTEM_SUSPEND
> >>> + bool "System suspend support"
> >>> + default y
> >>
> >> The default should likely be no until everything is working.
> >
> > got it!
> >
> >>
> >>> + depends on ARM_64
> >>
> >> I think this also needs to depend on !LLC_COLORING (unless you
> >> confirmed cache coloring is working) and UNSUPPORTED.
> >
> > Sure! I'll add the dependency.
> >
> >>
> >>> + help
> >>> + This option enables the system suspend support. This is the
> >>> + mechanism that allows the system to be suspended to RAM and
> >>> + later resumed.
> >>
> >> You seem to also tie guest suspend/resume to this option. Is it intended?
> >
> > From the guest's perspective, it is a system suspend. However, it looks like the
> > description should be enhanced. Thank you for pointing that out.
>
> s2r = "suspend to RAM"
>
> You definitely need to consider and clarify ARM64 guest system s2r versus
> Xen system s2r. The first can be supported without the second, while Xen system s2r
> depends on guest system s2r support and requires guests to be properly suspended
> before Xen is allowed to enter system s2r.
>
This is exactly what...
> You can't call freeze_domains() and blindly pause some domain, because if it's not
> suspended and has passed-through HW which is in the middle of a transaction -> DEADBEEF.
... should happen. x86 works in the same way: we call domain_pause when
performing system suspend. All domains are paused except the one that
is eligible to request system suspend.
>
> --
> Best regards,
> -grygorii
Best regards,
Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 06/16] xen/arm: Introduce system suspend config option
2025-03-05 9:11 ` [PATCH 06/16] xen/arm: Introduce system suspend config option Mykola Kvach
2025-03-11 22:29 ` Julien Grall
@ 2025-03-13 15:37 ` Jan Beulich
2025-03-19 17:19 ` Grygorii Strashko
2025-03-21 9:49 ` Mykola Kvach
1 sibling, 2 replies; 69+ messages in thread
From: Jan Beulich @ 2025-03-13 15:37 UTC (permalink / raw)
To: Mykola Kvach
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Mykyta Poturai, xen-devel
On 05.03.2025 10:11, Mykola Kvach wrote:
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -475,6 +475,17 @@ config ARM64_HARDEN_BRANCH_PREDICTOR
> config ARM32_HARDEN_BRANCH_PREDICTOR
> def_bool y if ARM_32 && HARDEN_BRANCH_PREDICTOR
>
> +config SYSTEM_SUSPEND
> + bool "System suspend support"
> + default y
> + depends on ARM_64
> + help
> + This option enables the system suspend support. This is the
> + mechanism that allows the system to be suspended to RAM and
> + later resumed.
> +
> + If unsure, say Y.
I wonder if something like this makes sense to place in an arch-specific
Kconfig. It's also not becoming clear here why only Arm64 would permit it.
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 06/16] xen/arm: Introduce system suspend config option
2025-03-13 15:37 ` Jan Beulich
@ 2025-03-19 17:19 ` Grygorii Strashko
2025-03-21 9:49 ` Mykola Kvach
1 sibling, 0 replies; 69+ messages in thread
From: Grygorii Strashko @ 2025-03-19 17:19 UTC (permalink / raw)
To: Jan Beulich, Mykola Kvach
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Mykyta Poturai, xen-devel
On 13.03.25 17:37, Jan Beulich wrote:
> On 05.03.2025 10:11, Mykola Kvach wrote:
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -475,6 +475,17 @@ config ARM64_HARDEN_BRANCH_PREDICTOR
>> config ARM32_HARDEN_BRANCH_PREDICTOR
>> def_bool y if ARM_32 && HARDEN_BRANCH_PREDICTOR
>>
>> +config SYSTEM_SUSPEND
>> + bool "System suspend support"
>> + default y
>> + depends on ARM_64
>> + help
>> + This option enables the system suspend support. This is the
>> + mechanism that allows the system to be suspended to RAM and
>> + later resumed.
>> +
>> + If unsure, say Y.
>
> I wonder if something like this makes sense to place in an arch-specific
> Kconfig. It's also not becoming clear here why only Arm64 would permit it.
Taking into account that
- system suspend-to-RAM support on x86 is enabled by default and is going to be supported on Arm as well;
- follow-up patches add #ifdef CONFIG_SYSTEM_SUSPEND not only in Arm-specific code;
I think it deserves to be a generic option (in some way), enabled by default on x86.
Also, arches could declare that suspend is possible, e.g. via ARCH_SUSPEND_POSSIBLE, so that
generic configs would not need to be fixed every time system suspend-to-RAM is enabled
for a new arch.
Personally, I'd introduce a separate arch/Kconfig.power (or common Kconfig.power) file
for PM options (at least there is also cpufreq/cpuidle, and there could be other PM-specific things).
Best regards,
-grygorii
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 06/16] xen/arm: Introduce system suspend config option
2025-03-13 15:37 ` Jan Beulich
2025-03-19 17:19 ` Grygorii Strashko
@ 2025-03-21 9:49 ` Mykola Kvach
2025-03-21 14:00 ` Jan Beulich
1 sibling, 1 reply; 69+ messages in thread
From: Mykola Kvach @ 2025-03-21 9:49 UTC (permalink / raw)
To: Jan Beulich
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Mykyta Poturai, xen-devel
Hi,
On Thu, Mar 13, 2025 at 5:37 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 05.03.2025 10:11, Mykola Kvach wrote:
> > --- a/xen/arch/arm/Kconfig
> > +++ b/xen/arch/arm/Kconfig
> > @@ -475,6 +475,17 @@ config ARM64_HARDEN_BRANCH_PREDICTOR
> > config ARM32_HARDEN_BRANCH_PREDICTOR
> > def_bool y if ARM_32 && HARDEN_BRANCH_PREDICTOR
> >
> > +config SYSTEM_SUSPEND
> > + bool "System suspend support"
> > + default y
> > + depends on ARM_64
> > + help
> > + This option enables the system suspend support. This is the
> > + mechanism that allows the system to be suspended to RAM and
> > + later resumed.
> > +
> > + If unsure, say Y.
>
> I wonder if something like this makes sense to place in an arch-specific
Maybe it makes sense, but only if we are not planning to cover
suspend/resume related code for x86 as well.
> Kconfig. It's also not becoming clear here why only Arm64 would permit it.
If I understand your comment correctly, you’re suggesting that we
could use this for x86 as well. However, in that case, we would need
to make a lot of changes in other places that are not related to this
patch series, which is specifically focused on adding suspend/resume
support for Arm64. I believe that is outside the scope of this patch
series. However, this config was requested in one of the previous
patch series. The primary reason for adding this config was to reduce
the binary size for platforms where it isn’t used. I also think it can
be useful for debugging purposes, such as for identifying regressions.
As for Arm32, it’s not supported at the moment, but I hope support
will be added in the future.
>
> Jan
Best regards,
Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 06/16] xen/arm: Introduce system suspend config option
2025-03-21 9:49 ` Mykola Kvach
@ 2025-03-21 14:00 ` Jan Beulich
0 siblings, 0 replies; 69+ messages in thread
From: Jan Beulich @ 2025-03-21 14:00 UTC (permalink / raw)
To: Mykola Kvach
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Mykyta Poturai, xen-devel
On 21.03.2025 10:49, Mykola Kvach wrote:
> Hi,
>
> On Thu, Mar 13, 2025 at 5:37 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 05.03.2025 10:11, Mykola Kvach wrote:
>>> --- a/xen/arch/arm/Kconfig
>>> +++ b/xen/arch/arm/Kconfig
>>> @@ -475,6 +475,17 @@ config ARM64_HARDEN_BRANCH_PREDICTOR
>>> config ARM32_HARDEN_BRANCH_PREDICTOR
>>> def_bool y if ARM_32 && HARDEN_BRANCH_PREDICTOR
>>>
>>> +config SYSTEM_SUSPEND
>>> + bool "System suspend support"
>>> + default y
>>> + depends on ARM_64
>>> + help
>>> + This option enables the system suspend support. This is the
>>> + mechanism that allows the system to be suspended to RAM and
>>> + later resumed.
>>> +
>>> + If unsure, say Y.
>>
>> I wonder if something like this makes sense to place in an arch-specific
>
> Maybe it makes sense, but only if we are not planning to cover
> suspend/resume related code for x86 as well.
>
>> Kconfig. It's also not becoming clear here why only Arm64 would permit it.
>
> If I understand your comment correctly, you’re suggesting that we
> could use this for x86 as well.
Or PPC / RISC-V once they progress enough.
> However, in that case, we would need
> to make a lot of changes in other places that are not related to this
> patch series, which is specifically focused on adding suspend/resume
> support for Arm64. I believe that is outside the scope of this patch
> series.
Considering that - give or take bugs - S3 is working on x86, I'm not
sure what lots of changes you're thinking of. In fact ...
> However, this config was requested in one of the previous
> patch series. The primary reason for adding this config was to reduce
> the binary size for platforms where it isn’t used. I also think it can
> be useful for debugging purposes, such as for identifying regressions.
... that's what I'd see as a (future) option on x86 as well.
> As for Arm32, it’s not supported at the moment, but I hope support
> will be added in the future.
Which is another data point towards this wanting to move to common
code, with a per-arch-selected HAVE_* as dependency. To cover that it's
always-on for x86, an ..._ALWAYS_ON setting may want introducing as
well (or some shorthand approach to limit [future] churn).
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* [PATCH 07/16] xen/char: implement suspend/resume calls for SCIF driver
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (5 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 06/16] xen/arm: Introduce system suspend config option Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-19 17:21 ` Grygorii Strashko
2025-03-05 9:11 ` [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers Mykola Kvach
` (10 subsequent siblings)
17 siblings, 1 reply; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Volodymyr Babchuk,
Oleksandr Andrushchenko
From: Mykola Kvach <mykola_kvach@epam.com>
The changes have been tested only on the Renesas R-Car-H3 Starter Kit board.
Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
xen/drivers/char/scif-uart.c | 31 ++++++++++++++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)
diff --git a/xen/drivers/char/scif-uart.c b/xen/drivers/char/scif-uart.c
index 757793ca45..e166fb0f36 100644
--- a/xen/drivers/char/scif-uart.c
+++ b/xen/drivers/char/scif-uart.c
@@ -139,7 +139,7 @@ static void scif_uart_interrupt(int irq, void *data)
}
}
-static void __init scif_uart_init_preirq(struct serial_port *port)
+static void scif_uart_init_preirq(struct serial_port *port)
{
struct scif_uart *uart = port->uart;
const struct port_params *params = uart->params;
@@ -271,6 +271,33 @@ static void scif_uart_stop_tx(struct serial_port *port)
scif_writew(uart, SCIF_SCSCR, scif_readw(uart, SCIF_SCSCR) & ~SCSCR_TIE);
}
+static void scif_uart_suspend(struct serial_port *port)
+{
+ struct scif_uart *uart = port->uart;
+
+ scif_uart_stop_tx(port);
+
+ /* Wait until last bit has been transmitted. */
+ while ( !(scif_readw(uart, SCIF_SCFSR) & SCFSR_TEND) );
+
+ /* Disable TX/RX parts and all interrupts */
+ scif_writew(uart, SCIF_SCSCR, 0);
+
+ /* Reset TX/RX FIFOs */
+ scif_writew(uart, SCIF_SCFCR, SCFCR_RFRST | SCFCR_TFRST);
+}
+
+static void scif_uart_resume(struct serial_port *port)
+{
+ struct scif_uart *uart = port->uart;
+
+ scif_uart_init_preirq(port);
+
+ /* Enable TX/RX and Error Interrupts */
+ scif_writew(uart, SCIF_SCSCR, scif_readw(uart, SCIF_SCSCR) |
+ SCSCR_TIE | SCSCR_RIE | SCSCR_REIE);
+}
+
static struct uart_driver __read_mostly scif_uart_driver = {
.init_preirq = scif_uart_init_preirq,
.init_postirq = scif_uart_init_postirq,
@@ -281,6 +308,8 @@ static struct uart_driver __read_mostly scif_uart_driver = {
.start_tx = scif_uart_start_tx,
.stop_tx = scif_uart_stop_tx,
.vuart_info = scif_vuart_info,
+ .suspend = scif_uart_suspend,
+ .resume = scif_uart_resume,
};
static const struct dt_device_match scif_uart_dt_match[] __initconst =
--
2.43.0
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [PATCH 07/16] xen/char: implement suspend/resume calls for SCIF driver
2025-03-05 9:11 ` [PATCH 07/16] xen/char: implement suspend/resume calls for SCIF driver Mykola Kvach
@ 2025-03-19 17:21 ` Grygorii Strashko
2025-03-21 9:49 ` Mykola Kvach
0 siblings, 1 reply; 69+ messages in thread
From: Grygorii Strashko @ 2025-03-19 17:21 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Mykola Kvach, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Oleksandr Andrushchenko
On 05.03.25 11:11, Mykola Kvach wrote:
> From: Mykola Kvach <mykola_kvach@epam.com>
>
> The changes have been tested only on the Renesas R-Car-H3 Starter Kit board.
>
> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> xen/drivers/char/scif-uart.c | 31 ++++++++++++++++++++++++++++++-
> 1 file changed, 30 insertions(+), 1 deletion(-)
>
> diff --git a/xen/drivers/char/scif-uart.c b/xen/drivers/char/scif-uart.c
> index 757793ca45..e166fb0f36 100644
> --- a/xen/drivers/char/scif-uart.c
> +++ b/xen/drivers/char/scif-uart.c
> @@ -139,7 +139,7 @@ static void scif_uart_interrupt(int irq, void *data)
> }
> }
>
> -static void __init scif_uart_init_preirq(struct serial_port *port)
> +static void scif_uart_init_preirq(struct serial_port *port)
> {
> struct scif_uart *uart = port->uart;
> const struct port_params *params = uart->params;
> @@ -271,6 +271,33 @@ static void scif_uart_stop_tx(struct serial_port *port)
> scif_writew(uart, SCIF_SCSCR, scif_readw(uart, SCIF_SCSCR) & ~SCSCR_TIE);
> }
>
I assume you want #ifdef CONFIG_SYSTEM_SUSPEND here as well.
> +static void scif_uart_suspend(struct serial_port *port)
> +{
> + struct scif_uart *uart = port->uart;
> +
> + scif_uart_stop_tx(port);
> +
> + /* Wait until last bit has been transmitted. */
> + while ( !(scif_readw(uart, SCIF_SCFSR) & SCFSR_TEND) );
> +
> + /* Disable TX/RX parts and all interrupts */
> + scif_writew(uart, SCIF_SCSCR, 0);
> +
> + /* Reset TX/RX FIFOs */
> + scif_writew(uart, SCIF_SCFCR, SCFCR_RFRST | SCFCR_TFRST);
> +}
> +
> +static void scif_uart_resume(struct serial_port *port)
> +{
> + struct scif_uart *uart = port->uart;
> +
> + scif_uart_init_preirq(port);
> +
> + /* Enable TX/RX and Error Interrupts */
> + scif_writew(uart, SCIF_SCSCR, scif_readw(uart, SCIF_SCSCR) |
> + SCSCR_TIE | SCSCR_RIE | SCSCR_REIE);
> +}
> +
> static struct uart_driver __read_mostly scif_uart_driver = {
> .init_preirq = scif_uart_init_preirq,
> .init_postirq = scif_uart_init_postirq,
> @@ -281,6 +308,8 @@ static struct uart_driver __read_mostly scif_uart_driver = {
> .start_tx = scif_uart_start_tx,
> .stop_tx = scif_uart_stop_tx,
> .vuart_info = scif_vuart_info,
> + .suspend = scif_uart_suspend,
> + .resume = scif_uart_resume,
> };
>
> static const struct dt_device_match scif_uart_dt_match[] __initconst =
--
Best regards,
-grygorii
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 07/16] xen/char: implement suspend/resume calls for SCIF driver
2025-03-19 17:21 ` Grygorii Strashko
@ 2025-03-21 9:49 ` Mykola Kvach
0 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-21 9:49 UTC (permalink / raw)
To: Grygorii Strashko
Cc: xen-devel, Mykola Kvach, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
Oleksandr Andrushchenko
Hi,
On Wed, Mar 19, 2025 at 7:21 PM Grygorii Strashko
<grygorii_strashko@epam.com> wrote:
>
>
>
> On 05.03.25 11:11, Mykola Kvach wrote:
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > The changes have been tested only on the Renesas R-Car-H3 Starter Kit board.
> >
> > Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> > Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > xen/drivers/char/scif-uart.c | 31 ++++++++++++++++++++++++++++++-
> > 1 file changed, 30 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/drivers/char/scif-uart.c b/xen/drivers/char/scif-uart.c
> > index 757793ca45..e166fb0f36 100644
> > --- a/xen/drivers/char/scif-uart.c
> > +++ b/xen/drivers/char/scif-uart.c
> > @@ -139,7 +139,7 @@ static void scif_uart_interrupt(int irq, void *data)
> > }
> > }
> >
> > -static void __init scif_uart_init_preirq(struct serial_port *port)
> > +static void scif_uart_init_preirq(struct serial_port *port)
> > {
> > struct scif_uart *uart = port->uart;
> > const struct port_params *params = uart->params;
> > @@ -271,6 +271,33 @@ static void scif_uart_stop_tx(struct serial_port *port)
> > scif_writew(uart, SCIF_SCSCR, scif_readw(uart, SCIF_SCSCR) & ~SCSCR_TIE);
> > }
> >
>
> I assume you want #ifdef CONFIG_SYSTEM_SUSPEND here as well.
I was thinking about it and decided that, since the suspend/resume
fields of uart_driver are not guarded by #ifdef,
I'd leave the functions unguarded as well.
I'll add the guard in the next version of the series.
The suspend/resume fields probably need to be guarded too, but that
will require additional changes in other UART drivers.
Thank you for pointing that out.
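
For illustration, guarding both the fields and the callbacks could look
roughly like the standalone sketch below. It is not Xen code: struct uart_ops,
struct port, the demo_* callbacks and the port_suspend()/port_resume()
wrappers are hypothetical names, and CONFIG_SYSTEM_SUSPEND is defined by hand
only so that the sketch builds on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Defined by hand for the sketch; in a real tree Kconfig would provide it. */
#define CONFIG_SYSTEM_SUSPEND 1

struct port;

/* Hypothetical, cut-down ops table; not Xen's struct uart_driver. */
struct uart_ops {
    void (*putc)(struct port *p, char c);
#ifdef CONFIG_SYSTEM_SUSPEND
    void (*suspend)(struct port *p);
    void (*resume)(struct port *p);
#endif
};

struct port {
    const struct uart_ops *ops;
    int enabled;
};

static void demo_putc(struct port *p, char c)
{
    (void)p;
    (void)c;
}

#ifdef CONFIG_SYSTEM_SUSPEND
static void demo_suspend(struct port *p) { p->enabled = 0; }
static void demo_resume(struct port *p)  { p->enabled = 1; }
#endif

static const struct uart_ops demo_ops = {
    .putc = demo_putc,
#ifdef CONFIG_SYSTEM_SUSPEND
    .suspend = demo_suspend,
    .resume = demo_resume,
#endif
};

/* Callers stay #ifdef-free: the wrappers do nothing when the option is off. */
static void port_suspend(struct port *p)
{
    assert(p != NULL && p->ops != NULL);
#ifdef CONFIG_SYSTEM_SUSPEND
    if ( p->ops->suspend )
        p->ops->suspend(p);
#endif
}

static void port_resume(struct port *p)
{
    assert(p != NULL && p->ops != NULL);
#ifdef CONFIG_SYSTEM_SUSPEND
    if ( p->ops->resume )
        p->ops->resume(p);
#endif
}
```

The point of the wrappers is that drivers which never implement the hooks,
and builds with the option disabled, need no changes at the call sites.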
>
>
> > +static void scif_uart_suspend(struct serial_port *port)
> > +{
> > + struct scif_uart *uart = port->uart;
> > +
> > + scif_uart_stop_tx(port);
> > +
> > + /* Wait until last bit has been transmitted. */
> > + while ( !(scif_readw(uart, SCIF_SCFSR) & SCFSR_TEND) );
> > +
> > + /* Disable TX/RX parts and all interrupts */
> > + scif_writew(uart, SCIF_SCSCR, 0);
> > +
> > + /* Reset TX/RX FIFOs */
> > + scif_writew(uart, SCIF_SCFCR, SCFCR_RFRST | SCFCR_TFRST);
> > +}
> > +
> > +static void scif_uart_resume(struct serial_port *port)
> > +{
> > + struct scif_uart *uart = port->uart;
> > +
> > + scif_uart_init_preirq(port);
> > +
> > + /* Enable TX/RX and Error Interrupts */
> > + scif_writew(uart, SCIF_SCSCR, scif_readw(uart, SCIF_SCSCR) |
> > + SCSCR_TIE | SCSCR_RIE | SCSCR_REIE);
> > +}
> > +
> > static struct uart_driver __read_mostly scif_uart_driver = {
> > .init_preirq = scif_uart_init_preirq,
> > .init_postirq = scif_uart_init_postirq,
> > @@ -281,6 +308,8 @@ static struct uart_driver __read_mostly scif_uart_driver = {
> > .start_tx = scif_uart_start_tx,
> > .stop_tx = scif_uart_stop_tx,
> > .vuart_info = scif_vuart_info,
> > + .suspend = scif_uart_suspend,
> > + .resume = scif_uart_resume,
> > };
> >
> > static const struct dt_device_match scif_uart_dt_match[] __initconst =
>
> --
> Best regards,
> -grygorii
Best regards,
Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (6 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 07/16] xen/char: implement suspend/resume calls for SCIF driver Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-11 21:04 ` Julien Grall
` (2 more replies)
2025-03-05 9:11 ` [PATCH 09/16] xen/arm: add suspend and resume timer helpers Mykola Kvach
` (9 subsequent siblings)
17 siblings, 3 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Mykola Kvach, Dario Faggioli, Juergen Gross, George Dunlap,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Mirela Simonovic, Saeed Nowshadi, Mykyta Poturai
From: Mykola Kvach <mykola_kvach@epam.com>
This patch implements suspend/resume helpers for the watchdog.
While a domain is suspended, its watchdogs must be paused. Otherwise,
if the domain stays suspended for longer than the watchdog period,
the domain would be shut down on resume.
The proper solution to this problem is to stop (suspend) the watchdog timers
after the domain suspends and to restart (resume) the watchdog timers
before the domain resumes. The suspend/resume of watchdog timers is done
in Xen and is invisible to the guests.
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in v3:
- cover the code with CONFIG_SYSTEM_SUSPEND
Changes in v2:
- drop suspended field from timer structure
- drop the call of watchdog_domain_resume from ctxt_switch_to
---
xen/common/sched/core.c | 39 +++++++++++++++++++++++++++++++++++++++
xen/include/xen/sched.h | 9 +++++++++
2 files changed, 48 insertions(+)
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index b1c6b6b9fa..6c2231826a 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -1605,6 +1605,45 @@ void watchdog_domain_destroy(struct domain *d)
kill_timer(&d->watchdog_timer[i].timer);
}
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+void watchdog_domain_suspend(struct domain *d)
+{
+ unsigned int i;
+
+ spin_lock(&d->watchdog_lock);
+
+ for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
+ {
+ if ( test_bit(i, &d->watchdog_inuse_map) )
+ {
+ stop_timer(&d->watchdog_timer[i].timer);
+ }
+ }
+
+ spin_unlock(&d->watchdog_lock);
+}
+
+void watchdog_domain_resume(struct domain *d)
+{
+ unsigned int i;
+
+ spin_lock(&d->watchdog_lock);
+
+ for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
+ {
+ if ( test_bit(i, &d->watchdog_inuse_map) )
+ {
+ set_timer(&d->watchdog_timer[i].timer,
+ NOW() + SECONDS(d->watchdog_timer[i].timeout));
+ }
+ }
+
+ spin_unlock(&d->watchdog_lock);
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
/*
* Pin a vcpu temporarily to a specific CPU (or restore old pinning state if
* cpu is NR_CPUS).
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index d0d10612ce..caab4aad93 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -1109,6 +1109,15 @@ void scheduler_disable(void);
void watchdog_domain_init(struct domain *d);
void watchdog_domain_destroy(struct domain *d);
+#ifdef CONFIG_SYSTEM_SUSPEND
+/*
+ * Suspend/resume watchdogs of domain (while the domain is suspended its
+ * watchdogs should be on pause)
+ */
+void watchdog_domain_suspend(struct domain *d);
+void watchdog_domain_resume(struct domain *d);
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
/*
* Use this check when the following are both true:
* - Using this feature or interface requires full access to the hardware
--
2.43.0
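
A note on the behaviour of the helpers above: on resume each in-use watchdog
is re-armed with a full timeout counted from the current time, not with the
time that was left when the domain suspended. The standalone model below
sketches that behaviour. It is not the Xen implementation: struct wdt, NR_WDT,
the wdt_* functions and the explicit now_s clock parameter are hypothetical
stand-ins for the timer infrastructure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_WDT 2    /* hypothetical; stands in for NR_DOMAIN_WATCHDOG_TIMERS */

struct wdt {
    bool in_use;
    bool armed;
    uint32_t timeout_s;    /* period requested by the guest, in seconds */
    uint64_t expires_s;    /* absolute deadline, in seconds */
};

/* Disarm every in-use watchdog so time spent suspended cannot expire it. */
static void wdt_suspend(struct wdt *w)
{
    assert(w != NULL);

    for ( unsigned int i = 0; i < NR_WDT; i++ )
        if ( w[i].in_use )
            w[i].armed = false;
}

/* Re-arm with a full period counted from 'now_s', not the remaining time. */
static void wdt_resume(struct wdt *w, uint64_t now_s)
{
    assert(w != NULL);

    for ( unsigned int i = 0; i < NR_WDT; i++ )
    {
        if ( !w[i].in_use )
            continue;
        w[i].armed = true;
        w[i].expires_s = now_s + w[i].timeout_s;
    }
}

/* A watchdog fires only if it is in use, armed and past its deadline. */
static bool wdt_expired(const struct wdt *w, uint64_t now_s)
{
    return w->in_use && w->armed && now_s >= w->expires_s;
}
```

In this model a watchdog armed with a 30 s period and suspended for much
longer than that does not fire on resume; it fires 30 s after the resume time.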
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers
2025-03-05 9:11 ` [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers Mykola Kvach
@ 2025-03-11 21:04 ` Julien Grall
2025-05-26 13:03 ` Mykola Kvach
2025-03-13 15:34 ` Jan Beulich
2025-03-20 11:25 ` Grygorii Strashko
2 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2025-03-11 21:04 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Mykola Kvach, Dario Faggioli, Juergen Gross, George Dunlap,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Roger Pau Monné, Stefano Stabellini, Mirela Simonovic,
Saeed Nowshadi, Mykyta Poturai
Hi Mykola,
On 05/03/2025 09:11, Mykola Kvach wrote:
> From: Mykola Kvach <mykola_kvach@epam.com>
>
> This patch implements suspend/resume helpers for the watchdog.
> While a domain is suspended, its watchdogs must be paused. Otherwise,
> if the domain stays suspended for longer than the watchdog period,
> the domain would be shut down on resume.
> The proper solution to this problem is to stop (suspend) the watchdog timers
> after the domain suspends and to restart (resume) the watchdog timers
> before the domain resumes. The suspend/resume of watchdog timers is done
> in Xen and is invisible to the guests.
>
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in v3:
> - cover the code with CONFIG_SYSTEM_SUSPEND
>
> Changes in v2:
> - drop suspended field from timer structure
> - drop the call of watchdog_domain_resume from ctxt_switch_to
> ---
> xen/common/sched/core.c | 39 +++++++++++++++++++++++++++++++++++++++
> xen/include/xen/sched.h | 9 +++++++++
> 2 files changed, 48 insertions(+)
>
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index b1c6b6b9fa..6c2231826a 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -1605,6 +1605,45 @@ void watchdog_domain_destroy(struct domain *d)
> kill_timer(&d->watchdog_timer[i].timer);
> }
>
> +#ifdef CONFIG_SYSTEM_SUSPEND
The config option is Arm-specific, yet this is common code. As x86
already supports suspend/resume, shouldn't the config option be common?
But more importantly, why do we need to save/restore the watchdogs for
Arm but not x86? Is this a latent issue or design choice?
> +
> +void watchdog_domain_suspend(struct domain *d)
> +{
> + unsigned int i;
> +
> + spin_lock(&d->watchdog_lock);
> +
> + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> + {
> + if ( test_bit(i, &d->watchdog_inuse_map) )
> + {
> + stop_timer(&d->watchdog_timer[i].timer);
> + }
> + }
> +
> + spin_unlock(&d->watchdog_lock);
> +}
> +
> +void watchdog_domain_resume(struct domain *d)
> +{
> + unsigned int i;
> +
> + spin_lock(&d->watchdog_lock);
> +
> + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> + {
> + if ( test_bit(i, &d->watchdog_inuse_map) )
> + {
> + set_timer(&d->watchdog_timer[i].timer,
> + NOW() + SECONDS(d->watchdog_timer[i].timeout));
> + }
> + }
> +
> + spin_unlock(&d->watchdog_lock);
> +}
> +
> +#endif /* CONFIG_SYSTEM_SUSPEND */
> +
> /*
> * Pin a vcpu temporarily to a specific CPU (or restore old pinning state if
> * cpu is NR_CPUS).
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index d0d10612ce..caab4aad93 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -1109,6 +1109,15 @@ void scheduler_disable(void);
> void watchdog_domain_init(struct domain *d);
> void watchdog_domain_destroy(struct domain *d);
>
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +/*
> + * Suspend/resume watchdogs of domain (while the domain is suspended its
> + * watchdogs should be on pause)
> + */
> +void watchdog_domain_suspend(struct domain *d);
> +void watchdog_domain_resume(struct domain *d);
> +#endif /* CONFIG_SYSTEM_SUSPEND */
> +
> /*
> * Use this check when the following are both true:
> * - Using this feature or interface requires full access to the hardware
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers
2025-03-11 21:04 ` Julien Grall
@ 2025-05-26 13:03 ` Mykola Kvach
0 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-05-26 13:03 UTC (permalink / raw)
To: Julien Grall
Cc: xen-devel, Mykola Kvach, Dario Faggioli, Juergen Gross,
George Dunlap, Andrew Cooper, Anthony PERARD, Michal Orzel,
Jan Beulich, Roger Pau Monné, Stefano Stabellini,
Mirela Simonovic, Saeed Nowshadi, Mykyta Poturai
Hi, @Julien Grall
On Tue, Mar 11, 2025 at 11:04 PM Julien Grall <julien@xen.org> wrote:
>
> Hi Mykola,
>
> On 05/03/2025 09:11, Mykola Kvach wrote:
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > This patch implements suspend/resume helpers for the watchdog.
> > While a domain is suspended, its watchdogs must be paused. Otherwise,
> > if the domain stays suspended for longer than the watchdog period,
> > the domain would be shut down on resume.
> > The proper solution to this problem is to stop (suspend) the watchdog timers
> > after the domain suspends and to restart (resume) the watchdog timers
> > before the domain resumes. The suspend/resume of watchdog timers is done
> > in Xen and is invisible to the guests.
> >
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> > Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> > Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Changes in v3:
> > - cover the code with CONFIG_SYSTEM_SUSPEND
> >
> > Changes in v2:
> > - drop suspended field from timer structure
> > - drop the call of watchdog_domain_resume from ctxt_switch_to
> > ---
> > xen/common/sched/core.c | 39 +++++++++++++++++++++++++++++++++++++++
> > xen/include/xen/sched.h | 9 +++++++++
> > 2 files changed, 48 insertions(+)
> >
> > diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> > index b1c6b6b9fa..6c2231826a 100644
> > --- a/xen/common/sched/core.c
> > +++ b/xen/common/sched/core.c
> > @@ -1605,6 +1605,45 @@ void watchdog_domain_destroy(struct domain *d)
> > kill_timer(&d->watchdog_timer[i].timer);
> > }
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
>
> The config option is Arm specific, yet this is common code. As x86,
> already suspend/resume, then shouldn't the config option be common?
>
> But more importantly, why do we need to save/restore the watchdogs for
> Arm but not x86? Is this a latent issue or design choice?
I’ve looked into this a bit. Here's what I've found:
A watchdog timer is created and initialized (but not started) for each
domain during the domain_create call. The timer can then be started either
by the guest kernel's Xen watchdog driver or by the xenwatchdogd service.
(I refer to Linux-based operating systems here because I have access to
that code and can confirm that a Xen watchdog driver is implemented there.
I don’t know whether such a driver exists in other operating systems, or
how it is implemented.)
Case 1: Watchdog started by the Linux kernel driver
When the Xen watchdog is started by the Linux kernel driver, everything
works as expected. The driver correctly handles system suspend events:
https://elixir.bootlin.com/linux/v6.14.8/source/drivers/watchdog/xen_wdt.c#L169
(I hope that operating systems not based on the Linux kernel also handle
the Xen watchdog properly in their drivers during suspend and resume.)
Case 2: Watchdog started by xenwatchdogd service
However, when the watchdog is started by the xenwatchdogd service, neither
the underlying OS nor the daemon takes care of stopping the watchdog timer
during suspend:
https://elixir.bootlin.com/xen/v4.20.0/source/tools/hotplug/Linux/init.d/xen-watchdog.in
https://elixir.bootlin.com/xen/v4.20.0/source/tools/hotplug/NetBSD/rc.d/xen-watchdog
Behavior on x86 during suspend:
- Linux guest is configured with xenwatchdogd, and the Xen watchdog is
started at boot
- the OS initiates suspend (at our request)
- at that moment, there's an active watchdog timer in Xen for the domain,
set to, say, 15 seconds
- after suspend preparations, domain_shutdown() is called with the
SHUTDOWN_suspend argument inside the Xen hypervisor
- inside this function, the is_shutting_down flag is set in the domain
structure
- when the watchdog timer expires, the Xen handler skips the reset action
because the domain is marked as shutting down:
https://elixir.bootlin.com/xen/v4.20.0/source/xen/common/sched/core.c#L1539
So far, everything behaves correctly.
BUT for the second case there is another flow. The domain starts
resuming from suspend. As part of the resume process, the is_shutting_down
flag inside the domain is cleared, which re-enables normal watchdog
behavior. However, the watchdog timer, set before suspend, has nearly
expired. Because the OS and its user-space services (such as the watchdog
pinging daemon) have not yet fully resumed and restarted, the watchdog
timeout occurs before the ping can be sent. As a result, the watchdog
triggers a reset or shutdown (as far as I know there can be other actions
on watchdog expiry, but we aren't interested in those options right now)
before the service has a chance to take control again, effectively making
a clean resume impossible.
It's also unclear how common this situation is on x86 systems —
specifically, whether xenwatchdogd is typically used in domU or dom0, or
whether the kernel driver is more commonly relied upon instead.
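The flow above can be modelled with a small self-contained sketch. This is a
toy model, not Xen code: the struct, the model_* functions and the integer
clock are all made up for illustration. Only the is_shutting_down check is
meant to mirror what the handler linked above does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, NOT Xen code: one watchdog slot of one domain. */
struct model_domain {
    bool is_shutting_down;   /* set while the domain is suspended */
    bool watchdog_armed;     /* one-shot timer currently pending */
    uint64_t deadline;       /* absolute expiry time, in seconds */
    bool was_reset;          /* set once the watchdog action has fired */
};

/* The guest (kernel driver or daemon) pings the watchdog. */
static void model_ping(struct model_domain *d, uint64_t now, uint64_t timeout)
{
    assert(timeout > 0);
    d->watchdog_armed = true;
    d->deadline = now + timeout;
}

/* Called as the fake clock advances; runs the expiry handler if due. */
static void model_tick(struct model_domain *d, uint64_t now)
{
    if ( !d->watchdog_armed || now < d->deadline )
        return;

    d->watchdog_armed = false;      /* the one-shot timer has fired */

    if ( d->is_shutting_down )      /* expiry during suspend is ignored */
        return;

    d->was_reset = true;            /* expiry after resume resets the guest */
}
```

With a ping at t=0 and a 15 second timeout, an expiry that happens while
is_shutting_down is set is dropped. If the flag is cleared at t=14 and the
daemon has not pinged again by t=15, the model reports a reset.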
---
In my opinion, since the guest OS is the one starting the Xen watchdog, it
should also manage its state during suspend/resume transitions. The general
expectation across architectures is that the OS handles device state
management when transitioning between power states. This includes stopping
or parking watchdog timers during suspend.
I think proper handling should be added to the relevant services to avoid
unexpected triggers.
What’s your take on this?
>
> > +
> > +void watchdog_domain_suspend(struct domain *d)
> > +{
> > + unsigned int i;
> > +
> > + spin_lock(&d->watchdog_lock);
> > +
> > + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> > + {
> > + if ( test_bit(i, &d->watchdog_inuse_map) )
> > + {
> > + stop_timer(&d->watchdog_timer[i].timer);
> > + }
> > + }
> > +
> > + spin_unlock(&d->watchdog_lock);
> > +}
> > +
> > +void watchdog_domain_resume(struct domain *d)
> > +{
> > + unsigned int i;
> > +
> > + spin_lock(&d->watchdog_lock);
> > +
> > + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> > + {
> > + if ( test_bit(i, &d->watchdog_inuse_map) )
> > + {
> > + set_timer(&d->watchdog_timer[i].timer,
> > + NOW() + SECONDS(d->watchdog_timer[i].timeout));
> > + }
> > + }
> > +
> > + spin_unlock(&d->watchdog_lock);
> > +}
> > +
> > +#endif /* CONFIG_SYSTEM_SUSPEND */
> > +
> > /*
> > * Pin a vcpu temporarily to a specific CPU (or restore old pinning state if
> > * cpu is NR_CPUS).
> > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> > index d0d10612ce..caab4aad93 100644
> > --- a/xen/include/xen/sched.h
> > +++ b/xen/include/xen/sched.h
> > @@ -1109,6 +1109,15 @@ void scheduler_disable(void);
> > void watchdog_domain_init(struct domain *d);
> > void watchdog_domain_destroy(struct domain *d);
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +/*
> > + * Suspend/resume watchdogs of domain (while the domain is suspended its
> > + * watchdogs should be on pause)
> > + */
> > +void watchdog_domain_suspend(struct domain *d);
> > +void watchdog_domain_resume(struct domain *d);
> > +#endif /* CONFIG_SYSTEM_SUSPEND */
> > +
> > /*
> > * Use this check when the following are both true:
> > * - Using this feature or interface requires full access to the hardware
>
> Cheers,
>
> --
> Julien Grall
>
Kind regards,
Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers
2025-03-05 9:11 ` [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers Mykola Kvach
2025-03-11 21:04 ` Julien Grall
@ 2025-03-13 15:34 ` Jan Beulich
2025-03-21 9:50 ` Mykola Kvach
2025-03-20 11:25 ` Grygorii Strashko
2 siblings, 1 reply; 69+ messages in thread
From: Jan Beulich @ 2025-03-13 15:34 UTC (permalink / raw)
To: Mykola Kvach
Cc: Mykola Kvach, Dario Faggioli, Juergen Gross, George Dunlap,
Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Mirela Simonovic,
Saeed Nowshadi, Mykyta Poturai, xen-devel
On 05.03.2025 10:11, Mykola Kvach wrote:
> From: Mykola Kvach <mykola_kvach@epam.com>
>
> This patch implements suspend/resume helpers for the watchdog.
> While a domain is suspended its watchdogs must be paused. Otherwise,
> if the domain stays in the suspend state for a longer period of time
> compared to the watchdog period, the domain would be shutdown on resume.
> Proper solution to this problem is to stop (suspend) the watchdog timers
> after the domain suspends and to restart (resume) the watchdog timers
> before the domain resumes. The suspend/resume of watchdog timers is done
> in Xen and is invisible to the guests.
>
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
From: != first S-o-b always raises the question of who the original
author of a patch is.
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -1605,6 +1605,45 @@ void watchdog_domain_destroy(struct domain *d)
> kill_timer(&d->watchdog_timer[i].timer);
> }
>
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +
> +void watchdog_domain_suspend(struct domain *d)
> +{
> + unsigned int i;
> +
> + spin_lock(&d->watchdog_lock);
> +
> + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> + {
> + if ( test_bit(i, &d->watchdog_inuse_map) )
> + {
> + stop_timer(&d->watchdog_timer[i].timer);
> + }
We generally prefer to omit the braces if the body of an if() (or
whatever it is) is a single statement / line.
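As a sketch of that style only: the toy loop below models the suspend path
with hypothetical model_* names instead of the real Xen helpers, and drops
the braces around the single-statement if() body.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_TIMERS 2   /* hypothetical stand-in, not the Xen constant */

/* Toy stand-in for the watchdog-related parts of struct domain. */
struct model_domain {
    unsigned long inuse_map;          /* bit i set: slot i is in use */
    bool stopped[MODEL_NR_TIMERS];    /* records which timers were stopped */
};

/* Hypothetical single-word equivalent of test_bit(). */
static bool model_test_bit(unsigned int nr, const unsigned long *map)
{
    assert(nr < 8 * sizeof(*map));
    return (*map >> nr) & 1UL;
}

static void model_watchdog_suspend(struct model_domain *d)
{
    unsigned int i;

    for ( i = 0; i < MODEL_NR_TIMERS; i++ )
    {
        /* Single-statement body: no braces around it. */
        if ( model_test_bit(i, &d->inuse_map) )
            d->stopped[i] = true;
    }
}
```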
> + }
> +
> + spin_unlock(&d->watchdog_lock);
> +}
> +
> +void watchdog_domain_resume(struct domain *d)
> +{
> + unsigned int i;
> +
> + spin_lock(&d->watchdog_lock);
> +
> + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> + {
> + if ( test_bit(i, &d->watchdog_inuse_map) )
> + {
> + set_timer(&d->watchdog_timer[i].timer,
> + NOW() + SECONDS(d->watchdog_timer[i].timeout));
The timeout may have almost expired before suspending; restoring to the
full original period feels wrong. At the very least, if that's indeed
intended behavior, imo this needs spelling out explicitly.
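To illustrate the alternative only: below is a toy model (not Xen code, all
model_* names and fields are hypothetical) in which suspend records the time
left and resume re-arms with that remainder rather than the full period. The
sketch makes no assumption about how Xen's timer API would expose the
remaining time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_WATCHDOGS 2   /* hypothetical, not the Xen constant */

/* Toy watchdog slot; times are plain seconds on a fake clock. */
struct model_watchdog {
    bool in_use;
    bool running;
    bool paused;          /* stopped by model_wd_suspend() */
    uint64_t deadline;    /* absolute expiry, valid while running */
    uint64_t remaining;   /* time left at suspend, valid while paused */
};

/* Stop running timers and remember how much time each had left. */
static void model_wd_suspend(struct model_watchdog *wd, uint64_t now)
{
    assert(wd);

    for ( unsigned int i = 0; i < MODEL_NR_WATCHDOGS; i++ )
    {
        if ( !wd[i].in_use || !wd[i].running )
            continue;

        wd[i].remaining = wd[i].deadline > now ? wd[i].deadline - now : 0;
        wd[i].running = false;
        wd[i].paused = true;
    }
}

/* Re-arm paused timers with the remainder, not the full period. */
static void model_wd_resume(struct model_watchdog *wd, uint64_t now)
{
    assert(wd);

    for ( unsigned int i = 0; i < MODEL_NR_WATCHDOGS; i++ )
    {
        if ( !wd[i].paused )
            continue;

        wd[i].deadline = now + wd[i].remaining;
        wd[i].running = true;
        wd[i].paused = false;
    }
}
```

With a deadline at t=15 and suspend at t=12, the slot resumes with 3 seconds
left, so the guest has to ping within that window after resume.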
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -1109,6 +1109,15 @@ void scheduler_disable(void);
> void watchdog_domain_init(struct domain *d);
> void watchdog_domain_destroy(struct domain *d);
>
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +/*
> + * Suspend/resume watchdogs of domain (while the domain is suspended its
> + * watchdogs should be on pause)
> + */
> +void watchdog_domain_suspend(struct domain *d);
> +void watchdog_domain_resume(struct domain *d);
> +#endif /* CONFIG_SYSTEM_SUSPEND */
I don't think the #ifdef is strictly needed here?
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers
2025-03-13 15:34 ` Jan Beulich
@ 2025-03-21 9:50 ` Mykola Kvach
2025-03-21 14:04 ` Jan Beulich
0 siblings, 1 reply; 69+ messages in thread
From: Mykola Kvach @ 2025-03-21 9:50 UTC (permalink / raw)
To: Jan Beulich
Cc: Mykola Kvach, Dario Faggioli, Juergen Gross, George Dunlap,
Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Mirela Simonovic,
Saeed Nowshadi, Mykyta Poturai, xen-devel
Hi,
On Thu, Mar 13, 2025 at 5:34 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 05.03.2025 10:11, Mykola Kvach wrote:
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > This patch implements suspend/resume helpers for the watchdog.
> > While a domain is suspended its watchdogs must be paused. Otherwise,
> > if the domain stays in the suspend state for a longer period of time
> > compared to the watchdog period, the domain would be shutdown on resume.
> > Proper solution to this problem is to stop (suspend) the watchdog timers
> > after the domain suspends and to restart (resume) the watchdog timers
> > before the domain resumes. The suspend/resume of watchdog timers is done
> > in Xen and is invisible to the guests.
> >
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
>
> From: != first S-o-b always raises the question of who the original
> author of a patch is.
I'll try to change the From field if possible. Thank you for pointing that out.
>
> > --- a/xen/common/sched/core.c
> > +++ b/xen/common/sched/core.c
> > @@ -1605,6 +1605,45 @@ void watchdog_domain_destroy(struct domain *d)
> > kill_timer(&d->watchdog_timer[i].timer);
> > }
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +
> > +void watchdog_domain_suspend(struct domain *d)
> > +{
> > + unsigned int i;
> > +
> > + spin_lock(&d->watchdog_lock);
> > +
> > + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> > + {
> > + if ( test_bit(i, &d->watchdog_inuse_map) )
> > + {
> > + stop_timer(&d->watchdog_timer[i].timer);
> > + }
>
> We generally prefer to omit the braces if the body of an if() (or
> whatever it is) is a single statement / line.
Will change.
>
> > + }
> > +
> > + spin_unlock(&d->watchdog_lock);
> > +}
> > +
> > +void watchdog_domain_resume(struct domain *d)
> > +{
> > + unsigned int i;
> > +
> > + spin_lock(&d->watchdog_lock);
> > +
> > + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> > + {
> > + if ( test_bit(i, &d->watchdog_inuse_map) )
> > + {
> > + set_timer(&d->watchdog_timer[i].timer,
> > + NOW() + SECONDS(d->watchdog_timer[i].timeout));
>
> The timeout may have almost expired before suspending; restoring to the
> full original period feels wrong. At the very least, if that's indeed
> intended behavior, imo this needs spelling out explicitly.
It takes some time to wake up the domain, but the watchdog timeout is
reset by a userspace daemon. As a result, we can easily encounter a
watchdog trigger during the resume process. It looks like we should
stop the watchdog timer from the guest, and in that case, we can drop
all changes related to the watchdog in this patch series.
In any case, this patch restores the full configured timeout rather than
the real remaining time, so the behavior won't change. However, I'm not
sure exactly how much effort this would require. The watchdog can be
enabled/disabled through both the Linux kernel driver and the Xen
watchdog daemon, and the Linux kernel driver already handles
suspend/resume of the Xen watchdog timer.
>
> > --- a/xen/include/xen/sched.h
> > +++ b/xen/include/xen/sched.h
> > @@ -1109,6 +1109,15 @@ void scheduler_disable(void);
> > void watchdog_domain_init(struct domain *d);
> > void watchdog_domain_destroy(struct domain *d);
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +/*
> > + * Suspend/resume watchdogs of domain (while the domain is suspended its
> > + * watchdogs should be on pause)
> > + */
> > +void watchdog_domain_suspend(struct domain *d);
> > +void watchdog_domain_resume(struct domain *d);
> > +#endif /* CONFIG_SYSTEM_SUSPEND */
>
> I don't think the #ifdef is strictly needed here?
OK, I'll drop it.
>
> Jan
Best regards,
Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers
2025-03-21 9:50 ` Mykola Kvach
@ 2025-03-21 14:04 ` Jan Beulich
2025-03-24 11:00 ` Mykola Kvach
0 siblings, 1 reply; 69+ messages in thread
From: Jan Beulich @ 2025-03-21 14:04 UTC (permalink / raw)
To: Mykola Kvach
Cc: Mykola Kvach, Dario Faggioli, Juergen Gross, George Dunlap,
Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Mirela Simonovic,
Saeed Nowshadi, Mykyta Poturai, xen-devel
On 21.03.2025 10:50, Mykola Kvach wrote:
> On Thu, Mar 13, 2025 at 5:34 PM Jan Beulich <jbeulich@suse.com> wrote:
>> On 05.03.2025 10:11, Mykola Kvach wrote:
>>> +void watchdog_domain_resume(struct domain *d)
>>> +{
>>> + unsigned int i;
>>> +
>>> + spin_lock(&d->watchdog_lock);
>>> +
>>> + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
>>> + {
>>> + if ( test_bit(i, &d->watchdog_inuse_map) )
>>> + {
>>> + set_timer(&d->watchdog_timer[i].timer,
>>> + NOW() + SECONDS(d->watchdog_timer[i].timeout));
>>
>> The timeout may have almost expired before suspending; restoring to the
>> full original period feels wrong. At the very least, if that's indeed
>> intended behavior, imo this needs spelling out explicitly.
>
> It takes some time to wake up the domain, but the watchdog timeout is
> reset by a userspace daemon. As a result, we can easily encounter a
> watchdog trigger during the resume process.
Which may mean the restoring is done too early, or needs doing in two
phases.
> It looks like we should
> stop the watchdog timer from the guest, and in that case, we can drop
> all changes related to the watchdog in this patch series.
Except that then you require a guest to be aware of host suspend. Which
may not be desirable.
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers
2025-03-21 14:04 ` Jan Beulich
@ 2025-03-24 11:00 ` Mykola Kvach
2025-03-24 11:13 ` Jan Beulich
0 siblings, 1 reply; 69+ messages in thread
From: Mykola Kvach @ 2025-03-24 11:00 UTC (permalink / raw)
To: Jan Beulich
Cc: Mykola Kvach, Dario Faggioli, Juergen Gross, George Dunlap,
Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Mirela Simonovic,
Saeed Nowshadi, Mykyta Poturai, xen-devel
On Fri, Mar 21, 2025 at 4:04 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 21.03.2025 10:50, Mykola Kvach wrote:
> > On Thu, Mar 13, 2025 at 5:34 PM Jan Beulich <jbeulich@suse.com> wrote:
> >> On 05.03.2025 10:11, Mykola Kvach wrote:
> >>> +void watchdog_domain_resume(struct domain *d)
> >>> +{
> >>> + unsigned int i;
> >>> +
> >>> + spin_lock(&d->watchdog_lock);
> >>> +
> >>> + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> >>> + {
> >>> + if ( test_bit(i, &d->watchdog_inuse_map) )
> >>> + {
> >>> + set_timer(&d->watchdog_timer[i].timer,
> >>> + NOW() + SECONDS(d->watchdog_timer[i].timeout));
> >>
> >> The timeout may have almost expired before suspending; restoring to the
> >> full original period feels wrong. At the very least, if that's indeed
> >> intended behavior, imo this needs spelling out explicitly.
> >
> > It takes some time to wake up the domain, but the watchdog timeout is
> > reset by a userspace daemon. As a result, we can easily encounter a
> > watchdog trigger during the resume process.
>
> Which may mean the restoring is done too early, or needs doing in two
> phases.
>
> > It looks like we should
> > stop the watchdog timer from the guest, and in that case, we can drop
> > all changes related to the watchdog in this patch series.
>
> Except that then you require a guest to be aware of host suspend. Which
> may not be desirable.
I think this is not a problem; at least, I don't see how the guest
could be aware of the host suspend.
For now, we have three cases:
1) The guest is suspended (actually paused) because the system
suspends, and we pause all non-hardware domains.
2) The guest is suspended via the `xl` tool (x86 only, at least for now).
3) The guest requests S2R via `echo mem > /sys/power/state` or
`systemctl suspend`.
Let's review all these cases:
1) There is no action required here; it should be handled correctly by
domain pause. However, I think it is not handled properly right
now—but that is not an issue with the current patch series.
2) There are potential problems here. We should either notify the
domain that it will be suspended (which is hard to implement and the
guest will be aware of the host suspending) or suspend watchdog
directly during the execution of `xl` commands (more preferable).
3) As far as I know, if `watchdogd` is running, we can simply add an
action to it on suspend/resume events (need to review not Linux kernel
cases). In the case of the Linux kernel driver, it already handles
suspending/resuming the Xen watchdog correctly.
So, if I am not mistaken, we can drop all patches related to watchdog
suspend in Xen until `xl suspend/resume` for ARM64 is implemented. For
other cases, we should handle suspend/resume of the watchdog via the
`watchdogd` service.
Note: As far as I know, only the control domain has `watchdogd`
(though we could potentially set it up for other domains). DomUs can
only use the Xen watchdog Linux kernel driver.
>
> Jan
~Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers
2025-03-24 11:00 ` Mykola Kvach
@ 2025-03-24 11:13 ` Jan Beulich
0 siblings, 0 replies; 69+ messages in thread
From: Jan Beulich @ 2025-03-24 11:13 UTC (permalink / raw)
To: Mykola Kvach
Cc: Mykola Kvach, Dario Faggioli, Juergen Gross, George Dunlap,
Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Mirela Simonovic,
Saeed Nowshadi, Mykyta Poturai, xen-devel
On 24.03.2025 12:00, Mykola Kvach wrote:
> On Fri, Mar 21, 2025 at 4:04 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 21.03.2025 10:50, Mykola Kvach wrote:
>>> On Thu, Mar 13, 2025 at 5:34 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 05.03.2025 10:11, Mykola Kvach wrote:
>>>>> +void watchdog_domain_resume(struct domain *d)
>>>>> +{
>>>>> + unsigned int i;
>>>>> +
>>>>> + spin_lock(&d->watchdog_lock);
>>>>> +
>>>>> + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
>>>>> + {
>>>>> + if ( test_bit(i, &d->watchdog_inuse_map) )
>>>>> + {
>>>>> + set_timer(&d->watchdog_timer[i].timer,
>>>>> + NOW() + SECONDS(d->watchdog_timer[i].timeout));
>>>>
>>>> The timeout may have almost expired before suspending; restoring to the
>>>> full original period feels wrong. At the very least, if that's indeed
>>>> intended behavior, imo this needs spelling out explicitly.
>>>
>>> It takes some time to wake up the domain, but the watchdog timeout is
>>> reset by a userspace daemon. As a result, we can easily encounter a
>>> watchdog trigger during the resume process.
>>
>> Which may mean the restoring is done too early, or needs doing in two
>> phases.
>>
>>> It looks like we should
>>> stop the watchdog timer from the guest, and in that case, we can drop
>>> all changes related to the watchdog in this patch series.
Noting this, ...
>> Except that then you require a guest to be aware of host suspend. Which
>> may not be desirable.
>
> I think this is not a problem; at least, I don't see how the guest
> could be aware of the host suspend.
... perhaps it is me who is confused, but: With an unaware guest, how can
the stopping be done from the guest? I.e. ...
> For now, we have three cases:
>
> 1) The guest is suspended (actually paused) because the system
> suspends, and we pause all non-hardware domains.
... in this case in particular, which this series is about aiui.
Jan
> 2) The guest is suspended via the `xl` tool (x86 only, at least for now).
> 3) The guest requests S2R via `echo mem > /sys/power/state` or
> `systemctl suspend`.
>
> Let's review all these cases:
>
> 1) There is no action required here; it should be handled correctly by
> domain pause. However, I think it is not handled properly right
> now—but that is not an issue with the current patch series.
> 2) There are potential problems here. We should either notify the
> domain that it will be suspended (which is hard to implement and the
> guest will be aware of the host suspending) or suspend watchdog
> directly during the execution of `xl` commands (more preferable).
> 3) As far as I know, if `watchdogd` is running, we can simply add an
> action to it on suspend/resume events (need to review not Linux kernel
> cases). In the case of the Linux kernel driver, it already handles
> suspending/resuming the Xen watchdog correctly.
>
> So, if I am not mistaken, we can drop all patches related to watchdog
> suspend in Xen until `xl suspend/resume` for ARM64 is implemented. For
> other cases, we should handle suspend/resume of the watchdog via the
> `watchdogd` service.
>
> Note: As far as I know, only the control domain has `watchdogd`
> (though we could potentially set it up for other domains). DomUs can
> only use the Xen watchdog Linux kernel driver.
>
> ~Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers
2025-03-05 9:11 ` [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers Mykola Kvach
2025-03-11 21:04 ` Julien Grall
2025-03-13 15:34 ` Jan Beulich
@ 2025-03-20 11:25 ` Grygorii Strashko
2025-03-21 9:50 ` Mykola Kvach
2 siblings, 1 reply; 69+ messages in thread
From: Grygorii Strashko @ 2025-03-20 11:25 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Mykola Kvach, Dario Faggioli, Juergen Gross, George Dunlap,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Mirela Simonovic, Saeed Nowshadi, Mykyta Poturai
On 05.03.25 11:11, Mykola Kvach wrote:
> From: Mykola Kvach <mykola_kvach@epam.com>
>
> This patch implements suspend/resume helpers for the watchdog.
> While a domain is suspended its watchdogs must be paused. Otherwise,
> if the domain stays in the suspend state for a longer period of time
> compared to the watchdog period, the domain would be shutdown on resume.
> Proper solution to this problem is to stop (suspend) the watchdog timers
> after the domain suspends and to restart (resume) the watchdog timers
> before the domain resumes. The suspend/resume of watchdog timers is done
> in Xen and is invisible to the guests.
>
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in v3:
> - cover the code with CONFIG_SYSTEM_SUSPEND
>
> Changes in v2:
> - drop suspended field from timer structure
> - drop the call of watchdog_domain_resume from ctxt_switch_to
> ---
> xen/common/sched/core.c | 39 +++++++++++++++++++++++++++++++++++++++
> xen/include/xen/sched.h | 9 +++++++++
> 2 files changed, 48 insertions(+)
>
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index b1c6b6b9fa..6c2231826a 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -1605,6 +1605,45 @@ void watchdog_domain_destroy(struct domain *d)
> kill_timer(&d->watchdog_timer[i].timer);
> }
>
> +#ifdef CONFIG_SYSTEM_SUSPEND
> +
> +void watchdog_domain_suspend(struct domain *d)
> +{
> + unsigned int i;
> +
> + spin_lock(&d->watchdog_lock);
> +
> + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> + {
> + if ( test_bit(i, &d->watchdog_inuse_map) )
> + {
> + stop_timer(&d->watchdog_timer[i].timer);
> + }
> + }
> +
> + spin_unlock(&d->watchdog_lock);
> +}
> +
> +void watchdog_domain_resume(struct domain *d)
> +{
> + unsigned int i;
> +
> + spin_lock(&d->watchdog_lock);
> +
> + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> + {
> + if ( test_bit(i, &d->watchdog_inuse_map) )
> + {
> + set_timer(&d->watchdog_timer[i].timer,
> + NOW() + SECONDS(d->watchdog_timer[i].timeout));
> + }
> + }
> +
> + spin_unlock(&d->watchdog_lock);
> +}
> +
> +#endif /* CONFIG_SYSTEM_SUSPEND */
My understanding is that domain watchdog support is not a mandatory requirement
for enabling the basic system suspend-to-RAM feature, as the watchdogs are not
enabled automatically. So the domain watchdog patches can be split out and posted
after the basic functionality is in place.
[...]
--
Best regards,
-grygorii
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers
2025-03-20 11:25 ` Grygorii Strashko
@ 2025-03-21 9:50 ` Mykola Kvach
0 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-21 9:50 UTC (permalink / raw)
To: Grygorii Strashko
Cc: xen-devel, Mykola Kvach, Dario Faggioli, Juergen Gross,
George Dunlap, Andrew Cooper, Anthony PERARD, Michal Orzel,
Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Mirela Simonovic, Saeed Nowshadi,
Mykyta Poturai
Hi,
On Thu, Mar 20, 2025 at 1:25 PM Grygorii Strashko
<grygorii_strashko@epam.com> wrote:
>
>
>
> On 05.03.25 11:11, Mykola Kvach wrote:
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > This patch implements suspend/resume helpers for the watchdog.
> > While a domain is suspended its watchdogs must be paused. Otherwise,
> > if the domain stays in the suspend state for a longer period of time
> > compared to the watchdog period, the domain would be shutdown on resume.
> > Proper solution to this problem is to stop (suspend) the watchdog timers
> > after the domain suspends and to restart (resume) the watchdog timers
> > before the domain resumes. The suspend/resume of watchdog timers is done
> > in Xen and is invisible to the guests.
> >
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> > Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> > Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Changes in v3:
> > - cover the code with CONFIG_SYSTEM_SUSPEND
> >
> > Changes in v2:
> > - drop suspended field from timer structure
> > - drop the call of watchdog_domain_resume from ctxt_switch_to
> > ---
> > xen/common/sched/core.c | 39 +++++++++++++++++++++++++++++++++++++++
> > xen/include/xen/sched.h | 9 +++++++++
> > 2 files changed, 48 insertions(+)
> >
> > diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> > index b1c6b6b9fa..6c2231826a 100644
> > --- a/xen/common/sched/core.c
> > +++ b/xen/common/sched/core.c
> > @@ -1605,6 +1605,45 @@ void watchdog_domain_destroy(struct domain *d)
> > kill_timer(&d->watchdog_timer[i].timer);
> > }
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > +
> > +void watchdog_domain_suspend(struct domain *d)
> > +{
> > + unsigned int i;
> > +
> > + spin_lock(&d->watchdog_lock);
> > +
> > + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> > + {
> > + if ( test_bit(i, &d->watchdog_inuse_map) )
> > + {
> > + stop_timer(&d->watchdog_timer[i].timer);
> > + }
> > + }
> > +
> > + spin_unlock(&d->watchdog_lock);
> > +}
> > +
> > +void watchdog_domain_resume(struct domain *d)
> > +{
> > + unsigned int i;
> > +
> > + spin_lock(&d->watchdog_lock);
> > +
> > + for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> > + {
> > + if ( test_bit(i, &d->watchdog_inuse_map) )
> > + {
> > + set_timer(&d->watchdog_timer[i].timer,
> > + NOW() + SECONDS(d->watchdog_timer[i].timeout));
> > + }
> > + }
> > +
> > + spin_unlock(&d->watchdog_lock);
> > +}
> > +
> > +#endif /* CONFIG_SYSTEM_SUSPEND */
>
> My understanding is that domain watchdog support is not a mandatory requirement
> for enabling the basic system suspend-to-RAM feature, as the watchdogs are not
> enabled automatically. So the domain watchdog patches can be split out and posted
> after the basic functionality is in place.
AFAIK, the hardware domain always has the watchdog enabled, at least for now.
>
> [...]
>
> --
> Best regards,
> -grygorii
Best regards,
Mykola
^ permalink raw reply [flat|nested] 69+ messages in thread
* [PATCH 09/16] xen/arm: add suspend and resume timer helpers
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (7 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 08/16] xen/arm: add watchdog domain suspend/resume helpers Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-05 9:11 ` [PATCH 10/16] xen/arm: Implement GIC suspend/resume functions (gicv2 only) Mykola Kvach
` (8 subsequent siblings)
17 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Mirela Simonovic, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi,
Mykola Kvach
From: Mirela Simonovic <mirela.simonovic@aggios.com>
Timer interrupts have to be disabled while the system is in suspend.
Otherwise, a timer interrupt would fire and wake up the system.
Suspending the timer interrupts consists of disabling physical EL1
and EL2 timers. The resume consists only of raising timer softirq,
which will trigger the generic timer code to reprogram the EL2 timer
as needed. Enabling of EL1 physical timer will be triggered by an
entity which uses it.
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
xen/arch/arm/include/asm/time.h | 5 +++++
xen/arch/arm/time.c | 26 ++++++++++++++++++++++++++
2 files changed, 31 insertions(+)
diff --git a/xen/arch/arm/include/asm/time.h b/xen/arch/arm/include/asm/time.h
index 49ad8c1a6d..f4fd0c6af5 100644
--- a/xen/arch/arm/include/asm/time.h
+++ b/xen/arch/arm/include/asm/time.h
@@ -108,6 +108,11 @@ void preinit_xen_time(void);
void force_update_vcpu_system_time(struct vcpu *v);
+#ifdef CONFIG_SYSTEM_SUSPEND
+void time_suspend(void);
+void time_resume(void);
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
#endif /* __ARM_TIME_H__ */
/*
* Local variables:
diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
index e74d30d258..89c5773244 100644
--- a/xen/arch/arm/time.c
+++ b/xen/arch/arm/time.c
@@ -372,6 +372,32 @@ void domain_set_time_offset(struct domain *d, int64_t time_offset_seconds)
/* XXX update guest visible wallclock time */
}
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+void time_suspend(void)
+{
+ /* Disable physical EL1 timer */
+ WRITE_SYSREG(0, CNTP_CTL_EL0);
+
+ /* Disable hypervisor's timer */
+ WRITE_SYSREG(0, CNTHP_CTL_EL2);
+ isb();
+}
+
+void time_resume(void)
+{
+ /*
+ * Raising timer softirq will trigger generic timer code to reprogram_timer
+ * with the correct timeout value (which is not known here). There is no
+ * need to do anything else in order to recover the time keeping from power
+ * down, because the system counter is not affected by the power down (it
+ * resides out of the ARM's cluster in an always-on part of the SoC).
+ */
+ raise_softirq(TIMER_SOFTIRQ);
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
static int cpu_time_callback(struct notifier_block *nfb,
unsigned long action,
void *hcpu)
--
2.43.0
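
The suspend/resume behaviour described in the commit message above can be modelled roughly as follows. This is a minimal sketch, not code from the patch: struct mock_cpu, the mock_* functions and MOCK_TIMER_SOFTIRQ are hypothetical stand-ins for the real system registers and softirq state, and bit 0 is assumed to be the ENABLE bit of the timer control registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real system registers and softirq state. */
#define TIMER_CTL_ENABLE   (1u << 0)  /* ENABLE bit of a timer control register */
#define MOCK_TIMER_SOFTIRQ 0          /* assumed softirq number, illustration only */

struct mock_cpu {
    uint32_t cntp_ctl_el0;    /* EL1 physical timer control */
    uint32_t cnthp_ctl_el2;   /* EL2 (hypervisor) timer control */
    uint32_t softirq_pending; /* one bit per softirq number */
};

/* Models time_suspend(): writing 0 clears ENABLE for both timers. */
void mock_time_suspend(struct mock_cpu *cpu)
{
    assert(cpu);
    cpu->cntp_ctl_el0 = 0;
    cpu->cnthp_ctl_el2 = 0;
}

/*
 * Models time_resume(): only mark the timer softirq as pending. The timers
 * stay disabled here; reprogramming is left to the softirq handler.
 */
void mock_time_resume(struct mock_cpu *cpu)
{
    assert(cpu);
    cpu->softirq_pending |= 1u << MOCK_TIMER_SOFTIRQ;
}

/* True if either modelled timer is still enabled and could fire. */
int mock_timer_can_fire(const struct mock_cpu *cpu)
{
    assert(cpu);
    return ((cpu->cntp_ctl_el0 | cpu->cnthp_ctl_el2) & TIMER_CTL_ENABLE) != 0;
}
```

In this model neither timer can fire between suspend and resume, and resume itself does not re-enable anything; it only requests the deferred reprogramming.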
* [PATCH 10/16] xen/arm: Implement GIC suspend/resume functions (gicv2 only)
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (8 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 09/16] xen/arm: add suspend and resume timer helpers Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-05 9:11 ` [PATCH 11/16] xen/arm: Implement PSCI system suspend Mykola Kvach
` (7 subsequent siblings)
17 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Mirela Simonovic, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi,
Mykyta Poturai, Mykola Kvach
From: Mirela Simonovic <mirela.simonovic@aggios.com>
System suspend may lead to a state where GIC would be powered down.
Therefore, Xen should save/restore the context of GIC on suspend/resume.
Note that the context consists of states of registers which are
controlled by the hypervisor. Other GIC registers which are accessible
by guests are saved/restored on context switch.
Tested on Xilinx Ultrascale+ MPSoC with (and without) powering down
the GIC.
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
changes in v3:
- drop asserts, return error code instead
- cover the code with CONFIG_SYSTEM_SUSPEND
changes in v2:
- minor fixes after review
---
xen/arch/arm/gic-v2.c | 142 +++++++++++++++++++++++++++++++++
xen/arch/arm/gic.c | 29 +++++++
xen/arch/arm/include/asm/gic.h | 12 +++
3 files changed, 183 insertions(+)
diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index 02043c0d4b..868e1a5026 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -1098,6 +1098,139 @@ static int gicv2_iomem_deny_access(struct domain *d)
return iomem_deny_access(d, mfn, mfn + nr);
}
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+/* GICv2 registers to be saved/restored on system suspend/resume */
+struct gicv2_context {
+ /* GICC context */
+ uint32_t gicc_ctlr;
+ uint32_t gicc_pmr;
+ uint32_t gicc_bpr;
+ /* GICD context */
+ uint32_t gicd_ctlr;
+ uint32_t *gicd_isenabler;
+ uint32_t *gicd_isactiver;
+ uint32_t *gicd_ipriorityr;
+ uint32_t *gicd_itargetsr;
+ uint32_t *gicd_icfgr;
+};
+
+static struct gicv2_context gicv2_context;
+
+static int gicv2_suspend(void)
+{
+ unsigned int i;
+
+ if ( !gicv2_context.gicd_isenabler )
+ return -ENOMEM;
+
+ /* Save GICC configuration */
+ gicv2_context.gicc_ctlr = readl_gicc(GICC_CTLR);
+ gicv2_context.gicc_pmr = readl_gicc(GICC_PMR);
+ gicv2_context.gicc_bpr = readl_gicc(GICC_BPR);
+
+ /* Save GICD configuration */
+ gicv2_context.gicd_ctlr = readl_gicd(GICD_CTLR);
+
+ for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ )
+ gicv2_context.gicd_isenabler[i] = readl_gicd(GICD_ISENABLER + i * 4);
+
+ for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ )
+ gicv2_context.gicd_isactiver[i] = readl_gicd(GICD_ISACTIVER + i * 4);
+
+ for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
+ gicv2_context.gicd_ipriorityr[i] = readl_gicd(GICD_IPRIORITYR + i * 4);
+
+ for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
+ gicv2_context.gicd_itargetsr[i] = readl_gicd(GICD_ITARGETSR + i * 4);
+
+ for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 16); i++ )
+ gicv2_context.gicd_icfgr[i] = readl_gicd(GICD_ICFGR + i * 4);
+
+ return 0;
+}
+
+static void gicv2_resume(void)
+{
+ unsigned int i;
+
+ if ( !gicv2_context.gicd_isenabler )
+ return;
+
+ /* Disable CPU interface and distributor */
+ writel_gicc(0, GICC_CTLR);
+ writel_gicd(0, GICD_CTLR);
+
+ /* Restore GICD configuration */
+ for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ ) {
+ writel_gicd(0xffffffff, GICD_ICENABLER + i * 4);
+ writel_gicd(gicv2_context.gicd_isenabler[i], GICD_ISENABLER + i * 4);
+ }
+
+ for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 32); i++ ) {
+ writel_gicd(0xffffffff, GICD_ICACTIVER + i * 4);
+ writel_gicd(gicv2_context.gicd_isactiver[i], GICD_ISACTIVER + i * 4);
+ }
+
+ for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
+ writel_gicd(gicv2_context.gicd_ipriorityr[i], GICD_IPRIORITYR + i * 4);
+
+ for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 4); i++ )
+ writel_gicd(gicv2_context.gicd_itargetsr[i], GICD_ITARGETSR + i * 4);
+
+ for ( i = 0; i < DIV_ROUND_UP(gicv2_info.nr_lines, 16); i++ )
+ writel_gicd(gicv2_context.gicd_icfgr[i], GICD_ICFGR + i * 4);
+
+ /* Make sure all registers are restored and enable distributor */
+ writel_gicd(gicv2_context.gicd_ctlr | GICD_CTL_ENABLE, GICD_CTLR);
+
+ /* Restore GIC CPU interface configuration */
+ writel_gicc(gicv2_context.gicc_pmr, GICC_PMR);
+ writel_gicc(gicv2_context.gicc_bpr, GICC_BPR);
+
+ /* Enable GIC CPU interface */
+ writel_gicc(gicv2_context.gicc_ctlr | GICC_CTL_ENABLE | GICC_CTL_EOI,
+ GICC_CTLR);
+}
+
+static void gicv2_alloc_context(struct gicv2_context *gc)
+{
+ uint32_t n = gicv2_info.nr_lines;
+
+ gc->gicd_isenabler = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 32));
+ if ( !gc->gicd_isenabler )
+ goto err_free;
+
+ gc->gicd_isactiver = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 32));
+ if ( !gc->gicd_isactiver )
+ goto err_free;
+
+ gc->gicd_itargetsr = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 4));
+ if ( !gc->gicd_itargetsr )
+ goto err_free;
+
+ gc->gicd_ipriorityr = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 4));
+ if ( !gc->gicd_ipriorityr )
+ goto err_free;
+
+ gc->gicd_icfgr = xzalloc_array(uint32_t, DIV_ROUND_UP(n, 16));
+ if ( !gc->gicd_icfgr )
+ goto err_free;
+
+ return;
+
+ err_free:
+ xfree(gc->gicd_icfgr);
+ xfree(gc->gicd_ipriorityr);
+ xfree(gc->gicd_itargetsr);
+ xfree(gc->gicd_isactiver);
+ xfree(gc->gicd_isenabler);
+
+ memset(gc, 0, sizeof(*gc));
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
#ifdef CONFIG_ACPI
static unsigned long gicv2_get_hwdom_extra_madt_size(const struct domain *d)
{
@@ -1302,6 +1435,11 @@ static int __init gicv2_init(void)
spin_unlock(&gicv2.lock);
+#ifdef CONFIG_SYSTEM_SUSPEND
+ /* Allocate memory to be used for saving GIC context during the suspend */
+ gicv2_alloc_context(&gicv2_context);
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
return 0;
}
@@ -1345,6 +1483,10 @@ const static struct gic_hw_operations gicv2_ops = {
.map_hwdom_extra_mappings = gicv2_map_hwdom_extra_mappings,
.iomem_deny_access = gicv2_iomem_deny_access,
.do_LPI = gicv2_do_LPI,
+#ifdef CONFIG_SYSTEM_SUSPEND
+ .suspend = gicv2_suspend,
+ .resume = gicv2_resume,
+#endif /* CONFIG_SYSTEM_SUSPEND */
};
/* Set up the GIC */
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index acf61a4de3..1c974cf0f5 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -422,6 +422,35 @@ int gic_iomem_deny_access(struct domain *d)
return gic_hw_ops->iomem_deny_access(d);
}
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+int gic_suspend(void)
+{
+ /* Must be called by boot CPU#0 with interrupts disabled */
+ ASSERT(!local_irq_is_enabled());
+ ASSERT(!smp_processor_id());
+
+ if ( !gic_hw_ops->suspend || !gic_hw_ops->resume )
+ return -ENOSYS;
+
+ return gic_hw_ops->suspend();
+}
+
+void gic_resume(void)
+{
+ /*
+ * Must be called by boot CPU#0 with interrupts disabled after gic_suspend
+ * has returned successfully.
+ */
+ ASSERT(!local_irq_is_enabled());
+ ASSERT(!smp_processor_id());
+ ASSERT(gic_hw_ops->resume);
+
+ gic_hw_ops->resume();
+}
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
static int cpu_gic_callback(struct notifier_block *nfb,
unsigned long action,
void *hcpu)
diff --git a/xen/arch/arm/include/asm/gic.h b/xen/arch/arm/include/asm/gic.h
index 541f0eeb80..a706303008 100644
--- a/xen/arch/arm/include/asm/gic.h
+++ b/xen/arch/arm/include/asm/gic.h
@@ -280,6 +280,12 @@ extern int gicv_setup(struct domain *d);
extern void gic_save_state(struct vcpu *v);
extern void gic_restore_state(struct vcpu *v);
+#ifdef CONFIG_SYSTEM_SUSPEND
+/* Suspend/resume */
+extern int gic_suspend(void);
+extern void gic_resume(void);
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
/* SGI (AKA IPIs) */
enum gic_sgi {
GIC_SGI_EVENT_CHECK,
@@ -395,6 +401,12 @@ struct gic_hw_operations {
int (*iomem_deny_access)(struct domain *d);
/* Handle LPIs, which require special handling */
void (*do_LPI)(unsigned int lpi);
+#ifdef CONFIG_SYSTEM_SUSPEND
+ /* Save GIC configuration due to the system suspend */
+ int (*suspend)(void);
+ /* Restore GIC configuration due to the system resume */
+ void (*resume)(void);
+#endif /* CONFIG_SYSTEM_SUSPEND */
};
extern const struct gic_hw_operations *gic_hw_ops;
--
2.43.0
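
Two details of the restore path above may be easier to see in isolation: why gicv2_resume() writes all-ones to the clear register before writing the saved value to the set register, and how the per-bank array sizes follow from the number of bits each interrupt occupies. The following is a minimal sketch under stated assumptions: struct mock_enable_bank and all mock_* functions are hypothetical and only model the set/clear register semantics (1 bits take effect, 0 bits are ignored); they are not part of the patch or of Xen.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one bank covering 32 interrupt enable bits. */
struct mock_enable_bank {
    uint32_t enabled;
};

/* Set-enable semantics: 1 bits enable, 0 bits are ignored. */
void mock_write_isenabler(struct mock_enable_bank *b, uint32_t val)
{
    assert(b);
    b->enabled |= val;
}

/* Clear-enable semantics: 1 bits disable, 0 bits are ignored. */
void mock_write_icenabler(struct mock_enable_bank *b, uint32_t val)
{
    assert(b);
    b->enabled &= ~val;
}

/* Reading the set-enable register returns the current enable state. */
uint32_t mock_read_isenabler(const struct mock_enable_bank *b)
{
    assert(b);
    return b->enabled;
}

/* Same order as gicv2_resume(): clear everything, then set the saved bits. */
void mock_restore_enable(struct mock_enable_bank *b, uint32_t saved)
{
    mock_write_icenabler(b, 0xffffffffu);
    mock_write_isenabler(b, saved);
}

/*
 * Number of 32-bit words needed to save a register bank that uses
 * bits_per_irq bits per interrupt (1 for enable/active, 8 for priority
 * and target, 2 for configuration).
 */
unsigned int mock_bank_words(unsigned int nr_lines, unsigned int bits_per_irq)
{
    unsigned int irqs_per_word;

    assert(bits_per_irq > 0 && bits_per_irq <= 32);
    irqs_per_word = 32 / bits_per_irq;
    return (nr_lines + irqs_per_word - 1) / irqs_per_word;
}
```

Without the clearing write, any enable bit that happens to be set when the GIC comes back would survive the restore, because a set-enable write cannot turn a bit off. The word counts match the DIV_ROUND_UP(n, 32), DIV_ROUND_UP(n, 4) and DIV_ROUND_UP(n, 16) expressions used in gicv2_alloc_context().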
* [PATCH 11/16] xen/arm: Implement PSCI system suspend
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (9 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 10/16] xen/arm: Implement GIC suspend/resume functions (gicv2 only) Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-11 21:49 ` Julien Grall
2025-03-21 14:46 ` Grygorii Strashko
2025-03-05 9:11 ` [PATCH 12/16] xen/arm: Trigger Xen suspend when hardware domain completes suspend Mykola Kvach
` (6 subsequent siblings)
17 siblings, 2 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Mirela Simonovic, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi,
Mykyta Poturai, Mykola Kvach
From: Mirela Simonovic <mirela.simonovic@aggios.com>
The implementation consists of:
- Adding PSCI system suspend call as new PSCI function
- Trapping PSCI system_suspend HVC
- Implementing PSCI system suspend call (virtual interface that allows
  guests to suspend themselves), but currently it is only partially
  implemented, so suspend/resume will correctly work only for dom0
The PSCI system suspend should be called by a guest from its boot
VCPU. Non-boot VCPUs of the guest should be hot-unplugged using PSCI
CPU_OFF call prior to issuing PSCI system suspend. Interrupts that
are left enabled by the guest are assumed to be its wake-up interrupts.
Therefore, a wake-up interrupt triggers the resume of the guest. Guest
should resume regardless of the state of Xen (suspended or not).
When a guest calls PSCI system suspend the respective domain will be
suspended if the following conditions are met:
1) Given resume entry point is not invalid
2) Other (if any) VCPUs of the calling guest are hot-unplugged
If the conditions above are met the calling domain is labeled as
suspended and the calling VCPU is blocked. If nothing else were
done, the suspended domain would resume from the place where it
called PSCI system suspend. This is expected if processing of the PSCI
system suspend call fails. However, in the case of success the calling
guest should resume (continue execution after the wake-up) from the entry
point which is given as the first argument of the PSCI system suspend
call. In addition to the entry point, the guest expects to start within
the environment whose state matches the state after reset. This means
that the guest should find reset register values, MMU disabled, etc.
Thereby, the context of VCPU should be 'reset' (as if the system is
coming out of reset), the program counter should contain entry point,
which is 1st argument, and r0/x0 should contain context ID which is 2nd
argument of PSCI system suspend call. The context of VCPU is set
accordingly when the PSCI system suspend is processed, so that nothing
needs to be done on resume/wake-up path.
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in V3:
Dropped all domain flags and related code (which touched common functions like
vcpu_unblock), keeping only the necessary changes for Xen suspend/resume, i.e.
suspend/resume is now fully supported only for the hardware domain.
Proper support for domU suspend/resume will be added in a future patch.
This patch does not yet include VCPU context reset or domain context
restoration in VCPU.
---
xen/arch/arm/Makefile | 1 +
xen/arch/arm/include/asm/domain.h | 3 ++
xen/arch/arm/include/asm/perfc_defn.h | 1 +
xen/arch/arm/include/asm/psci.h | 2 +
xen/arch/arm/include/asm/suspend.h | 18 +++++++
xen/arch/arm/suspend.c | 67 +++++++++++++++++++++++++++
xen/arch/arm/vpsci.c | 32 +++++++++++++
7 files changed, 124 insertions(+)
create mode 100644 xen/arch/arm/include/asm/suspend.h
create mode 100644 xen/arch/arm/suspend.c
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 43ab5e8f25..70d4b5daf8 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -53,6 +53,7 @@ obj-y += smpboot.o
obj-$(CONFIG_STATIC_EVTCHN) += static-evtchn.init.o
obj-$(CONFIG_STATIC_MEMORY) += static-memory.init.o
obj-$(CONFIG_STATIC_SHM) += static-shmem.init.o
+obj-$(CONFIG_SYSTEM_SUSPEND) += suspend.o
obj-y += sysctl.o
obj-y += time.o
obj-y += traps.o
diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
index 50b6a4b009..8b1bdf3d74 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -233,6 +233,9 @@ struct arch_vcpu
struct vtimer virt_timer;
bool vtimer_initialized;
+ register_t suspend_ep;
+ register_t suspend_cid;
+
/*
* The full P2M may require some cleaning (e.g when emulation
* set/way). As the action can take a long time, it requires
diff --git a/xen/arch/arm/include/asm/perfc_defn.h b/xen/arch/arm/include/asm/perfc_defn.h
index 3ab0391175..5049563718 100644
--- a/xen/arch/arm/include/asm/perfc_defn.h
+++ b/xen/arch/arm/include/asm/perfc_defn.h
@@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset, "vpsci: system_reset")
PERFCOUNTER(vpsci_cpu_suspend, "vpsci: cpu_suspend")
PERFCOUNTER(vpsci_cpu_affinity_info, "vpsci: cpu_affinity_info")
PERFCOUNTER(vpsci_features, "vpsci: features")
+PERFCOUNTER(vpsci_system_suspend, "vpsci: system_suspend")
PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
index 4780972621..48a93e6b79 100644
--- a/xen/arch/arm/include/asm/psci.h
+++ b/xen/arch/arm/include/asm/psci.h
@@ -47,10 +47,12 @@ void call_psci_system_reset(void);
#define PSCI_0_2_FN32_SYSTEM_OFF PSCI_0_2_FN32(8)
#define PSCI_0_2_FN32_SYSTEM_RESET PSCI_0_2_FN32(9)
#define PSCI_1_0_FN32_PSCI_FEATURES PSCI_0_2_FN32(10)
+#define PSCI_1_0_FN32_SYSTEM_SUSPEND PSCI_0_2_FN32(14)
#define PSCI_0_2_FN64_CPU_SUSPEND PSCI_0_2_FN64(1)
#define PSCI_0_2_FN64_CPU_ON PSCI_0_2_FN64(3)
#define PSCI_0_2_FN64_AFFINITY_INFO PSCI_0_2_FN64(4)
+#define PSCI_1_0_FN64_SYSTEM_SUSPEND PSCI_0_2_FN64(14)
/* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
#define PSCI_0_2_AFFINITY_LEVEL_ON 0
diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
new file mode 100644
index 0000000000..745377dbcf
--- /dev/null
+++ b/xen/arch/arm/include/asm/suspend.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __ASM_ARM_SUSPEND_H__
+#define __ASM_ARM_SUSPEND_H__
+
+int32_t domain_suspend(register_t epoint, register_t cid);
+
+#endif
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
new file mode 100644
index 0000000000..27fab8c999
--- /dev/null
+++ b/xen/arch/arm/suspend.c
@@ -0,0 +1,67 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/sched.h>
+#include <asm/cpufeature.h>
+#include <asm/event.h>
+#include <asm/psci.h>
+#include <asm/suspend.h>
+#include <asm/platform.h>
+#include <public/sched.h>
+
+static void vcpu_suspend_prepare(register_t epoint, register_t cid)
+{
+ struct vcpu *v = current;
+
+ v->arch.suspend_ep = epoint;
+ v->arch.suspend_cid = cid;
+}
+
+int32_t domain_suspend(register_t epoint, register_t cid)
+{
+ struct vcpu *v;
+ struct domain *d = current->domain;
+ bool is_thumb = epoint & 1;
+
+ dprintk(XENLOG_DEBUG,
+ "Dom%d suspend: epoint=0x%"PRIregister", cid=0x%"PRIregister"\n",
+ d->domain_id, epoint, cid);
+
+ /* THUMB set is not allowed with 64-bit domain */
+ if ( is_64bit_domain(d) && is_thumb )
+ return PSCI_INVALID_ADDRESS;
+
+ /* TODO: care about locking here */
+ /* Ensure that all CPUs other than the calling one are offline */
+ for_each_vcpu ( d, v )
+ {
+ if ( v != current && is_vcpu_online(v) )
+ return PSCI_DENIED;
+ }
+
+ /*
+ * Prepare the calling VCPU for suspend (save entry point into pc and
+ * context ID into r0/x0 as specified by PSCI SYSTEM_SUSPEND)
+ */
+ vcpu_suspend_prepare(epoint, cid);
+
+ /* Disable watchdogs of this domain */
+ watchdog_domain_suspend(d);
+
+ /*
+ * The calling domain is suspended by blocking its last running VCPU. If an
+ * event is pending the domain will resume right away (VCPU will not block,
+ * but when scheduled in it will resume from the given entry point).
+ */
+ vcpu_block_unless_event_pending(current);
+
+ return PSCI_SUCCESS;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
index d1615be8a6..96eef06c18 100644
--- a/xen/arch/arm/vpsci.c
+++ b/xen/arch/arm/vpsci.c
@@ -7,6 +7,7 @@
#include <asm/vgic.h>
#include <asm/vpsci.h>
#include <asm/event.h>
+#include <asm/suspend.h>
#include <public/sched.h>
@@ -197,6 +198,15 @@ static void do_psci_0_2_system_reset(void)
domain_shutdown(d,SHUTDOWN_reboot);
}
+static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
+{
+#ifdef CONFIG_SYSTEM_SUSPEND
+ return domain_suspend(epoint, cid);
+#else
+ return PSCI_NOT_SUPPORTED;
+#endif
+}
+
static int32_t do_psci_1_0_features(uint32_t psci_func_id)
{
/* /!\ Ordered by function ID and not name */
@@ -214,6 +224,8 @@ static int32_t do_psci_1_0_features(uint32_t psci_func_id)
case PSCI_0_2_FN32_SYSTEM_OFF:
case PSCI_0_2_FN32_SYSTEM_RESET:
case PSCI_1_0_FN32_PSCI_FEATURES:
+ case PSCI_1_0_FN32_SYSTEM_SUSPEND:
+ case PSCI_1_0_FN64_SYSTEM_SUSPEND:
case ARM_SMCCC_VERSION_FID:
return 0;
default:
@@ -344,6 +356,26 @@ bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid)
return true;
}
+ case PSCI_1_0_FN32_SYSTEM_SUSPEND:
+ case PSCI_1_0_FN64_SYSTEM_SUSPEND:
+ {
+ register_t epoint = PSCI_ARG(regs,1);
+ register_t cid = PSCI_ARG(regs,2);
+ register_t ret;
+
+ perfc_incr(vpsci_system_suspend);
+ /* Set the result to PSCI_SUCCESS if the call fails.
+ * Otherwise preserve the context_id in x0. For now
+ * we don't support the case where the system is suspended
+ * to a shallower level and PSCI_SUCCESS is returned to the
+ * caller.
+ */
+ ret = do_psci_1_0_system_suspend(epoint, cid);
+ if ( ret != PSCI_SUCCESS )
+ PSCI_SET_RESULT(regs, ret);
+ return true;
+ }
+
default:
return false;
}
--
2.43.0
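
The two acceptance checks performed by domain_suspend() in this patch can be sketched in isolation as follows. This is a minimal model, assuming a simplified domain: struct mock_vcpu, struct mock_domain and mock_system_suspend_check() are hypothetical names introduced for illustration and do not exist in Xen. The numeric return codes are the ones defined by the PSCI specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* PSCI return codes, as defined by the PSCI specification. */
#define PSCI_SUCCESS           0
#define PSCI_DENIED          (-3)
#define PSCI_INVALID_ADDRESS (-9)

#define MOCK_MAX_VCPUS 4

/* Hypothetical, simplified vCPU and domain model. */
struct mock_vcpu {
    bool online;
};

struct mock_domain {
    bool is_64bit;
    size_t nr_vcpus;
    struct mock_vcpu vcpu[MOCK_MAX_VCPUS];
};

/*
 * Mirrors the order of checks in domain_suspend():
 *  1. a 64-bit domain must not pass an entry point with bit 0 (Thumb) set;
 *  2. every vCPU other than the caller must already be offline.
 */
int32_t mock_system_suspend_check(const struct mock_domain *d, size_t caller,
                                  uint64_t epoint)
{
    size_t i;

    assert(d && d->nr_vcpus <= MOCK_MAX_VCPUS && caller < d->nr_vcpus);

    if ( d->is_64bit && (epoint & 1) )
        return PSCI_INVALID_ADDRESS;

    for ( i = 0; i < d->nr_vcpus; i++ )
    {
        if ( i != caller && d->vcpu[i].online )
            return PSCI_DENIED;
    }

    return PSCI_SUCCESS;
}
```

The model deliberately leaves out locking, which the patch itself marks as a TODO, so it says nothing about a vCPU being brought online concurrently with the check.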
* Re: [PATCH 11/16] xen/arm: Implement PSCI system suspend
2025-03-05 9:11 ` [PATCH 11/16] xen/arm: Implement PSCI system suspend Mykola Kvach
@ 2025-03-11 21:49 ` Julien Grall
2025-03-21 14:46 ` Grygorii Strashko
1 sibling, 0 replies; 69+ messages in thread
From: Julien Grall @ 2025-03-11 21:49 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Mirela Simonovic, Stefano Stabellini, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi, Mykyta Poturai,
Mykola Kvach
Hi,
On 05/03/2025 09:11, Mykola Kvach wrote:
> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>
> The implementation consists of:
> -Adding PSCI system suspend call as new PSCI function
> -Trapping PSCI system_suspend HVC
> -Implementing PSCI system suspend call (virtual interface that allows
> guests to suspend themselves), but currently it is only partially
> implemented, so suspend/resume will correctly work only for dom0
What is missing for other domains?
>
> The PSCI system suspend should be called by a guest from its boot
> VCPU. Non-boot VCPUs of the guest should be hot-unplugged using PSCI
> CPU_OFF call prior to issuing PSCI system suspend. Interrupts that
> are left enabled by the guest are assumed to be its wake-up interrupts.
> Therefore, a wake-up interrupt triggers the resume of the guest. Guest
> should resume regardless of the state of Xen (suspended or not).
>
> When a guest calls PSCI system suspend the respective domain will be
> suspended if the following conditions are met:
> 1) Given resume entry point is not invalid
> 2) Other (if any) VCPUs of the calling guest are hot-unplugged
>
> If the conditions above are met the calling domain is labeled as
> suspended and the calling VCPU is blocked. If nothing else wouldn't
> be done the suspended domain would resume from the place where it
> called PSCI system suspend. This is expected if processing of the PSCI
> system suspend call fails. However, in the case of success the calling
> guest should resume (continue execution after the wake-up) from the entry
> point which is given as the first argument of the PSCI system suspend
> call. In addition to the entry point, the guest expects to start within
> the environment whose state matches the state after reset. This means
> that the guest should find reset register values, MMU disabled, etc.
> Thereby, the context of VCPU should be 'reset' (as if the system is
> comming out of reset), the program counter should contain entry point,
> which is 1st argument, and r0/x0 should contain context ID which is 2nd
> argument of PSCI system suspend call. The context of VCPU is set
> accordingly when the PSCI system suspend is processed, so that nothing
> needs to be done on resume/wake-up path.
>
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in V3:
> Dropped all domain flags and related code (which touched common functions like
> vcpu_unblock), keeping only the necessary changes for Xen suspend/resume, i.e.
> suspend/resume is now fully supported only for the hardware domain.
> Proper support for domU suspend/resume will be added in a future patch.
> This patch does not yet include VCPU context reset or domain context
> restoration in VCPU.
> ---
> xen/arch/arm/Makefile | 1 +
> xen/arch/arm/include/asm/domain.h | 3 ++
> xen/arch/arm/include/asm/perfc_defn.h | 1 +
> xen/arch/arm/include/asm/psci.h | 2 +
> xen/arch/arm/include/asm/suspend.h | 18 +++++++
> xen/arch/arm/suspend.c | 67 +++++++++++++++++++++++++++
> xen/arch/arm/vpsci.c | 32 +++++++++++++
> 7 files changed, 124 insertions(+)
> create mode 100644 xen/arch/arm/include/asm/suspend.h
> create mode 100644 xen/arch/arm/suspend.c
>
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 43ab5e8f25..70d4b5daf8 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -53,6 +53,7 @@ obj-y += smpboot.o
> obj-$(CONFIG_STATIC_EVTCHN) += static-evtchn.init.o
> obj-$(CONFIG_STATIC_MEMORY) += static-memory.init.o
> obj-$(CONFIG_STATIC_SHM) += static-shmem.init.o
> +obj-$(CONFIG_SYSTEM_SUSPEND) += suspend.o
I am a bit confused why we are tying guest suspend/resume to system
suspend/resume. Shouldn't they be separate?
> obj-y += sysctl.o
> obj-y += time.o
> obj-y += traps.o
> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> index 50b6a4b009..8b1bdf3d74 100644
> --- a/xen/arch/arm/include/asm/domain.h
> +++ b/xen/arch/arm/include/asm/domain.h
> @@ -233,6 +233,9 @@ struct arch_vcpu
> struct vtimer virt_timer;
> bool vtimer_initialized;
>
> + register_t suspend_ep;
> + register_t suspend_cid;
> +
> /*
> * The full P2M may require some cleaning (e.g when emulation
> * set/way). As the action can take a long time, it requires
> diff --git a/xen/arch/arm/include/asm/perfc_defn.h b/xen/arch/arm/include/asm/perfc_defn.h
> index 3ab0391175..5049563718 100644
> --- a/xen/arch/arm/include/asm/perfc_defn.h
> +++ b/xen/arch/arm/include/asm/perfc_defn.h
> @@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset, "vpsci: system_reset")
> PERFCOUNTER(vpsci_cpu_suspend, "vpsci: cpu_suspend")
> PERFCOUNTER(vpsci_cpu_affinity_info, "vpsci: cpu_affinity_info")
> PERFCOUNTER(vpsci_features, "vpsci: features")
> +PERFCOUNTER(vpsci_system_suspend, "vpsci: system_suspend")
>
> PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
>
> diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
> index 4780972621..48a93e6b79 100644
> --- a/xen/arch/arm/include/asm/psci.h
> +++ b/xen/arch/arm/include/asm/psci.h
> @@ -47,10 +47,12 @@ void call_psci_system_reset(void);
> #define PSCI_0_2_FN32_SYSTEM_OFF PSCI_0_2_FN32(8)
> #define PSCI_0_2_FN32_SYSTEM_RESET PSCI_0_2_FN32(9)
> #define PSCI_1_0_FN32_PSCI_FEATURES PSCI_0_2_FN32(10)
> +#define PSCI_1_0_FN32_SYSTEM_SUSPEND PSCI_0_2_FN32(14)
>
> #define PSCI_0_2_FN64_CPU_SUSPEND PSCI_0_2_FN64(1)
> #define PSCI_0_2_FN64_CPU_ON PSCI_0_2_FN64(3)
> #define PSCI_0_2_FN64_AFFINITY_INFO PSCI_0_2_FN64(4)
> +#define PSCI_1_0_FN64_SYSTEM_SUSPEND PSCI_0_2_FN64(14)
>
> /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
> #define PSCI_0_2_AFFINITY_LEVEL_ON 0
> diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
> new file mode 100644
> index 0000000000..745377dbcf
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/suspend.h
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef __ASM_ARM_SUSPEND_H__
> +#define __ASM_ARM_SUSPEND_H__
> +
> +int32_t domain_suspend(register_t epoint, register_t cid);
> +
> +#endif
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> new file mode 100644
> index 0000000000..27fab8c999
> --- /dev/null
> +++ b/xen/arch/arm/suspend.c
> @@ -0,0 +1,67 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <xen/sched.h>
> +#include <asm/cpufeature.h>
> +#include <asm/event.h>
> +#include <asm/psci.h>
> +#include <asm/suspend.h>
> +#include <asm/platform.h>
> +#include <public/sched.h>
> +
> +static void vcpu_suspend_prepare(register_t epoint, register_t cid)
> +{
> + struct vcpu *v = current;
> +
> + v->arch.suspend_ep = epoint;
> + v->arch.suspend_cid = cid;
> +}
> +
> +int32_t domain_suspend(register_t epoint, register_t cid)
> +{
> + struct vcpu *v;
> + struct domain *d = current->domain;
> + bool is_thumb = epoint & 1;
> +
> + dprintk(XENLOG_DEBUG,
> + "Dom%d suspend: epoint=0x%"PRIregister", cid=0x%"PRIregister"\n",
> + d->domain_id, epoint, cid);
> +
> + /* THUMB set is not allowed with 64-bit domain */
> + if ( is_64bit_domain(d) && is_thumb )
> + return PSCI_INVALID_ADDRESS;
> +
> + /* TODO: care about locking here */
What's the plan for this?
> + /* Ensure that all CPUs other than the calling one are offline */
> + for_each_vcpu ( d, v )
> + {
> + if ( v != current && is_vcpu_online(v) )
> + return PSCI_DENIED;
> + }
> +
> + /*
> + * Prepare the calling VCPU for suspend (save entry point into pc and
> + * context ID into r0/x0 as specified by PSCI SYSTEM_SUSPEND)
> + */
Looking at the code, it doesn't save the value into pc and x0/r0.
Instead, it is stashing it into different fields.
> + vcpu_suspend_prepare(epoint, cid);
> +
> + /* Disable watchdogs of this domain */
> + watchdog_domain_suspend(d);
At least for guests, I was expecting psci_suspend to behave very
similarly to domain_shutdown(SHUTDOWN_suspend) so it can be used with
"xl suspend/resume". Is there any reason we are diverging?
> +
> + /*
> + * The calling domain is suspended by blocking its last running VCPU. If an
> + * event is pending the domain will resume right away (VCPU will not block,
> + * but when scheduled in it will resume from the given entry point).
> + */
Looking at the code, you don't seem to set x0, pc or even reset the vCPU
unless the platform suspends. So are you sure the suspend will work
properly if there is an event pending?
> + vcpu_block_unless_event_pending(current);
> +
> + return PSCI_SUCCESS;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
> index d1615be8a6..96eef06c18 100644
> --- a/xen/arch/arm/vpsci.c
> +++ b/xen/arch/arm/vpsci.c
> @@ -7,6 +7,7 @@
> #include <asm/vgic.h>
> #include <asm/vpsci.h>
> #include <asm/event.h>
> +#include <asm/suspend.h>
>
> #include <public/sched.h>
>
> @@ -197,6 +198,15 @@ static void do_psci_0_2_system_reset(void)
> domain_shutdown(d,SHUTDOWN_reboot);
> }
>
> +static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
> +{
> +#ifdef CONFIG_SYSTEM_SUSPEND
> + return domain_suspend(epoint, cid);
> +#else
> + return PSCI_NOT_SUPPORTED;
> +#endif
> +}
> +
> static int32_t do_psci_1_0_features(uint32_t psci_func_id)
> {
> /* /!\ Ordered by function ID and not name */
> @@ -214,6 +224,8 @@ static int32_t do_psci_1_0_features(uint32_t psci_func_id)
> case PSCI_0_2_FN32_SYSTEM_OFF:
> case PSCI_0_2_FN32_SYSTEM_RESET:
> case PSCI_1_0_FN32_PSCI_FEATURES:
> + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> case ARM_SMCCC_VERSION_FID:
> return 0;
> default:
> @@ -344,6 +356,26 @@ bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid)
> return true;
> }
>
> + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> + {
> + register_t epoint = PSCI_ARG(regs,1);
> + register_t cid = PSCI_ARG(regs,2);
> + register_t ret;
> +
> + perfc_incr(vpsci_system_suspend);
> + /* Set the result to PSCI_SUCCESS if the call fails.
I don't understand the comment. Below, you will set "ret" only if the
call doesn't return SUCCESS.
Also, coding style:
/*
* Foo ...
* Bar ...
*/
> + * Otherwise preserve the context_id in x0. For now
Per above, I don't think this is correct.
> + * we don't support the case where the system is suspended
> + * to a shallower level and PSCI_SUCCESS is returned to the
> + * caller.
I am confused. Per the specification, PSCI_SYSTEM_SUSPEND cannot return
PSCI_SUCCESS if the call is successful. Any chance you are confusing
this with CPU_SUSPEND?
> + */
> + ret = do_psci_1_0_system_suspend(epoint, cid);
> + if ( ret != PSCI_SUCCESS )
> + PSCI_SET_RESULT(regs, ret);
> + return true;
> + }
> +
> default:
> return false;
> }
Cheers,
--
Julien Grall
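
The return-path semantics discussed in this review can be sketched as follows: on failure the error code goes back to the caller in x0 and execution continues after the call, while on success the call does not return and the vCPU later starts at the entry point with the context ID in x0. This is a minimal sketch, assuming a two-register view of the vCPU: struct mock_regs and mock_finish_system_suspend() are hypothetical, the sign extension of the error code into x0 is an assumption of the model, and a real implementation would also have to reset the remaining vCPU state (system registers, MMU off, and so on).

```c
#include <assert.h>
#include <stdint.h>

#define PSCI_SUCCESS 0

/* Hypothetical, minimal view of the calling vCPU's registers. */
struct mock_regs {
    uint64_t pc;
    uint64_t x0;
};

/*
 * Apply the outcome of a SYSTEM_SUSPEND request to the caller's registers.
 * ret is the PSCI status, epoint and cid are the two call arguments.
 */
void mock_finish_system_suspend(struct mock_regs *regs, int32_t ret,
                                uint64_t epoint, uint64_t cid)
{
    assert(regs);

    if ( ret != PSCI_SUCCESS )
    {
        /* Failure: report the error in x0, leave pc so the guest continues. */
        regs->x0 = (uint64_t)(int64_t)ret;
        return;
    }

    /* Success: the guest resumes at the entry point with x0 = context ID. */
    regs->pc = epoint;
    regs->x0 = cid;
}
```

With this shape, a vCPU that never actually blocks because an event is already pending would still observe the entry point and context ID, which is the case the review asks about.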
* Re: [PATCH 11/16] xen/arm: Implement PSCI system suspend
2025-03-05 9:11 ` [PATCH 11/16] xen/arm: Implement PSCI system suspend Mykola Kvach
2025-03-11 21:49 ` Julien Grall
@ 2025-03-21 14:46 ` Grygorii Strashko
1 sibling, 0 replies; 69+ messages in thread
From: Grygorii Strashko @ 2025-03-21 14:46 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Saeed Nowshadi, Mykyta Poturai, Mykola Kvach
Hi Mykola,
On 05.03.25 11:11, Mykola Kvach wrote:
> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>
> The implementation consists of:
> -Adding PSCI system suspend call as new PSCI function
> -Trapping PSCI system_suspend HVC
> -Implementing PSCI system suspend call (virtual interface that allows
> guests to suspend themselves), but currently it is only partially
> implemented, so suspend/resume will correctly work only for dom0
>
> The PSCI system suspend should be called by a guest from its boot
> VCPU. Non-boot VCPUs of the guest should be hot-unplugged using PSCI
> CPU_OFF call prior to issuing PSCI system suspend. Interrupts that
> are left enabled by the guest are assumed to be its wake-up interrupts.
> Therefore, a wake-up interrupt triggers the resume of the guest.
> Guest
> should resume regardless of the state of Xen (suspended or not).
This is a strange statement: nothing can be resumed unless Xen itself is
resumed.
>
> When a guest calls PSCI system suspend the respective domain will be
> suspended if the following conditions are met:
> 1) Given resume entry point is not invalid
I think you meant "is valid" here.
> 2) Other (if any) VCPUs of the calling guest are hot-unplugged
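The two conditions above can be sketched as a standalone check. This is only an illustration under stated assumptions: the sketch_* types and names are hypothetical and are not Xen's structures; only the numeric PSCI return codes are taken from the PSCI specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* PSCI return codes, numeric values as given in the PSCI specification. */
#define SKETCH_PSCI_SUCCESS          0
#define SKETCH_PSCI_DENIED          (-3)
#define SKETCH_PSCI_INVALID_ADDRESS (-9)

/* Hypothetical, simplified stand-ins for a vCPU and a domain. */
struct sketch_vcpu {
    bool online;
};

struct sketch_domain {
    bool is_64bit;
    size_t nr_vcpus;
    struct sketch_vcpu vcpu[4];
};

/*
 * Check whether the vCPU with index 'caller' may suspend its domain and
 * resume at 'epoint'.
 */
static int32_t sketch_check_suspend(const struct sketch_domain *d,
                                    size_t caller, uint64_t epoint)
{
    size_t i;

    /* Bit 0 of the entry point requests Thumb, valid only for 32-bit guests. */
    if ( d->is_64bit && (epoint & 1) )
        return SKETCH_PSCI_INVALID_ADDRESS;

    /* Every vCPU other than the caller must already be offline. */
    for ( i = 0; i < d->nr_vcpus; i++ )
        if ( i != caller && d->vcpu[i].online )
            return SKETCH_PSCI_DENIED;

    return SKETCH_PSCI_SUCCESS;
}
```

In this sketch a return of SKETCH_PSCI_SUCCESS only means the request is acceptable; what is then written to x0 (or left untouched) is a separate decision for the caller.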
>
> If the conditions above are met the calling domain is labeled as
> suspended and the calling VCPU is blocked. If nothing else wouldn't
> be done the suspended domain would resume from the place where it
> called PSCI system suspend. This is expected if processing of the PSCI
> system suspend call fails.
Could you please clarify my understanding of the implementation?
Note: the points below relate mostly to Linux on arm64.
1) This patch alone actually enables suspend-to-RAM of guest domains, but not
Xen system suspend-to-RAM.
2) With this patch a domain can actually enter suspend, but
- only from its own console, by issuing "echo mem > /sys/power/state"
-- the guest ends up in Xen by issuing the vPSCI PSCI_1_0_FN64_SYSTEM_SUSPEND HVC
(PSCI is considered the standard suspend mechanism for arm64)
-- Xen blocks the last guest vCPU and the suspend is considered done
3) At this point the guest is suspended, but there is no way to resume it.
4) The Xen remote/external "control" interface is not available, so neither
suspend nor resume can be triggered from the control domain using
"xl suspend/resume".
5) Xen system suspend can happen only from HWDOM, when HWDOM itself is suspended.
The Xen system suspend process ends up issuing PSCI SYSTEM_SUSPEND to the EL3 FW.
There is a prerequisite that all guests, except HWDOM, have already been suspended.
6) No wakeup-source abstraction is defined for guests, so they can be resumed only manually
...
> However, in the case of success the calling
> guest should resume (continue execution after the wake-up) from the entry
> point which is given as the first argument of the PSCI system suspend
> call.
> In addition to the entry point, the guest expects to start within
> the environment whose state matches the state after reset.
> This means
> that the guest should find reset register values, MMU disabled, etc.
> Thereby, the context of VCPU should be 'reset' (as if the system is
> comming out of reset), the program counter should contain entry point,
> which is 1st argument, and r0/x0 should contain context ID which is 2nd
> argument of PSCI system suspend call.
I'm not sure that the above is really needed in the case of Xen guest domains, because:
- RAM state is retained
- a Xen guest runs on a virtual CPU, interrupt controller and timer, none of which
will lose context, unlike on real HW.
As a result, just unblocking the (boot) vCPU will properly resume the guest, and it will
continue execution from the point where it issued the vPSCI SYSTEM_SUSPEND HVC.
Actually, I've tried this:
- applied only patches 6 and 11
- applied the test diff below
- triggered suspend in the guest and then resumed it by sending the Xen "q" command
The guest can wake up
(with no manipulation of the vCPU state).
====
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -361,6 +361,7 @@ static void cf_check dump_domains(unsigned char key)
printk("Notifying guest %d:%d (virq %d, port %d)\n",
d->domain_id, v->vcpu_id,
VIRQ_DEBUG, v->virq_to_evtchn[VIRQ_DEBUG]);
+ vcpu_unblock(v);
send_guest_vcpu_virq(v, VIRQ_DEBUG);
}
}
The above should work nicely for a Xen-unaware guest, but a Xen-aware guest, specifically Linux,
needs to be updated, as the drivers/xen/manage.c code doesn't support the standard suspend-to-RAM
sequence properly: it's tied to hibernation.
It seems "xl suspend/resume" can't be used as is for ARM64, but many parts can probably be reused.
I could be mistaken here - I am still studying the interaction between the control domain, the remote domain and Xen.
One thing I worry about in the case of Linux is that the System PM state change
is triggered by writing to some xenstore property, and this happens inside the kernel,
while Linux, by design, expects System PM state changes only from user space;
at least that is true for suspend-to-RAM, which can be triggered only by:
- writing to /sys/power/state
- auto-suspend + wakelocks.
> The context of VCPU is set
> accordingly when the PSCI system suspend is processed, so that nothing
> needs to be done on resume/wake-up path.
>
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in V3:
> Dropped all domain flags and related code (which touched common functions like
> vcpu_unblock), keeping only the necessary changes for Xen suspend/resume, i.e.
> suspend/resume is now fully supported only for the hardware domain.
> Proper support for domU suspend/resume will be added in a future patch.
> This patch does not yet include VCPU context reset or domain context
> restoration in VCPU.
> ---
> xen/arch/arm/Makefile | 1 +
> xen/arch/arm/include/asm/domain.h | 3 ++
> xen/arch/arm/include/asm/perfc_defn.h | 1 +
> xen/arch/arm/include/asm/psci.h | 2 +
> xen/arch/arm/include/asm/suspend.h | 18 +++++++
> xen/arch/arm/suspend.c | 67 +++++++++++++++++++++++++++
> xen/arch/arm/vpsci.c | 32 +++++++++++++
> 7 files changed, 124 insertions(+)
> create mode 100644 xen/arch/arm/include/asm/suspend.h
> create mode 100644 xen/arch/arm/suspend.c
>
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 43ab5e8f25..70d4b5daf8 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -53,6 +53,7 @@ obj-y += smpboot.o
> obj-$(CONFIG_STATIC_EVTCHN) += static-evtchn.init.o
> obj-$(CONFIG_STATIC_MEMORY) += static-memory.init.o
> obj-$(CONFIG_STATIC_SHM) += static-shmem.init.o
> +obj-$(CONFIG_SYSTEM_SUSPEND) += suspend.o
> obj-y += sysctl.o
> obj-y += time.o
> obj-y += traps.o
> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> index 50b6a4b009..8b1bdf3d74 100644
> --- a/xen/arch/arm/include/asm/domain.h
> +++ b/xen/arch/arm/include/asm/domain.h
> @@ -233,6 +233,9 @@ struct arch_vcpu
> struct vtimer virt_timer;
> bool vtimer_initialized;
>
> + register_t suspend_ep;
> + register_t suspend_cid;
> +
> /*
> * The full P2M may require some cleaning (e.g when emulation
> * set/way). As the action can take a long time, it requires
> diff --git a/xen/arch/arm/include/asm/perfc_defn.h b/xen/arch/arm/include/asm/perfc_defn.h
> index 3ab0391175..5049563718 100644
> --- a/xen/arch/arm/include/asm/perfc_defn.h
> +++ b/xen/arch/arm/include/asm/perfc_defn.h
> @@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset, "vpsci: system_reset")
> PERFCOUNTER(vpsci_cpu_suspend, "vpsci: cpu_suspend")
> PERFCOUNTER(vpsci_cpu_affinity_info, "vpsci: cpu_affinity_info")
> PERFCOUNTER(vpsci_features, "vpsci: features")
> +PERFCOUNTER(vpsci_system_suspend, "vpsci: system_suspend")
>
> PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
>
> diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
> index 4780972621..48a93e6b79 100644
> --- a/xen/arch/arm/include/asm/psci.h
> +++ b/xen/arch/arm/include/asm/psci.h
> @@ -47,10 +47,12 @@ void call_psci_system_reset(void);
> #define PSCI_0_2_FN32_SYSTEM_OFF PSCI_0_2_FN32(8)
> #define PSCI_0_2_FN32_SYSTEM_RESET PSCI_0_2_FN32(9)
> #define PSCI_1_0_FN32_PSCI_FEATURES PSCI_0_2_FN32(10)
> +#define PSCI_1_0_FN32_SYSTEM_SUSPEND PSCI_0_2_FN32(14)
>
> #define PSCI_0_2_FN64_CPU_SUSPEND PSCI_0_2_FN64(1)
> #define PSCI_0_2_FN64_CPU_ON PSCI_0_2_FN64(3)
> #define PSCI_0_2_FN64_AFFINITY_INFO PSCI_0_2_FN64(4)
> +#define PSCI_1_0_FN64_SYSTEM_SUSPEND PSCI_0_2_FN64(14)
>
> /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
> #define PSCI_0_2_AFFINITY_LEVEL_ON 0
> diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
> new file mode 100644
> index 0000000000..745377dbcf
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/suspend.h
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef __ASM_ARM_SUSPEND_H__
> +#define __ASM_ARM_SUSPEND_H__
> +
> +int32_t domain_suspend(register_t epoint, register_t cid);
> +
> +#endif
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> new file mode 100644
> index 0000000000..27fab8c999
> --- /dev/null
> +++ b/xen/arch/arm/suspend.c
> @@ -0,0 +1,67 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <xen/sched.h>
> +#include <asm/cpufeature.h>
> +#include <asm/event.h>
> +#include <asm/psci.h>
> +#include <asm/suspend.h>
> +#include <asm/platform.h>
> +#include <public/sched.h>
> +
> +static void vcpu_suspend_prepare(register_t epoint, register_t cid)
> +{
> + struct vcpu *v = current;
> +
> + v->arch.suspend_ep = epoint;
> + v->arch.suspend_cid = cid;
> +}
> +
> +int32_t domain_suspend(register_t epoint, register_t cid)
> +{
> + struct vcpu *v;
> + struct domain *d = current->domain;
> + bool is_thumb = epoint & 1;
> +
> + dprintk(XENLOG_DEBUG,
> + "Dom%d suspend: epoint=0x%"PRIregister", cid=0x%"PRIregister"\n",
> + d->domain_id, epoint, cid);
> +
> + /* THUMB set is not allowed with 64-bit domain */
> + if ( is_64bit_domain(d) && is_thumb )
> + return PSCI_INVALID_ADDRESS;
> +
> + /* TODO: care about locking here */
> + /* Ensure that all CPUs other than the calling one are offline */
> + for_each_vcpu ( d, v )
> + {
> + if ( v != current && is_vcpu_online(v) )
> + return PSCI_DENIED;
> + }
> +
> + /*
> + * Prepare the calling VCPU for suspend (save entry point into pc and
> + * context ID into r0/x0 as specified by PSCI SYSTEM_SUSPEND)
> + */
> + vcpu_suspend_prepare(epoint, cid);
> +
> + /* Disable watchdogs of this domain */
> + watchdog_domain_suspend(d);
> +
> + /*
> + * The calling domain is suspended by blocking its last running VCPU. If an
> + * event is pending the domain will resume right away (VCPU will not block,
> + * but when scheduled in it will resume from the given entry point).
> + */
> + vcpu_block_unless_event_pending(current);
> +
> + return PSCI_SUCCESS;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
> index d1615be8a6..96eef06c18 100644
> --- a/xen/arch/arm/vpsci.c
> +++ b/xen/arch/arm/vpsci.c
> @@ -7,6 +7,7 @@
> #include <asm/vgic.h>
> #include <asm/vpsci.h>
> #include <asm/event.h>
> +#include <asm/suspend.h>
>
> #include <public/sched.h>
>
> @@ -197,6 +198,15 @@ static void do_psci_0_2_system_reset(void)
> domain_shutdown(d,SHUTDOWN_reboot);
> }
>
> +static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
> +{
> +#ifdef CONFIG_SYSTEM_SUSPEND
> + return domain_suspend(epoint, cid);
> +#else
> + return PSCI_NOT_SUPPORTED;
> +#endif
> +}
> +
> static int32_t do_psci_1_0_features(uint32_t psci_func_id)
> {
> /* /!\ Ordered by function ID and not name */
> @@ -214,6 +224,8 @@ static int32_t do_psci_1_0_features(uint32_t psci_func_id)
> case PSCI_0_2_FN32_SYSTEM_OFF:
> case PSCI_0_2_FN32_SYSTEM_RESET:
> case PSCI_1_0_FN32_PSCI_FEATURES:
> + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> case ARM_SMCCC_VERSION_FID:
> return 0;
> default:
> @@ -344,6 +356,26 @@ bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid)
> return true;
> }
>
> + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> + {
> + register_t epoint = PSCI_ARG(regs,1);
> + register_t cid = PSCI_ARG(regs,2);
> + register_t ret;
> +
> + perfc_incr(vpsci_system_suspend);
> + /* Set the result to PSCI_SUCCESS if the call fails.
> + * Otherwise preserve the context_id in x0. For now
> + * we don't support the case where the system is suspended
> + * to a shallower level and PSCI_SUCCESS is returned to the
> + * caller.
> + */
> + ret = do_psci_1_0_system_suspend(epoint, cid);
> + if ( ret != PSCI_SUCCESS )
> + PSCI_SET_RESULT(regs, ret);
> + return true;
> + }
> +
> default:
> return false;
> }
--
Best regards,
-grygorii
^ permalink raw reply [flat|nested] 69+ messages in thread
* [PATCH 12/16] xen/arm: Trigger Xen suspend when hardware domain completes suspend
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (10 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 11/16] xen/arm: Implement PSCI system suspend Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-05 9:11 ` [PATCH 13/16] xen/arm: Implement PSCI SYSTEM_SUSPEND call (physical interface) Mykola Kvach
` (5 subsequent siblings)
17 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Mirela Simonovic, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi,
Mykyta Poturai, Mykola Kvach
From: Mirela Simonovic <mirela.simonovic@aggios.com>
When hardware domain finalizes its suspend procedure the suspend of Xen
is triggered by calling system_suspend(). Hardware domain finalizes the
suspend from its boot core (VCPU#0), which could be mapped to any physical
CPU, i.e. the system_suspend() function could be executed by any physical
CPU. Since Xen suspend procedure has to be run by the boot CPU
(non-boot CPUs will be disabled at some point in suspend procedure),
system_suspend() execution has to continue on CPU#0.
When the system_suspend() returns 0, it means that the system was
suspended and it is coming out of the resume procedure. Regardless
of the system_suspend() return value, after this function returns
Xen is fully functional, and its state, including all devices and data
structures, matches the state prior to calling system_suspend().
The status is returned by system_suspend() for debugging/logging
purposes and function prototype compatibility.
This patch also introduces some state changes in peripherals and CPUs
during suspend/resume. Specifically, it:
- disables/enables non-boot physical CPUs and freezes/thaws domains;
- suspends/resumes the timer, GIC, console, IOMMU, and hardware domain.
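The ordering idea behind the list above (suspend stage by stage, then undo in mirror order, whether or not a stage failed) can be modelled with a minimal standalone sketch. It is a simplified model under stated assumptions: the stage names only echo the list for readability, all sketch_* identifiers are hypothetical, and the patch's actual goto labels do not map one-to-one onto this loop (for example, the patch also re-enables a stage whose own suspend step failed in some cases).

```c
#include <stddef.h>

/* Hypothetical stage list; the names only mirror the commit message. */
enum sketch_stage {
    SKETCH_ST_CPUS,
    SKETCH_ST_TIME,
    SKETCH_ST_GIC,
    SKETCH_ST_CONSOLE,
    SKETCH_ST_IOMMU,
    SKETCH_ST_COUNT
};

/* Records what happened: +(stage + 1) on suspend, -(stage + 1) on resume. */
struct sketch_log {
    int events[2 * SKETCH_ST_COUNT];
    size_t n;
};

/*
 * Suspend the stages in order. 'fail_at' injects a failure at one stage
 * (pass SKETCH_ST_COUNT for "no failure"). Stages that completed are then
 * resumed in reverse order, so the caller always gets a working system back.
 */
static int sketch_system_suspend(enum sketch_stage fail_at,
                                 struct sketch_log *log)
{
    int status = 0;
    int done = 0;
    int s;

    for ( s = 0; s < SKETCH_ST_COUNT; s++ )
    {
        if ( s == (int)fail_at )
        {
            status = -1;
            break;
        }
        log->events[log->n++] = s + 1;
        done++;
    }

    /* With status == 0 the firmware suspend call would sit here. */

    for ( s = done - 1; s >= 0; s-- )
        log->events[log->n++] = -(s + 1);

    return status;
}
```

The point of the model is the invariant stated in the commit message: whatever the return value, every stage that was suspended has been resumed by the time the function returns.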
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes introduced in V3:
Merged changes from other commits into this one (stashed changes):
- disabled/enabled non-boot physical CPUs and froze/thawed domains;
- suspended/resumed the timer, GIC, console, IOMMU, and hardware domain.
---
xen/arch/arm/suspend.c | 233 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 233 insertions(+)
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
index 27fab8c999..fa81be5a4f 100644
--- a/xen/arch/arm/suspend.c
+++ b/xen/arch/arm/suspend.c
@@ -1,6 +1,9 @@
/* SPDX-License-Identifier: GPL-2.0-only */
#include <xen/sched.h>
+#include <xen/cpu.h>
+#include <xen/console.h>
+#include <xen/iommu.h>
#include <asm/cpufeature.h>
#include <asm/event.h>
#include <asm/psci.h>
@@ -8,6 +11,210 @@
#include <asm/platform.h>
#include <public/sched.h>
+/* Reset values of VCPU architecture specific registers */
+static void vcpu_arch_reset(struct vcpu *v)
+{
+ v->arch.ttbr0 = 0;
+ v->arch.ttbr1 = 0;
+ v->arch.ttbcr = 0;
+
+ v->arch.csselr = 0;
+ v->arch.cpacr = 0;
+ v->arch.contextidr = 0;
+ v->arch.tpidr_el0 = 0;
+ v->arch.tpidrro_el0 = 0;
+ v->arch.tpidr_el1 = 0;
+ v->arch.vbar = 0;
+ v->arch.dacr = 0;
+ v->arch.par = 0;
+#if defined(CONFIG_ARM_32)
+ v->arch.mair0 = 0;
+ v->arch.mair1 = 0;
+ v->arch.amair0 = 0;
+ v->arch.amair1 = 0;
+#else
+ v->arch.mair = 0;
+ v->arch.amair = 0;
+#endif
+ /* Fault Status */
+#if defined(CONFIG_ARM_32)
+ v->arch.dfar = 0;
+ v->arch.ifar = 0;
+ v->arch.dfsr = 0;
+#elif defined(CONFIG_ARM_64)
+ v->arch.far = 0;
+ v->arch.esr = 0;
+#endif
+
+ v->arch.ifsr = 0;
+ v->arch.afsr0 = 0;
+ v->arch.afsr1 = 0;
+
+#ifdef CONFIG_ARM_32
+ v->arch.joscr = 0;
+ v->arch.jmcr = 0;
+#endif
+
+ v->arch.teecr = 0;
+ v->arch.teehbr = 0;
+}
+
+/*
+ * This function sets the context of current VCPU to the state which is expected
+ * by the guest on resume. The expected VCPU state is:
+ * 1) pc to contain resume entry point (1st argument of PSCI SYSTEM_SUSPEND)
+ * 2) r0/x0 to contain context ID (2nd argument of PSCI SYSTEM_SUSPEND)
+ * 3) All other general purpose and system registers should have reset values
+ */
+static void vcpu_resume(struct vcpu *v)
+{
+
+ struct vcpu_guest_context ctxt;
+
+ /* Make sure that VCPU guest regs are zeroed */
+ memset(&ctxt, 0, sizeof(ctxt));
+
+ /* Set non-zero values to the registers prior to copying */
+ ctxt.user_regs.pc64 = (u64)v->arch.suspend_ep;
+
+ /* TODO: test changes on 32-bit domain */
+ if ( is_32bit_domain(v->domain) )
+ {
+ ctxt.user_regs.r0_usr = v->arch.suspend_cid;
+ ctxt.user_regs.cpsr = PSR_GUEST32_INIT;
+
+ /* Thumb set is allowed only for 32-bit domain */
+ if ( v->arch.suspend_ep & 1 )
+ {
+ ctxt.user_regs.cpsr |= PSR_THUMB;
+ ctxt.user_regs.pc64 &= ~(u64)1;
+ }
+ }
+#ifdef CONFIG_ARM_64
+ else
+ {
+ ctxt.user_regs.x0 = v->arch.suspend_cid;
+ ctxt.user_regs.cpsr = PSR_GUEST64_INIT;
+ }
+#endif
+ ctxt.sctlr = SCTLR_GUEST_INIT;
+ ctxt.flags = VGCF_online;
+
+ /* Reset architecture specific registers */
+ vcpu_arch_reset(v);
+
+ /* Initialize VCPU registers */
+ domain_lock(v->domain);
+ arch_set_info_guest(v, &ctxt);
+ domain_unlock(v->domain);
+ watchdog_domain_resume(v->domain);
+}
+
+/* Xen suspend. Note: data is not used (suspend is the suspend to RAM) */
+static long system_suspend(void *data)
+{
+ int status;
+ unsigned long flags;
+
+ BUG_ON(system_state != SYS_STATE_active);
+
+ system_state = SYS_STATE_suspend;
+ freeze_domains();
+
+ /*
+ * Non-boot CPUs have to be disabled on suspend and enabled on resume
+ * (hotplug-based mechanism). Disabling non-boot CPUs will lead to PSCI
+ * CPU_OFF to be called by each non-boot CPU. Depending on the underlying
+ * platform capabilities, this may lead to the physical powering down of
+ * CPUs. Tested on Xilinx Zynq Ultrascale+ MPSoC (including power down of
+ * each non-boot CPU).
+ */
+ status = disable_nonboot_cpus();
+ if ( status )
+ {
+ system_state = SYS_STATE_resume;
+ goto resume_nonboot_cpus;
+ }
+
+ time_suspend();
+
+ local_irq_save(flags);
+ status = gic_suspend();
+ if ( status )
+ {
+ system_state = SYS_STATE_resume;
+ goto resume_irqs;
+ }
+
+ printk("Xen suspending...\n");
+ console_start_sync();
+
+ status = console_suspend();
+ if ( status )
+ {
+ dprintk(XENLOG_ERR, "Failed to suspend the console, err=%d\n", status);
+ system_state = SYS_STATE_resume;
+ goto resume_console;
+ }
+
+ status = iommu_suspend();
+ if ( status )
+ {
+ system_state = SYS_STATE_resume;
+ goto resume_console;
+ }
+
+ /*
+ * Enable identity mapping before entering suspend to simplify
+ * the resume path
+ */
+ update_boot_mapping(true);
+
+ system_state = SYS_STATE_resume;
+ update_boot_mapping(false);
+
+ iommu_resume();
+
+resume_console:
+ console_resume();
+
+ gic_resume();
+
+resume_irqs:
+ local_irq_restore(flags);
+
+ time_resume();
+
+resume_nonboot_cpus:
+ /*
+ * The rcu_barrier() has to be added to ensure that the per cpu area is
+ * freed before a non-boot CPU tries to initialize it (_free_percpu_area()
+ * has to be called before the init_percpu_area()). This scenario occurs
+ * when non-boot CPUs are hot-unplugged on suspend and hotplugged on resume.
+ */
+ rcu_barrier();
+ enable_nonboot_cpus();
+ thaw_domains();
+ system_state = SYS_STATE_active;
+
+ /*
+ * The hardware domain owns most of the devices and may be part of the
+ * suspend/resume path. Since the hardware domain suspend is tied to
+ * the host suspend, it makes sense to resume it at the same time,
+ * i.e. after host resumes.
+ */
+ vcpu_resume(hardware_domain->vcpu[0]);
+ /*
+ * The resume of hardware domain should always follow Xen's resume.
+ * This is done by unblocking the first vCPU of Dom0.
+ */
+ vcpu_unblock(hardware_domain->vcpu[0]);
+
+ printk("Resume (status %d)\n", status);
+
+ return status;
+}
+
static void vcpu_suspend_prepare(register_t epoint, register_t cid)
{
struct vcpu *v = current;
@@ -21,6 +228,7 @@ int32_t domain_suspend(register_t epoint, register_t cid)
struct vcpu *v;
struct domain *d = current->domain;
bool is_thumb = epoint & 1;
+ int status;
dprintk(XENLOG_DEBUG,
"Dom%d suspend: epoint=0x%"PRIregister", cid=0x%"PRIregister"\n",
@@ -54,6 +262,31 @@ int32_t domain_suspend(register_t epoint, register_t cid)
*/
vcpu_block_unless_event_pending(current);
+ /* If this was dom0 the whole system should suspend: trigger Xen suspend */
+ if ( is_hardware_domain(d) )
+ {
+ /*
+ * system_suspend should be called when Dom0 finalizes the suspend
+ * procedure from its boot core (VCPU#0). However, Dom0's VCPU#0 could
+ * be mapped to any PCPU (this function could be executed by any PCPU).
+ * The suspend procedure has to be finalized by the PCPU#0 (non-boot
+ * PCPUs will be disabled during the suspend).
+ */
+ status = continue_hypercall_on_cpu(0, system_suspend, NULL);
+ /*
+ * If an error happened, there is nothing that needs to be done here
+ * because the system_suspend always returns in fully functional state
+ * no matter what the outcome of suspend procedure is. If the system
+ * suspended successfully the function will return 0 after the resume.
+ * Otherwise, if an error is returned it means Xen did not suspend,
+ * but it is still in the same state as if the system_suspend was never
+ * called. We dump a debug message in case of an error for debugging/
+ * logging purpose.
+ */
+ if ( status )
+ dprintk(XENLOG_ERR, "Failed to suspend, errno=%d\n", status);
+ }
+
return PSCI_SUCCESS;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 69+ messages in thread
* [PATCH 13/16] xen/arm: Implement PSCI SYSTEM_SUSPEND call (physical interface)
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (11 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 12/16] xen/arm: Trigger Xen suspend when hardware domain completes suspend Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-05 9:11 ` [PATCH 14/16] xen/arm: Resume memory management on Xen resume Mykola Kvach
` (4 subsequent siblings)
17 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Mirela Simonovic, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi,
Mykyta Poturai, Mykola Kvach
From: Mirela Simonovic <mirela.simonovic@aggios.com>
PSCI system suspend function shall be invoked to finalize Xen suspend
procedure. Resume entry point, which needs to be passed via 1st argument
of PSCI system suspend call to the EL3, is hyp_resume. For now, hyp_resume
is just a placeholder that will be implemented in assembly. Context ID,
which is 2nd argument of system suspend PSCI call, is unused, as in Linux.
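For reference, the function IDs behind the SYSTEM_SUSPEND call follow the SMCCC fast-call encoding, with SYSTEM_SUSPEND being function number 14 in the standard secure service range. A minimal standalone sketch, assuming the SKETCH_* macro names below, which are illustrative and are not Xen's definitions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Fast call (bit 31) owned by the standard secure service (owner field 4). */
#define SKETCH_PSCI_FN32_BASE 0x84000000U
/* Same, with bit 30 set to select the SMC64 calling convention. */
#define SKETCH_PSCI_FN64_BASE 0xC4000000U

#define SKETCH_PSCI_FN32(nr) (SKETCH_PSCI_FN32_BASE + (nr))
#define SKETCH_PSCI_FN64(nr) (SKETCH_PSCI_FN64_BASE + (nr))

/* SYSTEM_SUSPEND is function number 14 in both conventions. */
#define SKETCH_SYSTEM_SUSPEND_NR 14U

/* Return the SYSTEM_SUSPEND function ID for the chosen calling convention. */
static uint32_t sketch_system_suspend_fid(bool smc64)
{
    return smc64 ? SKETCH_PSCI_FN64(SKETCH_SYSTEM_SUSPEND_NR)
                 : SKETCH_PSCI_FN32(SKETCH_SYSTEM_SUSPEND_NR);
}
```

With these definitions sketch_system_suspend_fid(false) evaluates to 0x8400000E and sketch_system_suspend_fid(true) to 0xC400000E. A 64-bit caller passing a 64-bit resume address would use the SMC64 variant.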
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes introduced in v3:
- return PSCI_NOT_SUPPORTED instead of 1 for arm32
- add checking of PSCI version for psci suspend call
---
xen/arch/arm/arm64/head.S | 8 ++++++++
xen/arch/arm/include/asm/psci.h | 1 +
xen/arch/arm/include/asm/suspend.h | 1 +
xen/arch/arm/psci.c | 19 +++++++++++++++++++
xen/arch/arm/suspend.c | 4 ++++
5 files changed, 33 insertions(+)
diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 72c7b24498..3522c497c5 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -561,6 +561,14 @@ END(efi_xen_start)
#endif /* CONFIG_ARM_EFI */
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+FUNC(hyp_resume)
+ b .
+END(hyp_resume)
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
/*
* Local variables:
* mode: ASM
diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
index 48a93e6b79..15eb2c6013 100644
--- a/xen/arch/arm/include/asm/psci.h
+++ b/xen/arch/arm/include/asm/psci.h
@@ -20,6 +20,7 @@ extern uint32_t psci_ver;
int psci_init(void);
int call_psci_cpu_on(int cpu);
+int call_psci_system_suspend(void);
void call_psci_cpu_off(void);
void call_psci_system_off(void);
void call_psci_system_reset(void);
diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
index 745377dbcf..0d2f0da0ad 100644
--- a/xen/arch/arm/include/asm/suspend.h
+++ b/xen/arch/arm/include/asm/suspend.h
@@ -4,6 +4,7 @@
#define __ASM_ARM_SUSPEND_H__
int32_t domain_suspend(register_t epoint, register_t cid);
+void hyp_resume(void);
#endif
diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
index b6860a7760..8e9c571467 100644
--- a/xen/arch/arm/psci.c
+++ b/xen/arch/arm/psci.c
@@ -17,6 +17,7 @@
#include <asm/cpufeature.h>
#include <asm/psci.h>
#include <asm/acpi.h>
+#include <asm/suspend.h>
/*
* While a 64-bit OS can make calls with SMC32 calling conventions, for
@@ -60,6 +61,24 @@ void call_psci_cpu_off(void)
}
}
+int call_psci_system_suspend(void)
+{
+#ifdef CONFIG_SYSTEM_SUSPEND
+ struct arm_smccc_res res;
+
+ if ( psci_ver < PSCI_VERSION(1, 0) )
+ return PSCI_NOT_SUPPORTED;
+
+ /* 2nd argument (context ID) is not used */
+ arm_smccc_smc(PSCI_1_0_FN64_SYSTEM_SUSPEND, __pa(hyp_resume), &res);
+
+ return PSCI_RET(res);
+#else
+ /* not supported */
+ return PSCI_NOT_SUPPORTED;
+#endif
+}
+
void call_psci_system_off(void)
{
if ( psci_ver > PSCI_VERSION(0, 1) )
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
index fa81be5a4f..ac88faee2e 100644
--- a/xen/arch/arm/suspend.c
+++ b/xen/arch/arm/suspend.c
@@ -170,6 +170,10 @@ static long system_suspend(void *data)
*/
update_boot_mapping(true);
+ status = call_psci_system_suspend();
+ if ( status )
+ dprintk(XENLOG_ERR, "PSCI system suspend failed, err=%d\n", status);
+
system_state = SYS_STATE_resume;
update_boot_mapping(false);
--
2.43.0
^ permalink raw reply related [flat|nested] 69+ messages in thread
* [PATCH 14/16] xen/arm: Resume memory management on Xen resume
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (12 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 13/16] xen/arm: Implement PSCI SYSTEM_SUSPEND call (physical interface) Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-11 23:04 ` Julien Grall
2025-03-05 9:11 ` [PATCH 15/16] xen/arm: Save/restore context on suspend/resume Mykola Kvach
` (3 subsequent siblings)
17 siblings, 1 reply; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Mirela Simonovic, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi,
Mykyta Poturai, Mykola Kvach
From: Mirela Simonovic <mirela.simonovic@aggios.com>
The MMU needs to be enabled in the resume flow before the context
can be restored (we need to be able to access the context data by
virtual address in order to restore it). The configuration of system
registers prior to branching to the routine that sets up the page
tables is copied from xen/arch/arm/arm64/head.S.
After the MMU is enabled, the content of TTBR0_EL2 is changed to
point to init_ttbr (page tables used at runtime).
At boot the init_ttbr variable is updated when a secondary CPU is
hotplugged. In the scenario where there is only one physical CPU in
the system, the init_ttbr would not be initialized for the use in
resume flow. To get the variable initialized in all scenarios in this
patch we add that the boot CPU updates init_ttbr after it sets the
page tables for runtime.
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in V3:
- updated commit message
- instead of using create_page_tables, enable_mmu, and mmu_init_secondary_cpu,
the existing function enable_secondary_cpu_mm is now used
- prepare_secondary_mm (previously init_secondary_pagetables in the previous
patch series) is now called at the end of start_xen instead of
setup_pagetables. Calling it in the previous location caused a crash
- add early printk init during resume
Changes in V2:
- moved hyp_resume to head.S to place it near the rest of the start code
- simplified the code in hyp_resume by using existing functions such as
check_cpu_mode, cpu_init, create_page_tables, and enable_mmu
---
xen/arch/arm/arm64/head.S | 23 +++++++++++++++++++++++
xen/arch/arm/setup.c | 8 ++++++++
2 files changed, 31 insertions(+)
diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 3522c497c5..fab2812e54 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -564,6 +564,29 @@ END(efi_xen_start)
#ifdef CONFIG_SYSTEM_SUSPEND
FUNC(hyp_resume)
+ msr DAIFSet, 0xf /* Disable all interrupts */
+
+ tlbi alle2
+ dsb sy /* Ensure completion of TLB flush */
+ isb
+
+ ldr x0, =start
+ adr x19, start /* x19 := paddr (start) */
+ sub x20, x19, x0 /* x20 := phys-offset */
+
+ /* Initialize the UART if earlyprintk has been enabled. */
+#ifdef CONFIG_EARLY_PRINTK
+ bl init_uart
+#endif
+ PRINT_ID("- Xen resuming -\r\n")
+
+ bl check_cpu_mode
+ bl cpu_init
+
+ ldr lr, =mmu_resumed
+ b enable_secondary_cpu_mm
+
+mmu_resumed:
b .
END(hyp_resume)
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index ffcae900d7..3a89ac436b 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -508,6 +508,14 @@ void asmlinkage __init start_xen(unsigned long fdt_paddr)
for_each_domain( d )
domain_unpause_by_systemcontroller(d);
+#ifdef CONFIG_SYSTEM_SUSPEND
+ /*
+ * It is called to initialize init_ttbr.
+ * Without this call, Xen gets stuck after resuming.
+ */
+ prepare_secondary_mm(0);
+#endif
+
/* Switch on to the dynamically allocated stack for the idle vcpu
* since the static one we're running on is about to be freed. */
memcpy(idle_vcpu[0]->arch.cpu_info, get_cpu_info(),
--
2.43.0
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [PATCH 14/16] xen/arm: Resume memory management on Xen resume
2025-03-05 9:11 ` [PATCH 14/16] xen/arm: Resume memory management on Xen resume Mykola Kvach
@ 2025-03-11 23:04 ` Julien Grall
2025-05-31 10:16 ` Mykola Kvach
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2025-03-11 23:04 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Mirela Simonovic, Stefano Stabellini, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi, Mykyta Poturai,
Mykola Kvach
Hi,
On 05/03/2025 09:11, Mykola Kvach wrote:
> From: Mirela Simonovic <mirela.simonovic@aggios.com>
>
> The MMU needs to be enabled in the resume flow before the context
> can be restored (we need to be able to access the context data by
> virtual address in order to restore it). The configuration of system
> registers prior to branching to the routine that sets up the page
> tables is copied from xen/arch/arm/arm64/head.S.
> After the MMU is enabled, the content of TTBR0_EL2 is changed to
> point to init_ttbr (page tables used at runtime).
>
> At boot the init_ttbr variable is updated when a secondary CPU is
> hotplugged. In the scenario where there is only one physical CPU in
> the system, the init_ttbr would not be initialized for the use in
> resume flow. To get the variable initialized in all scenarios in this
> patch we add that the boot CPU updates init_ttbr after it sets the
> page tables for runtime.
>
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in V3:
> - updated commit message
> - instead of using create_page_tables, enable_mmu, and mmu_init_secondary_cpu,
> the existing function enable_secondary_cpu_mm is now used
> - prepare_secondary_mm (previously init_secondary_pagetables in the previous
> patch series) is now called at the end of start_xen instead of
> setup_pagetables. Calling it in the previous location caused a crash
> - add early printk init during resume
>
> Changes in V2:
> - moved hyp_resume to head.S to place it near the rest of the start code
> - simplified the code in hyp_resume by using existing functions such as
> check_cpu_mode, cpu_init, create_page_tables, and enable_mmu
> ---
> xen/arch/arm/arm64/head.S | 23 +++++++++++++++++++++++
> xen/arch/arm/setup.c | 8 ++++++++
> 2 files changed, 31 insertions(+)
>
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index 3522c497c5..fab2812e54 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -564,6 +564,29 @@ END(efi_xen_start)
> #ifdef CONFIG_SYSTEM_SUSPEND
>
> FUNC(hyp_resume)
> + msr DAIFSet, 0xf /* Disable all interrupts */
Surely we should be here with interrupts disabled. No?
> +
> + tlbi alle2
> + dsb sy /* Ensure completion of TLB flush */
This doesn't exist when booting Xen and I am not sure why we would need
it for resume as we already nuke the TLBs in enable_mmu. Can you clarify?
> + isb
> +
> + ldr x0, =start
> + adr x19, start /* x19 := paddr (start) */
> + sub x20, x19, x0 /* x20 := phys-offset */
Looking at the code, I wonder if it is still necessary to set x19 and
x20 for booting secondary CPUs and resume. There doesn't seem to be any
use of the registers.
> +
> + /* Initialize the UART if earlyprintk has been enabled. */
> +#ifdef CONFIG_EARLY_PRINTK
> + bl init_uart
> +#endif
> + PRINT_ID("- Xen resuming -\r\n")
> +
> + bl check_cpu_mode
> + bl cpu_init
> +
> + ldr lr, =mmu_resumed
> + b enable_secondary_cpu_mm
> +
> +mmu_resumed:
> b .
> END(hyp_resume)
>
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index ffcae900d7..3a89ac436b 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -508,6 +508,14 @@ void asmlinkage __init start_xen(unsigned long fdt_paddr)
> for_each_domain( d )
> domain_unpause_by_systemcontroller(d);
>
> +#ifdef CONFIG_SYSTEM_SUSPEND
> + /*
> + * It is called to initialize init_ttbr.
> + * Without this call, Xen gets stuck after resuming.
This is not a very descriptive comment. But, you seem to make the
assumption that prepare_secondary_mm() can be called on the boot CPU.
This is not always the case. For instance on arm32, it will allocate
memory and overwrite the per-cpu page-tables pointer (not great). This
will also soon be the case for arm64.
Furthermore, this call reminded me that the secondary CPU page-tables
are not freed when turning off a CPU. So they will leak. Not yet a
problem for arm64 though.
So overall, I think we need a separate function that will prepare
init_ttbr for a given CPU (not allocate any memory). This will then need
to be called from the suspend helper.
> + */
> + prepare_secondary_mm(0);
> +#endif
> +
> /* Switch on to the dynamically allocated stack for the idle vcpu
> * since the static one we're running on is about to be freed. */
> memcpy(idle_vcpu[0]->arch.cpu_info, get_cpu_info(),
Cheers,
--
Julien Grall
* Re: [PATCH 14/16] xen/arm: Resume memory management on Xen resume
2025-03-11 23:04 ` Julien Grall
@ 2025-05-31 10:16 ` Mykola Kvach
0 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-05-31 10:16 UTC (permalink / raw)
To: Julien Grall
Cc: xen-devel, Mirela Simonovic, Stefano Stabellini, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi, Mykyta Poturai,
Mykola Kvach
Hi, @Julien Grall
On Wed, Mar 12, 2025 at 1:04 AM Julien Grall <julien@xen.org> wrote:
>
> Hi,
>
> On 05/03/2025 09:11, Mykola Kvach wrote:
> > From: Mirela Simonovic <mirela.simonovic@aggios.com>
> >
> > The MMU needs to be enabled in the resume flow before the context
> > can be restored (we need to be able to access the context data by
> > virtual address in order to restore it). The configuration of system
> > registers prior to branching to the routine that sets up the page
> > tables is copied from xen/arch/arm/arm64/head.S.
> > After the MMU is enabled, the content of TTBR0_EL2 is changed to
> > point to init_ttbr (page tables used at runtime).
> >
> > At boot the init_ttbr variable is updated when a secondary CPU is
> > hotplugged. In the scenario where there is only one physical CPU in
> > the system, the init_ttbr would not be initialized for the use in
> > resume flow. To get the variable initialized in all scenarios in this
> > patch we add that the boot CPU updates init_ttbr after it sets the
> > page tables for runtime.
> >
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> > Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
> > Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Changes in V3:
> > - updated commit message
> > - instead of using create_page_tables, enable_mmu, and mmu_init_secondary_cpu,
> > the existing function enable_secondary_cpu_mm is now used
> > - prepare_secondary_mm (previously init_secondary_pagetables in the previous
> > patch series) is now called at the end of start_xen instead of
> > setup_pagetables. Calling it in the previous location caused a crash
> > - add early printk init during resume
> >
> > Changes in V2:
> > - moved hyp_resume to head.S to place it near the rest of the start code
> > - simplified the code in hyp_resume by using existing functions such as
> > check_cpu_mode, cpu_init, create_page_tables, and enable_mmu
> > ---
> > xen/arch/arm/arm64/head.S | 23 +++++++++++++++++++++++
> > xen/arch/arm/setup.c | 8 ++++++++
> > 2 files changed, 31 insertions(+)
> >
> > diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> > index 3522c497c5..fab2812e54 100644
> > --- a/xen/arch/arm/arm64/head.S
> > +++ b/xen/arch/arm/arm64/head.S
> > @@ -564,6 +564,29 @@ END(efi_xen_start)
> > #ifdef CONFIG_SYSTEM_SUSPEND
> >
> > FUNC(hyp_resume)
> > + msr DAIFSet, 0xf /* Disable all interrupts */
>
> Surely we should be here with interrupts disabled. No?
You are right, I overlooked this and left the code unchanged from a
previous patch series.
According to the Power State Coordination Interface (DEN0022F.b 1.3):
```
6.4.3.3 Exceptions
The appropriate asynchronous exception masks must be set when starting
up the core at the Exception
level from which the call was made, or at ELns. Typically, this means
that for the Exception level where
execution is restarting:
If starting in AArch64 state, the SPSR_ELx.{D,A,I,F} bits must be set
to {1, 1, 1, 1}. ELx is the
Exception level being returned from.
```
The ARM Trusted Firmware code enforces this correctly here:
https://elixir.bootlin.com/arm-trusted-firmware/v2.13.0/source/lib/psci/psci_common.c#L869
So, the code should indeed expect DAIF bits to be set on resume. I
will update the patch accordingly.
>
> > +
> > + tlbi alle2
> > + dsb sy /* Ensure completion of TLB flush */
>
> This doesn't exist when booting Xen and I am not sure why we would need
> it for resume as we already nuke the TLBs in enable_mmu. Can you clarify?
You're absolutely right — that line is a leftover from earlier changes
when the memory handling logic was being reorganized.
It's no longer necessary because enable_mmu already handles TLB
invalidation, including the TLBI and DSB instructions.
I'll remove it in the next version. Thanks for catching this!
>
> > + isb
> > +
> > + ldr x0, =start
> > + adr x19, start /* x19 := paddr (start) */
> > + sub x20, x19, x0 /* x20 := phys-offset */
>
> Looking at the code, I wonder if it is still necessary to set x19 and
> x20 for booting secondary CPUs and resume. There doesn't seem to be any
> use of the registers.
x20 is still required during resume. It's used inside
enable_secondary_cpu_mm via the load_paddr macro.
So although x19 may no longer be necessary in this context, x20 is
still used and needs to be set beforehand.
>
> > +
> > + /* Initialize the UART if earlyprintk has been enabled. */
> > +#ifdef CONFIG_EARLY_PRINTK
> > + bl init_uart
> > +#endif
> > + PRINT_ID("- Xen resuming -\r\n")
> > +
> > + bl check_cpu_mode
> > + bl cpu_init
> > +
> > + ldr lr, =mmu_resumed
> > + b enable_secondary_cpu_mm
> > +
> > +mmu_resumed:
> > b .
> > END(hyp_resume)
> >
> > diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> > index ffcae900d7..3a89ac436b 100644
> > --- a/xen/arch/arm/setup.c
> > +++ b/xen/arch/arm/setup.c
> > @@ -508,6 +508,14 @@ void asmlinkage __init start_xen(unsigned long fdt_paddr)
> > for_each_domain( d )
> > domain_unpause_by_systemcontroller(d);
> >
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > + /*
> > + * It is called to initialize init_ttbr.
> > + * Without this call, Xen gets stuck after resuming.
>
> This is not a very descriptive comment. But, you seem to make the
> assumption that prepare_secondary_mm() can be called on the boot CPU.
> This is not always the case. For instance on arm32, it will allocate
> memory and overwrite the per-cpu page-tables pointer (not great). This
> will also soon be the case for arm64.
>
> Furthermore, this call reminded me that the secondary CPU page-tables
> are not freed when turning off a CPU. So they will leak. Not yet a
> problem for arm64 though.
>
> So overall, I think we need a separate function that will prepare
> init_ttbr for a given CPU (not allocate any memory). This will then need
> to be called from the suspend helper.
Thank you for the detailed explanation.
I will rework this part to avoid calling prepare_secondary_mm() on
the boot CPU. As suggested, I plan to introduce a dedicated helper
function that will only initialize init_ttbr without allocating
memory and call it from the suspend helper.
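For illustration only, such a helper might take roughly the following
shape. This is a minimal sketch, not the code that will be posted: the
names set_init_ttbr_sketch, read_init_ttbr_sketch, virt_to_maddr_stub and
clean_dcache_stub are hypothetical stand-ins so that the fragment compiles
on its own, and it assumes the runtime root page table is a single
statically allocated table.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t paddr_t;

/* Stand-ins for hypervisor state, defined here only for the sketch. */
static uint64_t xen_pgtable[512];   /* runtime root page table */
static paddr_t init_ttbr;           /* value read by early resume code */

/* Hypothetical stub: identity translation instead of a real VA->PA walk. */
static paddr_t virt_to_maddr_stub(const void *va)
{
    return (paddr_t)(uintptr_t)va;
}

/* Hypothetical stub: the real code would clean the data cache line. */
static void clean_dcache_stub(const void *addr)
{
    (void)addr;
}

/*
 * Record where the runtime page tables live without allocating memory,
 * so that code running with the MMU off can find them on resume.
 */
void set_init_ttbr_sketch(void)
{
    init_ttbr = virt_to_maddr_stub(xen_pgtable);
    /* The variable is read with the MMU and caches off, hence the clean. */
    clean_dcache_stub(&init_ttbr);
    assert(init_ttbr != 0);
}

/* Accessor used only to inspect the result of the sketch. */
paddr_t read_init_ttbr_sketch(void)
{
    return init_ttbr;
}
```

The point of the sketch is only that nothing is allocated and no per-CPU
pointer is overwritten, which is what makes it safe to call on the boot CPU
from the suspend path.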
>
> > + */
> > + prepare_secondary_mm(0);
> > +#endif
> > +
> > /* Switch on to the dynamically allocated stack for the idle vcpu
> > * since the static one we're running on is about to be freed. */
> > memcpy(idle_vcpu[0]->arch.cpu_info, get_cpu_info(),
>
> Cheers,
>
> --
> Julien Grall
>
Thank you for reviewing this patch series.
Best regards,
Mykola
* [PATCH 15/16] xen/arm: Save/restore context on suspend/resume
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (13 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 14/16] xen/arm: Resume memory management on Xen resume Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-05 9:11 ` [PATCH 16/16] CHANGELOG: Mention Xen suspend/resume to RAM feature on arm64 Mykola Kvach
` (2 subsequent siblings)
17 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel
Cc: Mirela Simonovic, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Saeed Nowshadi,
Mykyta Poturai, Mykola Kvach
From: Mirela Simonovic <mirela.simonovic@aggios.com>
The context of CPU general purpose and system control registers
has to be saved on suspend and restored on resume. This is
implemented in hyp_suspend and before the return from hyp_resume
function. The hyp_suspend is invoked just before the PSCI system
suspend call is issued to the ATF. The hyp_suspend has to return a
non-zero value so that the calling 'if' statement evaluates to true,
causing the system suspend to be invoked. Upon the resume, context
saved on suspend will be restored, including the link register.
Therefore, after restoring the context the control flow will
return to the address pointed by the saved link register, which
is the place from which the hyp_suspend was called. To ensure
that the calling 'if' statement doesn't again evaluate to true
and initiate system suspend, hyp_resume has to return a zero value
after restoring the context.
Note that the order of saving register context into cpu_context
structure has to match the order of restoring.
Since the suspend/resume is supported only for arm64, we define
a null cpu_context structure so arm32 could compile.
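The "returns twice" control flow described above resembles the standard C
setjmp()/longjmp() pair, and can be tried out in user space with the
sketch below. It is an analogy only, not the hypervisor code: the names
saved_context, fake_system_suspend and suspend_cycle are made up for the
example, and the return-value convention is inverted (setjmp() returns 0
when saving, whereas hyp_suspend returns non-zero when saving).

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf saved_context;  /* plays the role of struct cpu_context */
static int psci_calls;         /* how often the fake firmware was entered */

/*
 * Stand-in for the PSCI system suspend call: a successful suspend never
 * returns to its caller; execution restarts from the saved context.
 */
static void fake_system_suspend(void)
{
    psci_calls++;
    longjmp(saved_context, 1);
}

/* Runs one simulated suspend/resume cycle; returns 0 once "resumed". */
int suspend_cycle(void)
{
    if ( setjmp(saved_context) == 0 )
    {
        /* First return: context saved, now enter "firmware". */
        fake_system_suspend();
        return -1;  /* reached only if the suspend request failed */
    }

    /* Second return: we arrive here via the restored context. */
    assert(psci_calls >= 1);
    return 0;
}
```

Calling suspend_cycle() enters the 'if' branch exactly once; the second
return from setjmp() skips it, which mirrors how the zero return value of
hyp_resume keeps system_suspend() from issuing the PSCI call again.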
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
xen/arch/arm/arm64/head.S | 88 +++++++++++++++++++++++++++++-
xen/arch/arm/include/asm/suspend.h | 22 ++++++++
xen/arch/arm/suspend.c | 22 +++++++-
3 files changed, 128 insertions(+), 4 deletions(-)
diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index fab2812e54..c10eec751b 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -562,6 +562,52 @@ END(efi_xen_start)
#endif /* CONFIG_ARM_EFI */
#ifdef CONFIG_SYSTEM_SUSPEND
+/*
+ * int32_t hyp_suspend(struct cpu_context *ptr)
+ *
+ * x0 - pointer to the storage where callee's context will be saved
+ *
+ * CPU context saved here will be restored on resume in hyp_resume function.
+ * hyp_suspend shall return a non-zero value. Upon restoring context
+ * hyp_resume shall return value zero instead. From C code that invokes
+ * hyp_suspend, the return value is interpreted to determine whether the context
+ * is saved (hyp_suspend) or restored (hyp_resume).
+ */
+FUNC(hyp_suspend)
+ /* Store callee-saved registers */
+ stp x19, x20, [x0], #16
+ stp x21, x22, [x0], #16
+ stp x23, x24, [x0], #16
+ stp x25, x26, [x0], #16
+ stp x27, x28, [x0], #16
+ stp x29, lr, [x0], #16
+
+ /* Store stack-pointer */
+ mov x2, sp
+ str x2, [x0], #8
+
+ /* Store system control registers */
+ mrs x2, VBAR_EL2
+ str x2, [x0], #8
+ mrs x2, VTCR_EL2
+ str x2, [x0], #8
+ mrs x2, VTTBR_EL2
+ str x2, [x0], #8
+ mrs x2, TPIDR_EL2
+ str x2, [x0], #8
+ mrs x2, MDCR_EL2
+ str x2, [x0], #8
+ mrs x2, HSTR_EL2
+ str x2, [x0], #8
+ mrs x2, CPTR_EL2
+ str x2, [x0], #8
+ mrs x2, HCR_EL2
+ str x2, [x0], #8
+
+ /* hyp_suspend must return a non-zero value */
+ mov x0, #1
+ ret
+END(hyp_suspend)
FUNC(hyp_resume)
msr DAIFSet, 0xf /* Disable all interrupts */
@@ -587,7 +633,47 @@ FUNC(hyp_resume)
b enable_secondary_cpu_mm
mmu_resumed:
- b .
+ /* Now we can access the cpu_context, so restore the context here */
+ ldr x0, =cpu_context
+
+ /* Restore callee-saved registers */
+ ldp x19, x20, [x0], #16
+ ldp x21, x22, [x0], #16
+ ldp x23, x24, [x0], #16
+ ldp x25, x26, [x0], #16
+ ldp x27, x28, [x0], #16
+ ldp x29, lr, [x0], #16
+
+ /* Restore stack pointer */
+ ldr x2, [x0], #8
+ mov sp, x2
+
+ /* Restore system control registers */
+ ldr x2, [x0], #8
+ msr VBAR_EL2, x2
+ ldr x2, [x0], #8
+ msr VTCR_EL2, x2
+ ldr x2, [x0], #8
+ msr VTTBR_EL2, x2
+ ldr x2, [x0], #8
+ msr TPIDR_EL2, x2
+ ldr x2, [x0], #8
+ msr MDCR_EL2, x2
+ ldr x2, [x0], #8
+ msr HSTR_EL2, x2
+ ldr x2, [x0], #8
+ msr CPTR_EL2, x2
+ ldr x2, [x0], #8
+ msr HCR_EL2, x2
+ isb
+
+ /* Since context is restored return from this function will appear as
+ * return from hyp_suspend. To distinguish a return from hyp_suspend
+ * which is called upon finalizing the suspend, as opposed to return
+ * from this function which executes on resume, we need to return zero
+ * value here. */
+ mov x0, #0
+ ret
END(hyp_resume)
#endif /* CONFIG_SYSTEM_SUSPEND */
diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
index 0d2f0da0ad..1d98acacc6 100644
--- a/xen/arch/arm/include/asm/suspend.h
+++ b/xen/arch/arm/include/asm/suspend.h
@@ -3,8 +3,30 @@
#ifndef __ASM_ARM_SUSPEND_H__
#define __ASM_ARM_SUSPEND_H__
+#ifdef CONFIG_ARM_64
+struct cpu_context {
+ uint64_t callee_regs[12];
+ uint64_t sp;
+ uint64_t vbar_el2;
+ uint64_t vtcr_el2;
+ uint64_t vttbr_el2;
+ uint64_t tpidr_el2;
+ uint64_t mdcr_el2;
+ uint64_t hstr_el2;
+ uint64_t cptr_el2;
+ uint64_t hcr_el2;
+} __aligned(16);
+#else
+struct cpu_context {
+ uint8_t pad;
+};
+#endif
+
+extern struct cpu_context cpu_context;
+
int32_t domain_suspend(register_t epoint, register_t cid);
void hyp_resume(void);
+int32_t hyp_suspend(struct cpu_context *ptr);
#endif
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
index ac88faee2e..72eeca3fdf 100644
--- a/xen/arch/arm/suspend.c
+++ b/xen/arch/arm/suspend.c
@@ -11,6 +11,8 @@
#include <asm/platform.h>
#include <public/sched.h>
+struct cpu_context cpu_context;
+
/* Reset values of VCPU architecture specific registers */
static void vcpu_arch_reset(struct vcpu *v)
{
@@ -170,9 +172,23 @@ static long system_suspend(void *data)
*/
update_boot_mapping(true);
- status = call_psci_system_suspend();
- if ( status )
- dprintk(XENLOG_ERR, "PSCI system suspend failed, err=%d\n", status);
+ if ( hyp_suspend(&cpu_context) )
+ {
+ status = call_psci_system_suspend();
+ /*
+ * If suspend is finalized properly by above system suspend PSCI call,
+ * the code below in this 'if' branch will never execute. Execution
+ * will continue from hyp_resume which is the hypervisor's resume point.
+ * In hyp_resume CPU context will be restored and since link-register is
+ * restored as well, it will appear to return from hyp_suspend. The
+ * difference in returning from hyp_suspend on system suspend versus
+ * resume is in function's return value: on suspend, the return value is
+ * a non-zero value, on resume it is zero. That is why the control flow
+ * will not re-enter this 'if' branch on resume.
+ */
+ if ( status )
+ dprintk(XENLOG_ERR, "PSCI system suspend failed, err=%d\n", status);
+ }
system_state = SYS_STATE_resume;
update_boot_mapping(false);
--
2.43.0
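Since the assembly in this patch walks the context with post-indexed
stores and loads, the save order is only correct as long as it matches the
field order of struct cpu_context. A minimal sketch of how that could be
checked at compile time is shown below. The struct is copied from the
patch; the static assertions are an addition for illustration and are not
part of the posted series, and __attribute__((aligned(16))) is used here in
place of Xen's __aligned(16) macro so the fragment builds stand-alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct cpu_context {
    uint64_t callee_regs[12];  /* x19..x28, x29, lr: six stp pairs */
    uint64_t sp;
    uint64_t vbar_el2;
    uint64_t vtcr_el2;
    uint64_t vttbr_el2;
    uint64_t tpidr_el2;
    uint64_t mdcr_el2;
    uint64_t hstr_el2;
    uint64_t cptr_el2;
    uint64_t hcr_el2;
} __attribute__((aligned(16)));

/* Six "stp ..., [x0], #16" advance x0 by 96 bytes before sp is stored. */
static_assert(offsetof(struct cpu_context, sp) == 96,
              "sp must follow the twelve callee-saved registers");

/* Each following "str x2, [x0], #8" advances x0 by 8 bytes. */
static_assert(offsetof(struct cpu_context, vbar_el2) == 104,
              "VBAR_EL2 is the first system register saved");
static_assert(offsetof(struct cpu_context, hcr_el2) == 160,
              "HCR_EL2 is the last system register saved");

/* Helper so the layout can also be inspected at run time. */
size_t cpu_context_size(void)
{
    return sizeof(struct cpu_context);
}
```

With such checks in place, reordering a field in the header without
updating hyp_suspend and hyp_resume would fail the build rather than
silently restoring the wrong register.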
* [PATCH 16/16] CHANGELOG: Mention Xen suspend/resume to RAM feature on arm64
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (14 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 15/16] xen/arm: Save/restore context on suspend/resume Mykola Kvach
@ 2025-03-05 9:11 ` Mykola Kvach
2025-03-11 10:33 ` Oleksii Kurochko
2025-03-05 13:14 ` [PATCH 00/16] Suspend to RAM support for Xen " Mykola Kvach
2025-03-11 22:35 ` Julien Grall
17 siblings, 1 reply; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 9:11 UTC (permalink / raw)
To: xen-devel; +Cc: Oleksii Kurochko, Community Manager, Mykola Kvach
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
CHANGELOG.md | 2 ++
1 file changed, 2 insertions(+)
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 04c21d5bce..489404fc8b 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
## [4.21.0 UNRELEASED](https://xenbits.xenproject.org/gitweb/?p=xen.git;a=shortlog;h=staging) - TBD
### Changed
+ - On Arm:
+ - Support for suspend/resume to/from RAM on arm64
### Added
--
2.43.0
* Re: [PATCH 00/16] Suspend to RAM support for Xen on arm64
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (15 preceding siblings ...)
2025-03-05 9:11 ` [PATCH 16/16] CHANGELOG: Mention Xen suspend/resume to RAM feature on arm64 Mykola Kvach
@ 2025-03-05 13:14 ` Mykola Kvach
2025-03-11 22:35 ` Julien Grall
17 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-05 13:14 UTC (permalink / raw)
To: xen-devel
Cc: Jan Beulich, Roger Pau Monné, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Stefano Stabellini, Dario Faggioli,
Juergen Gross, George Dunlap, Bertrand Marquis, Volodymyr Babchuk,
Oleksii Kurochko, Community Manager
On Wed, Mar 5, 2025 at 11:11 AM Mykola Kvach <xakep.amatop@gmail.com> wrote:
>
> This is V1 series from Mirela Simonovic. Ported to 4.16 and with added changes
> suggested here
> https://lore.kernel.org/all/CAKPH-NjmaZENb8gT=+FobrAycRF01_--6GuRA2ck9Di5wiudhA@mail.gmail.com
>
> This is V2 series from Mykyta Poturai:
> https://marc.info/?l=xen-devel&m=166514782207736&w=2
>
> This series introduces support for suspend-to-RAM (referred to as "suspend"
> in the following text) for Xen on ARM64. The primary focus of this patch series
> is to add Xen system suspend support. Previous patch series raised many
> questions regarding VCPU context restoration, so the necessary changes will be
> addressed in the next part of this series.
> As part of these changes, all domain flags and related code (which affected
> common functions like vcpu_unblock) have been removed, keeping only the
> essential modifications for Xen suspend/resume. Suspend/resume is now fully
> supported only for the hardware domain. Proper support for domU suspend/resume
> will be added in the next part of this patch series.
>
> The implementation is aligned with the design specification that has been
> proposed on xen-devel list:
> https://lists.xenproject.org/archives/html/xen-devel/2017-12/msg01574.html
>
> At a high-level the series contains:
> 1) Support for suspending guests via virtual PSCI SYSTEM_SUSPEND call
> 2) Support for resuming a guest on an interrupt targeted to that guest
> 3) Support for suspending Xen after dom0 finalizes the suspend
> 4) Support for resuming Xen on an interrupt that is targeted to a guest
>
> --------------------------------------------------------------------------------
> TODO:
0) Add suspend/resume handlers to IOMMU drivers (there aren’t any
problems with the current implementation because the domains used for
testing are thin, and this patch series implements only the very basic
logic)
> 1) Add VCPU context reset/restore for non-hardware domains.
> 2) Implement xl suspend/resume for Arm (should it follow the x86 approach?).
> 3) Support suspend/resume for GICv3.
> 4) Add suspend support for Arm32
>
> --------------------------------------------------------------------------------
> In more details:
>
> *** About suspending/resuming guests
>
> The patches included in this series allow PSCI compliant guests that have
> support for suspend to RAM (e.g. echo mem > /sys/power/state in Linux) to
> suspend and resume on top of Xen without any EL1 code changes.
>
> During their suspend procedure guests will hot-unplug their secondary CPUs,
> triggering Xen's virtual CPU_OFF PSCI implementation, and then finalize the
> suspend from their boot CPU, triggering Xen's virtual SYSTEM_SUSPEND PSCI.
> Guests will save/restore their own EL1 context on suspend/resume.
>
> A guest is expected to leave enabled interrupts that are considered to be its
> wake-up sources. Those interrupts will be able to wake up the guest. This holds
> regardless of the state of the underlying software layers, i.e. whether Xen gets
> suspended or not doesn't affect the ability of the guest to wake up.
>
> First argument of SYSTEM_SUSPEND PSCI call is a resume entry point, from which
> the guest assumes to start on resume. On resume, guests assume to be running in
> an environment whose state matches the CPU state after reset, e.g. with reset
> register values, MMU disabled, etc. To ensure this, Xen has to 'reset' the
> VCPU context and save the resume entry point into program counter before the
> guest's VCPU gets scheduled in on resume. This is done when the guest resumes.
> Xen also needs to take care that the guest's view of GIC and timer gets saved.
> Also, while a guest is suspended its watchdogs are paused, in order to avoid
> watchdog triggered shutdown of a guest that has been asleep for a period of time
> that is longer than the watchdog period.
>
> After this point, from Xen's point of view a suspended guest has one VCPU
> blocked, waiting for an interrupt. When such an interrupt comes, Xen will
> unblock the VCPU of the suspended domain, which results in the guest
> resuming.
>
> *** About suspending/resuming Xen
>
> Xen starts its own suspend procedure once dom0 is suspended. Dom0 is
> considered to be the decision maker for EL1 and EL2.
> On suspend, Xen will first freeze all domains. Then, Xen disables physical
> secondary CPUs, which leads to physical CPU_OFF to be called by each secondary
> CPU. After that Xen finalizes the suspend from the boot CPU.
>
> This consists of suspending the timer, i.e. suppressing its interrupts (we don't
> want to be woken up by a timer, there is no VCPU ready to be scheduled). Then
> the state of GIC is saved, console is suspended, and CPU context is saved. The
> saved context tells where Xen needs to continue execution on resume.
> Since Xen will resume with MMU disabled, the first thing to do in resume is to
> resume memory management in order to be able to access the context that needs to
> be restored (we know virtual address of the context data). Finally Xen calls
> SYSTEM_SUSPEND PSCI to the EL3.
>
> When resuming, all the steps done in suspend need to be reverted. This is
> completed by unblocking dom0's VCPU, because we always want the dom0 to
> resume, regardless of the target domain whose interrupt woke up Xen.
>
> *** Handling of unprivileged guests during Xen suspend/resume
>
> Any domU that is not suspended when dom0 suspends will be frozen, domUs that are
> already suspended remain suspended. On resume the suspended domUs still remain
> suspended (unless their wake interrupt caused Xen to wake) while the
> others will be thawed.
>
> For more details please refer to patches or the design specification:
> https://lists.xenproject.org/archives/html/xen-devel/2017-12/msg01574.html
>
> --------------------------------------------------------------------------------
> CHANGELOG
>
> In this cover letter and in the commit messages within the changelog section:
> - patch series V1 refers to https://marc.info/?l=xen-devel&m=154202231501850&w=2
> - patch series V2 refers to https://marc.info/?l=xen-devel&m=166514782207736&w=2
>
> Changes introduced in V3:
> - drop all domain flags and related code (which touched common functions like
> vcpu_unblock), keeping only the necessary changes for Xen suspend/resume,
> i.e. suspend/resume is now fully supported only for the hardware domain.
> Proper support for domU suspend/resume will be added in a future patch.
> This patch does not yet include VCPU context reset or domain context
> restoration in VCPU.
> - add checks before calling IOMMU suspend/resume. These functions may be
> unimplemented, so check that they exist before calling to prevent crashes
> - prevent disable_nonboot_cpus crash on ARM64. If we call disable_nonboot_cpus
> on ARM64 with system_state set to SYS_STATE_suspend, the ASSERT_ALLOC_CONTEXT
> assertion will be triggered
> - drop commit "timer: don't migrate timers during suspend" (see comment
> https://marc.info/?l=xen-devel&m=167036477229741&w=2). There is no freeze of
> dom0 on the latest master
> - drop the commit introduced in patch series V2: "xen: don't free percpu areas
> during suspend." This commit was ported from x86 code, but in the new master,
> the percpu CPU state change notification call chain has become common, so
> there is no reason to port this code. The remaining part, which does not
> belong to the aforementioned patch, "don't initialize percpu on resume," has
> been introduced in a new separate commit
> - introduce system suspend config option and covered code related to
> suspend/resume with it
> - implement suspend/resume calls for SCIF driver (it was needed for test
> purpose, code has been tested on R-Car H3 Starter Kit board)
> - the following commits have been folded into "xen/arm: Trigger Xen suspend
> when hardware domain completes suspend":
> -- xen/arm: Disable/enable non-boot physical CPUs on suspend/resume
> -- xen/arm: Add rcu_barrier() before enabling non-boot CPUs on resume
> -- xen/arm: Suspend/resume GIC on system suspend/resume
> -- xen/arm: Resume Dom0 after Xen resumes
> -- xen/arm: Suspend/resume console on Xen suspend/resume
> - add iommu suspend/resume calls to system suspend/resume
> - return PSCI_NOT_SUPPORTED instead of 1 in case when we call SYSTEM_SUSPEND
> on ARM32
> - add checking of PSCI version for SYSTEM_SUSPEND call
> - instead of using create_page_tables, enable_mmu, and mmu_init_secondary_cpu,
> the existing function enable_secondary_cpu_mm is now used
> - prepare_secondary_mm (previously init_secondary_pagetables in the previous
> patch series) is now called at the end of start_xen instead of
> setup_pagetables. Calling it in the previous location caused a crash
> - add early printk init during resume
>
>
> Changes introduced in V2:
> - drop commit "xen/arm: Move code that initializes VCPU context into a separate
> function" (see comment https://marc.info/?l=xen-devel&m=154202861704014&w=2)
> - introduce a separate struct for watchdog timers (see comment
> https://marc.info/?l=xen-devel&m=154203624106684&w=2)
> - don't initialize percpu on resume, it was a part of "xen: don't free percpu
> areas during suspend"
> - drop the call to watchdog_domain_resume from ctxt_switch_to; drop suspended
> field from timer structure introduced for watchdog timer in prev series
> - move hyp_resume to head.S to place it near the rest of the start code
> - simplify the code in hyp_resume by using existing functions such as
> check_cpu_mode, cpu_init, create_page_tables, and enable_mmu
> - a lot of changes related to resetting/restoring VCPU context of suspended domU
>
> Mirela Simonovic (9):
> xen/x86: Move freeze/thaw_domains into common files
> xen/arm: introduce a separate struct for watchdog timers
> xen/arm: add suspend and resume timer helpers
> xen/arm: Implement GIC suspend/resume functions (gicv2 only)
> xen/arm: Implement PSCI system suspend
> xen/arm: Trigger Xen suspend when hardware domain completes suspend
> xen/arm: Implement PSCI SYSTEM_SUSPEND call (physical interface)
> xen/arm: Resume memory management on Xen resume
> xen/arm: Save/restore context on suspend/resume
>
> Mykola Kvach (6):
> xen/cpu: prevent disable_nonboot_cpus crash on ARM64
> xen/percpu: don't initialize percpu on resume
> xen/arm: Introduce system suspend config option
> xen/char: implement suspend/resume calls for SCIF driver
> xen/arm: add watchdog domain suspend/resume helpers
> CHANGELOG: Mention Xen suspend/resume to RAM feature on arm64
>
> Mykyta Poturai (1):
> iommu: Add checks before calling iommu suspend/resume
>
> CHANGELOG.md | 2 +
> xen/arch/arm/Kconfig | 11 +
> xen/arch/arm/Makefile | 1 +
> xen/arch/arm/arm64/head.S | 117 ++++++++++
> xen/arch/arm/gic-v2.c | 142 ++++++++++++
> xen/arch/arm/gic.c | 29 +++
> xen/arch/arm/include/asm/domain.h | 3 +
> xen/arch/arm/include/asm/gic.h | 12 +
> xen/arch/arm/include/asm/perfc_defn.h | 1 +
> xen/arch/arm/include/asm/psci.h | 3 +
> xen/arch/arm/include/asm/suspend.h | 41 ++++
> xen/arch/arm/include/asm/time.h | 5 +
> xen/arch/arm/psci.c | 19 ++
> xen/arch/arm/setup.c | 8 +
> xen/arch/arm/suspend.c | 320 ++++++++++++++++++++++++++
> xen/arch/arm/time.c | 26 +++
> xen/arch/arm/vpsci.c | 32 +++
> xen/arch/x86/acpi/power.c | 29 ---
> xen/common/cpu.c | 43 ++++
> xen/common/domain.c | 30 +++
> xen/common/keyhandler.c | 2 +-
> xen/common/percpu.c | 3 +-
> xen/common/sched/core.c | 50 +++-
> xen/drivers/char/scif-uart.c | 31 ++-
> xen/drivers/passthrough/iommu.c | 4 +-
> xen/include/xen/sched.h | 15 +-
> xen/include/xen/watchdog.h | 6 +
> 27 files changed, 945 insertions(+), 40 deletions(-)
> create mode 100644 xen/arch/arm/include/asm/suspend.h
> create mode 100644 xen/arch/arm/suspend.c
>
> --
> 2.43.0
>
* Re: [PATCH 00/16] Suspend to RAM support for Xen on arm64
2025-03-05 9:11 [PATCH 00/16] Suspend to RAM support for Xen on arm64 Mykola Kvach
` (16 preceding siblings ...)
2025-03-05 13:14 ` [PATCH 00/16] Suspend to RAM support for Xen " Mykola Kvach
@ 2025-03-11 22:35 ` Julien Grall
2025-03-19 12:00 ` Mykola Kvach
17 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2025-03-11 22:35 UTC (permalink / raw)
To: Mykola Kvach, xen-devel
Cc: Jan Beulich, Roger Pau Monné, Andrew Cooper, Anthony PERARD,
Michal Orzel, Stefano Stabellini, Dario Faggioli, Juergen Gross,
George Dunlap, Bertrand Marquis, Volodymyr Babchuk,
Oleksii Kurochko, Community Manager
Hi,
On 05/03/2025 09:11, Mykola Kvach wrote:
> This is V1 series from Mirela Simonovic. Ported to 4.16 and with added changes
> suggested here
> https://lore.kernel.org/all/CAKPH-NjmaZENb8gT=+FobrAycRF01_--6GuRA2ck9Di5wiudhA@mail.gmail.com
>
> This is V2 series from Mykyta Poturai:
> https://marc.info/?l=xen-devel&m=166514782207736&w=2
>
> This series introduces support for suspend-to-RAM (referred to as "suspend"
> in the following text) for Xen on ARM64. The primary focus of this patch series
> is to add Xen system suspend support. Previous patch series raised many
> questions regarding VCPU context restoration, so the necessary changes will be
> addressed in the next part of this series.
I can't exactly remember the details. But from what you wrote here...
> As part of these changes, all domain flags and related code (which affected
> common functions like vcpu_unblock) have been removed, keeping only the
> essential modifications for Xen suspend/resume. Suspend/resume is now fully
> supported only for the hardware domain.
... it is not clear how suspend/resume would work even for the hardware
domain. Can you clarify?
> Proper support for domU suspend/resume
> will be added in the next part of this patch series.
>
> The implementation is aligned with the design specification that has been
> proposed on xen-devel list:
> https://lists.xenproject.org/archives/html/xen-devel/2017-12/msg01574.html
>
> At a high-level the series contains:
> 1) Support for suspending guests via virtual PSCI SYSTEM_SUSPEND call
This contradicts what you wrote above. If the call is not meant to
work for guests, then the call should be forbidden.
> 2) Support for resuming a guest on an interrupt targeted to that guest
> 3) Support for suspending Xen after dom0 finalizes the suspend
> 4) Support for resuming Xen on an interrupt that is targeted to a guest
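On the remark above that the virtual SYSTEM_SUSPEND call should be
forbidden for guests: a minimal sketch of such a check could look like the
following. The struct, its fields, and the function name are hypothetical
stand-ins rather than the code from this series; only the PSCI return
values come from the Arm PSCI specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* PSCI return codes as defined by the Arm PSCI specification. */
#define PSCI_SUCCESS         0
#define PSCI_NOT_SUPPORTED (-1)

/* Hypothetical stand-in for Xen's struct domain; only what the sketch needs. */
struct domain {
    unsigned int domain_id;
    bool is_hardware;   /* stand-in for Xen's is_hardware_domain(d) */
};

/*
 * Hypothetical handler for the virtual SYSTEM_SUSPEND call: any domain other
 * than the hardware domain is refused, since vCPU context reset/restore for
 * other domains is still listed as a TODO.
 */
static int32_t vpsci_system_suspend_sketch(const struct domain *d)
{
    if ( !d->is_hardware )
        return PSCI_NOT_SUPPORTED;

    /* The actual suspend work for the hardware domain would start here. */
    return PSCI_SUCCESS;
}
```

Whether PSCI_NOT_SUPPORTED or another error code is the better answer for a
domU is a design choice the sketch does not settle.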
>
> --------------------------------------------------------------------------------
> TODO:
> 1) Add VCPU context reset/restore for non-hardware domains.
> 2) Implement xl suspend/resume for Arm (should it follow the x86 approach?).
> 3) Support suspend/resume for GICv3.
> 4) Add suspend support for Arm32.
>
> --------------------------------------------------------------------------------
> In more details:
>
> *** About suspending/resuming guests
>
> The patches included in this series allow PSCI compliant guests that have
> support for suspend to RAM (e.g. echo mem > /sys/power/state in Linux) to
> suspend and resume on top of Xen without any EL1 code changes.
>
> During their suspend procedure guests will hot-unplug their secondary CPUs,
> triggering Xen's virtual CPU_OFF PSCI implementation, and then finalize the
> suspend from their boot CPU, triggering Xen's virtual SYSTEM_SUSPEND PSCI.
> Guests will save/restore their own EL1 context on suspend/resume.
>
> A guest is expected to leave enabled interrupts that are considered to be its
> wake-up sources. Those interrupts will be able to wake up the guest. This holds
> regardless of the state of the underlying software layers, i.e. whether Xen gets
> suspended or not doesn't affect the ability of the guest to wake up.
>
> First argument of SYSTEM_SUSPEND PSCI call is a resume entry point, from which
> the guest assumes to start on resume. On resume, guests assume to be running in
> an environment whose state matches the CPU state after reset, e.g. with reset
> register values, MMU disabled, etc. To ensure this, Xen has to 'reset' the
> VCPU context and save the resume entry point into program counter before the
> guest's VCPU gets scheduled in on resume. This is done when the guest resumes.
> Xen also needs to take care that the guest's view of GIC and timer gets saved.
> Also, while a guest is suspended its watchdogs are paused, in order to avoid
> watchdog triggered shutdown of a guest that has been asleep for a period of time
> that is longer than the watchdog period.
>
> After this point, from Xen's point of view a suspended guest has one VCPU
> blocked, waiting for an interrupt. When such an interrupt comes, Xen will
> unblock the VCPU of the suspended domain, which results in the guest
> resuming.
>
> *** About suspending/resuming Xen
>
> Xen starts its own suspend procedure once dom0 is suspended. Dom0 is
> considered to be the decision maker for EL1 and EL2.
> On suspend, Xen will first freeze all domains. Then, Xen disables physical
> secondary CPUs, which leads to physical CPU_OFF to be called by each secondary
> CPU. After that Xen finalizes the suspend from the boot CPU.
>
> This consists of suspending the timer, i.e. suppressing its interrupts (we don't
> want to be woken up by a timer, there is no VCPU ready to be scheduled). Then
> the state of GIC is saved, console is suspended, and CPU context is saved. The
> saved context tells where Xen needs to continue execution on resume.
> Since Xen will resume with MMU disabled, the first thing to do in resume is to
> resume memory management in order to be able to access the context that needs to
> be restored (we know virtual address of the context data). Finally Xen calls
> SYSTEM_SUSPEND PSCI to the EL3.
>
> When resuming, all the steps done in suspend need to be reverted. This is
> completed by unblocking dom0's VCPU, because we always want the dom0 to resume,
> regardless of the target domain whose interrupt woke up Xen.
>
> *** Handling of unprivileged guests during Xen suspend/resume
>
> Any domU that is not suspended when dom0 suspends will be frozen, domUs that are
> already suspended remain suspended. On resume the suspended domUs still remain
> suspended (unless their wake interrupt caused Xen to wake) while the
> others will be thawed.
>
> For more details please refer to patches or the design specification:
> https://lists.xenproject.org/archives/html/xen-devel/2017-12/msg01574.html
>
> --------------------------------------------------------------------------------
> CHANGELOG
>
> In this cover letter and in the commit messages within the changelog section:
> - patch series V1 refers to https://marc.info/?l=xen-devel&m=154202231501850&w=2
> - patch series V2 refers to https://marc.info/?l=xen-devel&m=166514782207736&w=2
>
> Changes introduced in V3:
So this series is v3?
> Mirela Simonovic (9):
> xen/x86: Move freeze/thaw_domains into common files
> xen/arm: introduce a separate struct for watchdog timers
> xen/arm: add suspend and resume timer helpers
> xen/arm: Implement GIC suspend/resume functions (gicv2 only)
> xen/arm: Implement PSCI system suspend
> xen/arm: Trigger Xen suspend when hardware domain completes suspend
> xen/arm: Implement PSCI SYSTEM_SUSPEND call (physical interface)
> xen/arm: Resume memory management on Xen resume
> xen/arm: Save/restore context on suspend/resume
>
> Mykola Kvach (6):
> xen/cpu: prevent disable_nonboot_cpus crash on ARM64
> xen/percpu: don't initialize percpu on resume
> xen/arm: Introduce system suspend config option
> xen/char: implement suspend/resume calls for SCIF driver
> xen/arm: add watchdog domain suspend/resume helpers
> CHANGELOG: Mention Xen suspend/resume to RAM feature on arm64
>
> Mykyta Poturai (1):
> iommu: Add checks before calling iommu suspend/resume
This series is quite large and complex to review. I am wondering if it
would make sense to split it into smaller chunks so it is quicker to
review/merge. One split I can think of is:
* disabling CPU (could be tested using the hotplug hypercall)
* guest suspend/resume (could be tested using xl suspend/resume)
* System suspend/resume
>
> CHANGELOG.md | 2 +
> xen/arch/arm/Kconfig | 11 +
> xen/arch/arm/Makefile | 1 +
> xen/arch/arm/arm64/head.S | 117 ++++++++++
> xen/arch/arm/gic-v2.c | 142 ++++++++++++
> xen/arch/arm/gic.c | 29 +++
> xen/arch/arm/include/asm/domain.h | 3 +
> xen/arch/arm/include/asm/gic.h | 12 +
> xen/arch/arm/include/asm/perfc_defn.h | 1 +
> xen/arch/arm/include/asm/psci.h | 3 +
> xen/arch/arm/include/asm/suspend.h | 41 ++++
> xen/arch/arm/include/asm/time.h | 5 +
> xen/arch/arm/psci.c | 19 ++
> xen/arch/arm/setup.c | 8 +
> xen/arch/arm/suspend.c | 320 ++++++++++++++++++++++++++
> xen/arch/arm/time.c | 26 +++
> xen/arch/arm/vpsci.c | 32 +++
> xen/arch/x86/acpi/power.c | 29 ---
> xen/common/cpu.c | 43 ++++
> xen/common/domain.c | 30 +++
> xen/common/keyhandler.c | 2 +-
> xen/common/percpu.c | 3 +-
> xen/common/sched/core.c | 50 +++-
> xen/drivers/char/scif-uart.c | 31 ++-
> xen/drivers/passthrough/iommu.c | 4 +-
> xen/include/xen/sched.h | 15 +-
> xen/include/xen/watchdog.h | 6 +
You also want to update SUPPORT.md for the two/three new features. They
probably want to be experimental until you fix everything mentioned in
the cover letter (aside from maybe CPU off, which can be tech preview).
> 27 files changed, 945 insertions(+), 40 deletions(-)
> create mode 100644 xen/arch/arm/include/asm/suspend.h
> create mode 100644 xen/arch/arm/suspend.c
>
Cheers,
--
Julien Grall
* Re: [PATCH 00/16] Suspend to RAM support for Xen on arm64
2025-03-11 22:35 ` Julien Grall
@ 2025-03-19 12:00 ` Mykola Kvach
0 siblings, 0 replies; 69+ messages in thread
From: Mykola Kvach @ 2025-03-19 12:00 UTC (permalink / raw)
To: Julien Grall
Cc: xen-devel, Jan Beulich, Roger Pau Monné, Andrew Cooper,
Anthony PERARD, Michal Orzel, Stefano Stabellini, Dario Faggioli,
Juergen Gross, George Dunlap, Bertrand Marquis, Volodymyr Babchuk,
Oleksii Kurochko, Community Manager
Hi,
On Wed, Mar 12, 2025 at 12:36 AM Julien Grall <julien@xen.org> wrote:
>
> Hi,
>
> On 05/03/2025 09:11, Mykola Kvach wrote:
> > This is V1 series from Mirela Simonovic. Ported to 4.16 and with added changes
> > suggested here
> > https://lore.kernel.org/all/CAKPH-NjmaZENb8gT=+FobrAycRF01_--6GuRA2ck9Di5wiudhA@mail.gmail.com
> >
> > This is V2 series from Mykyta Poturai:
> > https://marc.info/?l=xen-devel&m=166514782207736&w=2
> >
> > This series introduces support for suspend-to-RAM (referred to as "suspend"
> > in the following text) for Xen on ARM64. The primary focus of this patch series
> > is to add Xen system suspend support. Previous patch series raised many
> > questions regarding VCPU context restoration, so the necessary changes will be
> > addressed in the next part of this series.
>
> I can't exactly remember the details. But from what you wrote here...
>
> > As part of these changes, all domain flags and related code (which affected
> > common functions like vcpu_unblock) have been removed, keeping only the
> > essential modifications for Xen suspend/resume. Suspend/resume is now fully
> > supported only for the hardware domain.
>
> ... it is not clear how suspend/resume would work even for the hardware
> domain. Can you clarify?
It can work on a simple setup with a thin Dom0 that has only basic
resources such as CPUs, a virtual console, the GIC, and a timer, without
requiring additional hardware. vcpu_unblock() is called directly when
resuming the hardware domain.
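
As a rough illustration of that last point, the tail of the resume path can
be modelled as below. This is only a sketch under simplifying assumptions:
the types and function names are hypothetical, reduced stand-ins for Xen's
vcpu/domain structures and vcpu_unblock(), not the code from this series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for Xen's vcpu and domain. */
struct vcpu_sketch {
    bool blocked;            /* waiting for a wake-up interrupt */
};

struct domain_sketch {
    bool is_hardware;        /* true for the hardware domain (dom0) */
    bool frozen;             /* paused by Xen for the system suspend */
    struct vcpu_sketch vcpu0;
};

/* Stand-in for vcpu_unblock(): make the vCPU runnable again. */
static void vcpu_unblock_sketch(struct vcpu_sketch *v)
{
    v->blocked = false;
}

/*
 * Tail of a hypothetical Xen resume path: thaw every domain that was frozen
 * for the system suspend, then unblock the hardware domain's vCPU directly.
 * A domU that suspended itself keeps its vCPU blocked until its own wake-up
 * interrupt arrives.
 */
static void system_resume_tail_sketch(struct domain_sketch *doms, size_t n)
{
    for ( size_t i = 0; i < n; i++ )
    {
        doms[i].frozen = false;

        if ( doms[i].is_hardware )
            vcpu_unblock_sketch(&doms[i].vcpu0);
    }
}
```

The hardware domain is unblocked unconditionally, whichever domain's
interrupt woke Xen, which matches the behaviour described in the cover
letter.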
>
> > Proper support for domU suspend/resume
> > will be added in the next part of this patch series.
> >
> > The implementation is aligned with the design specification that has been
> > proposed on xen-devel list:
> > https://lists.xenproject.org/archives/html/xen-devel/2017-12/msg01574.html
> >
> > At a high-level the series contains:
> > 1) Support for suspending guests via virtual PSCI SYSTEM_SUSPEND call
>
> This contradicts what you wrote above. If the call is not meant to
> work for guests, then the call should be forbidden.
Oops! It seems that not all parts of the cover letter were updated
after the last changes; I used the previous version as a base. I'll fix
it. Thank you for pointing that out.
>
> > 2) Support for resuming a guest on an interrupt targeted to that guest
> > 3) Support for suspending Xen after dom0 finalizes the suspend
> > 4) Support for resuming Xen on an interrupt that is targeted to a guest
> >
> > --------------------------------------------------------------------------------
> > TODO:
> > 1) Add VCPU context reset/restore for non-hardware domains.
> > 2) Implement xl suspend/resume for Arm (should it follow the x86 approach?).
> > 3) Support suspend/resume for GICv3.
> > 4) Add suspend support for Arm32.
> >
> > --------------------------------------------------------------------------------
> > In more details:
> >
> > *** About suspending/resuming guests
> >
> > The patches included in this series allow PSCI compliant guests that have
> > support for suspend to RAM (e.g. echo mem > /sys/power/state in Linux) to
> > suspend and resume on top of Xen without any EL1 code changes.
> >
> > During their suspend procedure guests will hot-unplug their secondary CPUs,
> > triggering Xen's virtual CPU_OFF PSCI implementation, and then finalize the
> > suspend from their boot CPU, triggering Xen's virtual SYSTEM_SUSPEND PSCI.
> > Guests will save/restore their own EL1 context on suspend/resume.
> >
> > A guest is expected to leave enabled interrupts that are considered to be its
> > wake-up sources. Those interrupts will be able to wake up the guest. This holds
> > regardless of the state of the underlying software layers, i.e. whether Xen gets
> > suspended or not doesn't affect the ability of the guest to wake up.
> >
> > First argument of SYSTEM_SUSPEND PSCI call is a resume entry point, from which
> > the guest assumes to start on resume. On resume, guests assume to be running in
> > an environment whose state matches the CPU state after reset, e.g. with reset
> > register values, MMU disabled, etc. To ensure this, Xen has to 'reset' the
> > VCPU context and save the resume entry point into program counter before the
> > guest's VCPU gets scheduled in on resume. This is done when the guest resumes.
> > Xen also needs to take care that the guest's view of GIC and timer gets saved.
> > Also, while a guest is suspended its watchdogs are paused, in order to avoid
> > watchdog triggered shutdown of a guest that has been asleep for a period of time
> > that is longer than the watchdog period.
> >
> > After this point, from Xen's point of view a suspended guest has one VCPU
> > blocked, waiting for an interrupt. When such an interrupt comes, Xen will
> > unblock the VCPU of the suspended domain, which results in the guest
> > resuming.
> >
> > *** About suspending/resuming Xen
> >
> > Xen starts its own suspend procedure once dom0 is suspended. Dom0 is
> > considered to be the decision maker for EL1 and EL2.
> > On suspend, Xen will first freeze all domains. Then, Xen disables physical
> > secondary CPUs, which leads to physical CPU_OFF to be called by each secondary
> > CPU. After that Xen finalizes the suspend from the boot CPU.
> >
> > This consists of suspending the timer, i.e. suppressing its interrupts (we don't
> > want to be woken up by a timer, there is no VCPU ready to be scheduled). Then
> > the state of GIC is saved, console is suspended, and CPU context is saved. The
> > saved context tells where Xen needs to continue execution on resume.
> > Since Xen will resume with MMU disabled, the first thing to do in resume is to
> > resume memory management in order to be able to access the context that needs to
> > be restored (we know virtual address of the context data). Finally Xen calls
> > SYSTEM_SUSPEND PSCI to the EL3.
> >
> > When resuming, all the steps done in suspend need to be reverted. This is
> > completed by unblocking dom0's VCPU, because we always want the dom0 to resume,
> > regardless of the target domain whose interrupt woke up Xen.
> >
> > *** Handling of unprivileged guests during Xen suspend/resume
> >
> > Any domU that is not suspended when dom0 suspends will be frozen, domUs that are
> > already suspended remain suspended. On resume the suspended domUs still remain
> > suspended (unless their wake interrupt caused Xen to wake) while the
> > others will be thawed.
> >
> > For more details please refer to patches or the design specification:
> > https://lists.xenproject.org/archives/html/xen-devel/2017-12/msg01574.html
> >
> > --------------------------------------------------------------------------------
> > CHANGELOG
> >
> > In this cover letter and in the commit messages within the changelog section:
> > - patch series V1 refers to https://marc.info/?l=xen-devel&m=154202231501850&w=2
> > - patch series V2 refers to https://marc.info/?l=xen-devel&m=166514782207736&w=2
> >
> > Changes introduced in V3:
>
> So this series is v3?
Yes, it's version 3, but I didn't add the version tag because years
passed between these three versions, and the previous version of the
patch series didn't include the correct tag either. If you'd like, I can
add the correct version tag in the next update of this patch series.
>
> > Mirela Simonovic (9):
> > xen/x86: Move freeze/thaw_domains into common files
> > xen/arm: introduce a separate struct for watchdog timers
> > xen/arm: add suspend and resume timer helpers
> > xen/arm: Implement GIC suspend/resume functions (gicv2 only)
> > xen/arm: Implement PSCI system suspend
> > xen/arm: Trigger Xen suspend when hardware domain completes suspend
> > xen/arm: Implement PSCI SYSTEM_SUSPEND call (physical interface)
> > xen/arm: Resume memory management on Xen resume
> > xen/arm: Save/restore context on suspend/resume
> >
> > Mykola Kvach (6):
> > xen/cpu: prevent disable_nonboot_cpus crash on ARM64
> > xen/percpu: don't initialize percpu on resume
> > xen/arm: Introduce system suspend config option
> > xen/char: implement suspend/resume calls for SCIF driver
> > xen/arm: add watchdog domain suspend/resume helpers
> > CHANGELOG: Mention Xen suspend/resume to RAM feature on arm64
> >
> > Mykyta Poturai (1):
> > iommu: Add checks before calling iommu suspend/resume
>
> This series is quite large and complex to review. I am wondering if it
> would make sense to split it into smaller chunks so it is quicker to
> review/merge. One split I can think of is:
>
> * disabling CPU (could be tested using the hotplug hypercall)
> * guest suspend/resume (could be tested using xl suspend/resume)
> * System suspend/resume
Okay, I'll split this patch series into a few parts.
>
> >
> > CHANGELOG.md | 2 +
> > xen/arch/arm/Kconfig | 11 +
> > xen/arch/arm/Makefile | 1 +
> > xen/arch/arm/arm64/head.S | 117 ++++++++++
> > xen/arch/arm/gic-v2.c | 142 ++++++++++++
> > xen/arch/arm/gic.c | 29 +++
> > xen/arch/arm/include/asm/domain.h | 3 +
> > xen/arch/arm/include/asm/gic.h | 12 +
> > xen/arch/arm/include/asm/perfc_defn.h | 1 +
> > xen/arch/arm/include/asm/psci.h | 3 +
> > xen/arch/arm/include/asm/suspend.h | 41 ++++
> > xen/arch/arm/include/asm/time.h | 5 +
> > xen/arch/arm/psci.c | 19 ++
> > xen/arch/arm/setup.c | 8 +
> > xen/arch/arm/suspend.c | 320 ++++++++++++++++++++++++++
> > xen/arch/arm/time.c | 26 +++
> > xen/arch/arm/vpsci.c | 32 +++
> > xen/arch/x86/acpi/power.c | 29 ---
> > xen/common/cpu.c | 43 ++++
> > xen/common/domain.c | 30 +++
> > xen/common/keyhandler.c | 2 +-
> > xen/common/percpu.c | 3 +-
> > xen/common/sched/core.c | 50 +++-
> > xen/drivers/char/scif-uart.c | 31 ++-
> > xen/drivers/passthrough/iommu.c | 4 +-
> > xen/include/xen/sched.h | 15 +-
> > xen/include/xen/watchdog.h | 6 +
>
> You also want to update SUPPORT.md for the two/three new features. They
> probably want to be experimental until you fix everything mentioned in
> the cover letter (aside from maybe CPU off, which can be tech preview).
Got it, I'll make the necessary updates.
>
> > 27 files changed, 945 insertions(+), 40 deletions(-)
> > create mode 100644 xen/arch/arm/include/asm/suspend.h
> > create mode 100644 xen/arch/arm/suspend.c
> >
>
> Cheers,
>
> --
> Julien Grall
>
Best regards,
Mykola