linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] PM / Sleep: fix recovery during s2ram/hibernation
@ 2014-10-24  7:43 Imre Deak
  2014-10-24  7:59 ` [PATCH v2] " Imre Deak
  0 siblings, 1 reply; 8+ messages in thread
From: Imre Deak @ 2014-10-24  7:43 UTC (permalink / raw)
  To: Rafael J. Wysocki, Pavel Machek, linux-pm, linux-kernel

Atm, if one of the dev_pm_ops::freeze callbacks fails during the QUIESCE
phase we don't rollback things correctly calling the thaw and complete
callbacks. This could leave some devices in a suspended state in case of
an error during resuming from hibernation.

Also if an asynchronous suspend_late or freeze_late callback fails
during the SUSPEND, FREEZE or QUIESCE phases we don't propagate the
corresponding error correctly, in effect ignoring the error and
continuing the suspend-to-ram/hibernation. During suspend-to-ram this
could leave some devices without a valid saved context, leading to a
failure to reinitialize them during resume. During hibernation this
could leave some devices active interfeering with the creation /
restoration of the hibernation image. Also this could leave the
corresponding devices without a valid saved context and failure to
reinitialize them during resume.

Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/base/power/main.c | 2 ++
 kernel/power/hibernate.c  | 4 +++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 4497319..9717d5f 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1266,6 +1266,8 @@ int dpm_suspend_late(pm_message_t state)
 	}
 	mutex_unlock(&dpm_list_mtx);
 	async_synchronize_full();
+	if (!error)
+		error = async_error;
 	if (error) {
 		suspend_stats.failed_suspend_late++;
 		dpm_save_failed_step(SUSPEND_SUSPEND_LATE);
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index a9dfa79..05768d5 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -502,8 +502,10 @@ int hibernation_restore(int platform_mode)
 	error = dpm_suspend_start(PMSG_QUIESCE);
 	if (!error) {
 		error = resume_target_kernel(platform_mode);
-		dpm_resume_end(PMSG_RECOVER);
+		BUG_ON(!error);
 	}
+	if (error)
+		dpm_resume_end(PMSG_RECOVER);
 	pm_restore_gfp_mask();
 	resume_console();
 	pm_restore_console();
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2] PM / Sleep: fix recovery during s2ram/hibernation
  2014-10-24  7:43 [PATCH] PM / Sleep: fix recovery during s2ram/hibernation Imre Deak
@ 2014-10-24  7:59 ` Imre Deak
  2014-10-24 14:04   ` Rafael J. Wysocki
                     ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Imre Deak @ 2014-10-24  7:59 UTC (permalink / raw)
  To: Rafael J. Wysocki, Pavel Machek, linux-pm, linux-kernel

Atm, if one of the dev_pm_ops::freeze callbacks fails during the QUIESCE
phase we don't rollback things correctly calling the thaw and complete
callbacks. This could leave some devices in a suspended state in case of
an error during resuming from hibernation.

Also if an asynchronous suspend_late or freeze_late callback fails
during the SUSPEND, FREEZE or QUIESCE phases we don't propagate the
corresponding error correctly, in effect ignoring the error and
continuing the suspend-to-ram/hibernation. During suspend-to-ram this
could leave some devices without a valid saved context, leading to a
failure to reinitialize them during resume. During hibernation this
could leave some devices active interfeering with the creation /
restoration of the hibernation image. Also this could leave the
corresponding devices without a valid saved context and failure to
reinitialize them during resume.

v2:
- call dpm_resume_end() unconditionally, it's guaranteed that error
  is non-zero

Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/base/power/main.c | 2 ++
 kernel/power/hibernate.c  | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 4497319..9717d5f 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1266,6 +1266,8 @@ int dpm_suspend_late(pm_message_t state)
 	}
 	mutex_unlock(&dpm_list_mtx);
 	async_synchronize_full();
+	if (!error)
+		error = async_error;
 	if (error) {
 		suspend_stats.failed_suspend_late++;
 		dpm_save_failed_step(SUSPEND_SUSPEND_LATE);
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index a9dfa79..594ee7f 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -502,8 +502,9 @@ int hibernation_restore(int platform_mode)
 	error = dpm_suspend_start(PMSG_QUIESCE);
 	if (!error) {
 		error = resume_target_kernel(platform_mode);
-		dpm_resume_end(PMSG_RECOVER);
+		BUG_ON(!error);
 	}
+	dpm_resume_end(PMSG_RECOVER);
 	pm_restore_gfp_mask();
 	resume_console();
 	pm_restore_console();
-- 
1.8.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] PM / Sleep: fix recovery during s2ram/hibernation
  2014-10-24  7:59 ` [PATCH v2] " Imre Deak
@ 2014-10-24 14:04   ` Rafael J. Wysocki
  2014-10-24 14:17     ` Imre Deak
  2014-10-24 17:29   ` [PATCH v3 1/2] PM / Sleep: fix async suspend_late/freeze_late error handling Imre Deak
  2014-10-24 17:29   ` [PATCH v3 2/2] PM / Sleep: fix recovery during resuming from hibernation Imre Deak
  2 siblings, 1 reply; 8+ messages in thread
From: Rafael J. Wysocki @ 2014-10-24 14:04 UTC (permalink / raw)
  To: Imre Deak; +Cc: Pavel Machek, linux-pm, linux-kernel

On Friday, October 24, 2014 10:59:09 AM Imre Deak wrote:
> Atm, if one of the dev_pm_ops::freeze callbacks fails during the QUIESCE
> phase we don't rollback things correctly calling the thaw and complete
> callbacks. This could leave some devices in a suspended state in case of
> an error during resuming from hibernation.
> 
> Also if an asynchronous suspend_late or freeze_late callback fails
> during the SUSPEND, FREEZE or QUIESCE phases we don't propagate the
> corresponding error correctly, in effect ignoring the error and
> continuing the suspend-to-ram/hibernation. During suspend-to-ram this
> could leave some devices without a valid saved context, leading to a
> failure to reinitialize them during resume. During hibernation this
> could leave some devices active interfeering with the creation /
> restoration of the hibernation image. Also this could leave the
> corresponding devices without a valid saved context and failure to
> reinitialize them during resume.
> 
> v2:
> - call dpm_resume_end() unconditionally, it's guaranteed that error
>   is non-zero
> 
> Signed-off-by: Imre Deak <imre.deak@intel.com>

These are two unrelated fixes, so please send them as two patches.

> ---
>  drivers/base/power/main.c | 2 ++
>  kernel/power/hibernate.c  | 3 ++-
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> index 4497319..9717d5f 100644
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -1266,6 +1266,8 @@ int dpm_suspend_late(pm_message_t state)
>  	}
>  	mutex_unlock(&dpm_list_mtx);
>  	async_synchronize_full();
> +	if (!error)
> +		error = async_error;
>  	if (error) {
>  		suspend_stats.failed_suspend_late++;
>  		dpm_save_failed_step(SUSPEND_SUSPEND_LATE);

So the above will be the first fix and the below will be the second one.

> diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> index a9dfa79..594ee7f 100644
> --- a/kernel/power/hibernate.c
> +++ b/kernel/power/hibernate.c
> @@ -502,8 +502,9 @@ int hibernation_restore(int platform_mode)
>  	error = dpm_suspend_start(PMSG_QUIESCE);
>  	if (!error) {
>  		error = resume_target_kernel(platform_mode);
> -		dpm_resume_end(PMSG_RECOVER);
> +		BUG_ON(!error);

Why BUG_ON()?  Is crashing the kernel necessary here?

>  	}
> +	dpm_resume_end(PMSG_RECOVER);
>  	pm_restore_gfp_mask();
>  	resume_console();
>  	pm_restore_console();
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] PM / Sleep: fix recovery during s2ram/hibernation
  2014-10-24 14:04   ` Rafael J. Wysocki
@ 2014-10-24 14:17     ` Imre Deak
  2014-10-24 14:54       ` Rafael J. Wysocki
  0 siblings, 1 reply; 8+ messages in thread
From: Imre Deak @ 2014-10-24 14:17 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Pavel Machek, linux-pm, linux-kernel

On Fri, 2014-10-24 at 16:04 +0200, Rafael J. Wysocki wrote:
> On Friday, October 24, 2014 10:59:09 AM Imre Deak wrote:
> > Atm, if one of the dev_pm_ops::freeze callbacks fails during the QUIESCE
> > phase we don't rollback things correctly calling the thaw and complete
> > callbacks. This could leave some devices in a suspended state in case of
> > an error during resuming from hibernation.
> > 
> > Also if an asynchronous suspend_late or freeze_late callback fails
> > during the SUSPEND, FREEZE or QUIESCE phases we don't propagate the
> > corresponding error correctly, in effect ignoring the error and
> > continuing the suspend-to-ram/hibernation. During suspend-to-ram this
> > could leave some devices without a valid saved context, leading to a
> > failure to reinitialize them during resume. During hibernation this
> > could leave some devices active interfeering with the creation /
> > restoration of the hibernation image. Also this could leave the
> > corresponding devices without a valid saved context and failure to
> > reinitialize them during resume.
> > 
> > v2:
> > - call dpm_resume_end() unconditionally, it's guaranteed that error
> >   is non-zero
> > 
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> 
> These are two unrelated fixes, so please send them as two patches.
> 
> > ---
> >  drivers/base/power/main.c | 2 ++
> >  kernel/power/hibernate.c  | 3 ++-
> >  2 files changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > index 4497319..9717d5f 100644
> > --- a/drivers/base/power/main.c
> > +++ b/drivers/base/power/main.c
> > @@ -1266,6 +1266,8 @@ int dpm_suspend_late(pm_message_t state)
> >  	}
> >  	mutex_unlock(&dpm_list_mtx);
> >  	async_synchronize_full();
> > +	if (!error)
> > +		error = async_error;
> >  	if (error) {
> >  		suspend_stats.failed_suspend_late++;
> >  		dpm_save_failed_step(SUSPEND_SUSPEND_LATE);
> 
> So the above will be the first fix and the below will be the second one.

Ok.

> 
> > diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> > index a9dfa79..594ee7f 100644
> > --- a/kernel/power/hibernate.c
> > +++ b/kernel/power/hibernate.c
> > @@ -502,8 +502,9 @@ int hibernation_restore(int platform_mode)
> >  	error = dpm_suspend_start(PMSG_QUIESCE);
> >  	if (!error) {
> >  		error = resume_target_kernel(platform_mode);
> > -		dpm_resume_end(PMSG_RECOVER);
> > +		BUG_ON(!error);
> 
> Why BUG_ON()?  Is crashing the kernel necessary here?

I figured that this being an undefined state after restoration of a
memory image, it's better to crash than to continue and risk corrupting
some user data.

> 
> >  	}
> > +	dpm_resume_end(PMSG_RECOVER);
> >  	pm_restore_gfp_mask();
> >  	resume_console();
> >  	pm_restore_console();
> > 
> 



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] PM / Sleep: fix recovery during s2ram/hibernation
  2014-10-24 14:17     ` Imre Deak
@ 2014-10-24 14:54       ` Rafael J. Wysocki
  0 siblings, 0 replies; 8+ messages in thread
From: Rafael J. Wysocki @ 2014-10-24 14:54 UTC (permalink / raw)
  To: imre.deak; +Cc: Pavel Machek, linux-pm, linux-kernel

On Friday, October 24, 2014 05:17:12 PM Imre Deak wrote:
> On Fri, 2014-10-24 at 16:04 +0200, Rafael J. Wysocki wrote:
> > On Friday, October 24, 2014 10:59:09 AM Imre Deak wrote:
> > > Atm, if one of the dev_pm_ops::freeze callbacks fails during the QUIESCE
> > > phase we don't rollback things correctly calling the thaw and complete
> > > callbacks. This could leave some devices in a suspended state in case of
> > > an error during resuming from hibernation.
> > > 
> > > Also if an asynchronous suspend_late or freeze_late callback fails
> > > during the SUSPEND, FREEZE or QUIESCE phases we don't propagate the
> > > corresponding error correctly, in effect ignoring the error and
> > > continuing the suspend-to-ram/hibernation. During suspend-to-ram this
> > > could leave some devices without a valid saved context, leading to a
> > > failure to reinitialize them during resume. During hibernation this
> > > could leave some devices active interfeering with the creation /
> > > restoration of the hibernation image. Also this could leave the
> > > corresponding devices without a valid saved context and failure to
> > > reinitialize them during resume.
> > > 
> > > v2:
> > > - call dpm_resume_end() unconditionally, it's guaranteed that error
> > >   is non-zero
> > > 
> > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > 
> > These are two unrelated fixes, so please send them as two patches.
> > 
> > > ---
> > >  drivers/base/power/main.c | 2 ++
> > >  kernel/power/hibernate.c  | 3 ++-
> > >  2 files changed, 4 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > > index 4497319..9717d5f 100644
> > > --- a/drivers/base/power/main.c
> > > +++ b/drivers/base/power/main.c
> > > @@ -1266,6 +1266,8 @@ int dpm_suspend_late(pm_message_t state)
> > >  	}
> > >  	mutex_unlock(&dpm_list_mtx);
> > >  	async_synchronize_full();
> > > +	if (!error)
> > > +		error = async_error;
> > >  	if (error) {
> > >  		suspend_stats.failed_suspend_late++;
> > >  		dpm_save_failed_step(SUSPEND_SUSPEND_LATE);
> > 
> > So the above will be the first fix and the below will be the second one.
> 
> Ok.
> 
> > 
> > > diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> > > index a9dfa79..594ee7f 100644
> > > --- a/kernel/power/hibernate.c
> > > +++ b/kernel/power/hibernate.c
> > > @@ -502,8 +502,9 @@ int hibernation_restore(int platform_mode)
> > >  	error = dpm_suspend_start(PMSG_QUIESCE);
> > >  	if (!error) {
> > >  		error = resume_target_kernel(platform_mode);
> > > -		dpm_resume_end(PMSG_RECOVER);
> > > +		BUG_ON(!error);
> > 
> > Why BUG_ON()?  Is crashing the kernel necessary here?
> 
> I figured that this being an undefined state after restoration of a
> memory image, it's better to crash than to continue and risk corrupting
> some user data.

So can you please add a comment explaining that new BUG_ON() along with it?

> 
> > 
> > >  	}
> > > +	dpm_resume_end(PMSG_RECOVER);
> > >  	pm_restore_gfp_mask();
> > >  	resume_console();
> > >  	pm_restore_console();
> > > 
> > 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 1/2] PM / Sleep: fix async suspend_late/freeze_late error handling
  2014-10-24  7:59 ` [PATCH v2] " Imre Deak
  2014-10-24 14:04   ` Rafael J. Wysocki
@ 2014-10-24 17:29   ` Imre Deak
  2014-10-27 22:45     ` Rafael J. Wysocki
  2014-10-24 17:29   ` [PATCH v3 2/2] PM / Sleep: fix recovery during resuming from hibernation Imre Deak
  2 siblings, 1 reply; 8+ messages in thread
From: Imre Deak @ 2014-10-24 17:29 UTC (permalink / raw)
  To: Rafael J. Wysocki, Pavel Machek, linux-pm, linux-kernel

If an asynchronous suspend_late or freeze_late callback fails
during the SUSPEND, FREEZE or QUIESCE phases we don't propagate the
corresponding error correctly, in effect ignoring the error and
continuing the suspend-to-ram/hibernation. During suspend-to-ram this
could leave some devices without a valid saved context, leading to a
failure to reinitialize them during resume. During hibernation this
could leave some devices active interfeering with the creation /
restoration of the hibernation image. Also this could leave the
corresponding devices without a valid saved context and failure to
reinitialize them during resume.

v3 (added to patchset):
- split out this fix into a separate patch (Rafael)

Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/base/power/main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 4497319..9717d5f 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1266,6 +1266,8 @@ int dpm_suspend_late(pm_message_t state)
 	}
 	mutex_unlock(&dpm_list_mtx);
 	async_synchronize_full();
+	if (!error)
+		error = async_error;
 	if (error) {
 		suspend_stats.failed_suspend_late++;
 		dpm_save_failed_step(SUSPEND_SUSPEND_LATE);
-- 
1.8.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 2/2] PM / Sleep: fix recovery during resuming from hibernation
  2014-10-24  7:59 ` [PATCH v2] " Imre Deak
  2014-10-24 14:04   ` Rafael J. Wysocki
  2014-10-24 17:29   ` [PATCH v3 1/2] PM / Sleep: fix async suspend_late/freeze_late error handling Imre Deak
@ 2014-10-24 17:29   ` Imre Deak
  2 siblings, 0 replies; 8+ messages in thread
From: Imre Deak @ 2014-10-24 17:29 UTC (permalink / raw)
  To: Rafael J. Wysocki, Pavel Machek, linux-pm, linux-kernel

If a device's dev_pm_ops::freeze callback fails during the QUIESCE
phase we don't rollback things correctly calling the thaw and complete
callbacks. This could leave some devices in a suspended state in case of
an error during resuming from hibernation.

v2:
- call dpm_resume_end() unconditionally, it's guaranteed that error
  is non-zero
v3:
- split out this fix into a separate patch (Rafael)
- add code comment on why BUG_ON() is used (Rafael)

Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 kernel/power/hibernate.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index a9dfa79..1f35a34 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -502,8 +502,14 @@ int hibernation_restore(int platform_mode)
 	error = dpm_suspend_start(PMSG_QUIESCE);
 	if (!error) {
 		error = resume_target_kernel(platform_mode);
-		dpm_resume_end(PMSG_RECOVER);
+		/*
+		 * The above should either succeed and jump to the new kernel,
+		 * or return with an error. Otherwise things are just
+		 * undefined, so let's be paranoid.
+		 */
+		BUG_ON(!error);
 	}
+	dpm_resume_end(PMSG_RECOVER);
 	pm_restore_gfp_mask();
 	resume_console();
 	pm_restore_console();
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/2] PM / Sleep: fix async suspend_late/freeze_late error handling
  2014-10-24 17:29   ` [PATCH v3 1/2] PM / Sleep: fix async suspend_late/freeze_late error handling Imre Deak
@ 2014-10-27 22:45     ` Rafael J. Wysocki
  0 siblings, 0 replies; 8+ messages in thread
From: Rafael J. Wysocki @ 2014-10-27 22:45 UTC (permalink / raw)
  To: Imre Deak; +Cc: Pavel Machek, linux-pm, linux-kernel

On Friday, October 24, 2014 08:29:09 PM Imre Deak wrote:
> If an asynchronous suspend_late or freeze_late callback fails
> during the SUSPEND, FREEZE or QUIESCE phases we don't propagate the
> corresponding error correctly, in effect ignoring the error and
> continuing the suspend-to-ram/hibernation. During suspend-to-ram this
> could leave some devices without a valid saved context, leading to a
> failure to reinitialize them during resume. During hibernation this
> could leave some devices active interfeering with the creation /
> restoration of the hibernation image. Also this could leave the
> corresponding devices without a valid saved context and failure to
> reinitialize them during resume.
> 
> v3 (added to patchset):
> - split out this fix into a separate patch (Rafael)
> 
> Signed-off-by: Imre Deak <imre.deak@intel.com>

Both [1-2/2] applied, thanks!

> ---
>  drivers/base/power/main.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> index 4497319..9717d5f 100644
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -1266,6 +1266,8 @@ int dpm_suspend_late(pm_message_t state)
>  	}
>  	mutex_unlock(&dpm_list_mtx);
>  	async_synchronize_full();
> +	if (!error)
> +		error = async_error;
>  	if (error) {
>  		suspend_stats.failed_suspend_late++;
>  		dpm_save_failed_step(SUSPEND_SUSPEND_LATE);
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-10-27 22:24 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-24  7:43 [PATCH] PM / Sleep: fix recovery during s2ram/hibernation Imre Deak
2014-10-24  7:59 ` [PATCH v2] " Imre Deak
2014-10-24 14:04   ` Rafael J. Wysocki
2014-10-24 14:17     ` Imre Deak
2014-10-24 14:54       ` Rafael J. Wysocki
2014-10-24 17:29   ` [PATCH v3 1/2] PM / Sleep: fix async suspend_late/freeze_late error handling Imre Deak
2014-10-27 22:45     ` Rafael J. Wysocki
2014-10-24 17:29   ` [PATCH v3 2/2] PM / Sleep: fix recovery during resuming from hibernation Imre Deak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).