linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC][PATCH] PM: disable nonboot cpus before suspending devices
@ 2010-01-22 18:00 Sebastian Ott
  2010-01-22 20:48 ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Sebastian Ott @ 2010-01-22 18:00 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, linux-kernel

hi Rafael,

on s390 we have a reproduceable testcase where, after all devices were
suspended, a memory allocation results in disk IO. I know that this is
similar to the current discussion about magically changing the gpf mask,
but in our case the related allocation is triggered not by a device driver
but directly by hibernation_snapshot. The call chain looks like this:

STACK:
 0 schedule+1796 [0x5a7af0]
 1 io_schedule+98 [0x5a82ce]
 2 sync_page_killable+4 [0x1ec424]
 3 __wait_on_bit+204 [0x5a8bc4]
 4 add_to_page_cache_locked+2 [0x1ec766]
 5 shrink_page_list+2372 [0x1fc5b0]
 6 shrink_list+2496 [0x1fd02c]
 7 shrink_zone+932 [0x1fd3e0]
 8 try_to_free_pages+668 [0x1fe4bc]
 9 __alloc_pages_nodemask+1346 [0x1f5056]
10 __get_free_pages+76 [0x1f52dc]
11 __build_sched_domains+60 [0x144f98]
12 partition_sched_domains+696 [0x145dcc]
13 update_sched_domains+100 [0x146104]
14 notifier_call_chain+166 [0x5ae112]
15 raw_notifier_call_chain+44 [0x1800c4]
16 _cpu_down+586 [0x59f212]
17 disable_nonboot_cpus+354 [0x155ad2]
18 hibernation_snapshot+324 [0x1a7938]
19 hibernate+304 [0x1a7bcc]
20 state_store+130 [0x1a645e]
21 sysfs_write_file+264 [0x2b551c]
22 vfs_write+190 [0x23f98a]
23 sys_write+100 [0x23fb50]
24 sysc_noemu+16 [0x118ff6]

a possible fix would be to call disable_nonboot_cpus before suspending the
devices..
NOTE: this affects pm debug modes
---

Since disable_nonboot_cpus triggers memory allocations we should call
it before suspending the devices in hibernation_snapshot.

 Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
---
 kernel/power/hibernate.c |   17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -229,13 +229,9 @@ static int create_image(int platform_mod
 	}
 
 	error = platform_pre_snapshot(platform_mode);
-	if (error || hibernation_test(TEST_PLATFORM))
-		goto Platform_finish;
-
-	error = disable_nonboot_cpus();
-	if (error || hibernation_test(TEST_CPUS)
+	if (error || hibernation_test(TEST_PLATFORM)
 	    || hibernation_testmode(HIBERNATION_TEST))
-		goto Enable_cpus;
+		goto Platform_finish;
 
 	local_irq_disable();
 
@@ -269,9 +265,6 @@ static int create_image(int platform_mod
  Enable_irqs:
 	local_irq_enable();
 
- Enable_cpus:
-	enable_nonboot_cpus();
-
  Platform_finish:
 	platform_finish(platform_mode);
 
@@ -303,6 +296,10 @@ int hibernation_snapshot(int platform_mo
 	if (error)
 		goto Close;
 
+	error = disable_nonboot_cpus();
+	if (error || hibernation_test(TEST_CPUS))
+		goto Enable_cpus;
+
 	suspend_console();
 	error = dpm_suspend_start(PMSG_FREEZE);
 	if (error)
@@ -322,6 +319,8 @@ int hibernation_snapshot(int platform_mo
 	dpm_resume_end(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
 	resume_console();
+ Enable_cpus:
+	enable_nonboot_cpus();
  Close:
 	platform_end(platform_mode);
 	return error;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-01-22 18:00 [RFC][PATCH] PM: disable nonboot cpus before suspending devices Sebastian Ott
@ 2010-01-22 20:48 ` Rafael J. Wysocki
  2010-01-22 21:24   ` Rafael J. Wysocki
  2010-01-25 15:08   ` Sebastian Ott
  0 siblings, 2 replies; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-01-22 20:48 UTC (permalink / raw)
  To: Sebastian Ott; +Cc: linux-pm, linux-kernel, Benjamin Herrenschmidt

On Friday 22 January 2010, Sebastian Ott wrote:
> hi Rafael,
> 
> on s390 we have a reproduceable testcase where, after all devices were
> suspended, a memory allocation results in disk IO. I know that this is
> similar to the current discussion about magically changing the gpf mask,
> but in our case the related allocation is triggered not by a device driver
> but directly by hibernation_snapshot. The call chain looks like this:
> 
> STACK:
>  0 schedule+1796 [0x5a7af0]
>  1 io_schedule+98 [0x5a82ce]
>  2 sync_page_killable+4 [0x1ec424]
>  3 __wait_on_bit+204 [0x5a8bc4]
>  4 add_to_page_cache_locked+2 [0x1ec766]
>  5 shrink_page_list+2372 [0x1fc5b0]
>  6 shrink_list+2496 [0x1fd02c]
>  7 shrink_zone+932 [0x1fd3e0]
>  8 try_to_free_pages+668 [0x1fe4bc]
>  9 __alloc_pages_nodemask+1346 [0x1f5056]
> 10 __get_free_pages+76 [0x1f52dc]
> 11 __build_sched_domains+60 [0x144f98]
> 12 partition_sched_domains+696 [0x145dcc]
> 13 update_sched_domains+100 [0x146104]
> 14 notifier_call_chain+166 [0x5ae112]
> 15 raw_notifier_call_chain+44 [0x1800c4]
> 16 _cpu_down+586 [0x59f212]
> 17 disable_nonboot_cpus+354 [0x155ad2]
> 18 hibernation_snapshot+324 [0x1a7938]
> 19 hibernate+304 [0x1a7bcc]
> 20 state_store+130 [0x1a645e]
> 21 sysfs_write_file+264 [0x2b551c]
> 22 vfs_write+190 [0x23f98a]
> 23 sys_write+100 [0x23fb50]
> 24 sysc_noemu+16 [0x118ff6]
> 
> a possible fix would be to call disable_nonboot_cpus before suspending the
> devices..

This is going against the changes attempting to speed-up suspend and resume,
such as the asynchronous suspend/resume patchset, so I don't agree with it.

The real solution would be to remove the memory allocations from the
_cpu_down() call path.

BTW, this is one of the cases I and Ben are talking about where it's not
practical to rework the code just to avoid memory allocation problems during
suspend/resume.

Rafael

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-01-22 20:48 ` Rafael J. Wysocki
@ 2010-01-22 21:24   ` Rafael J. Wysocki
  2010-01-25 15:08   ` Sebastian Ott
  1 sibling, 0 replies; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-01-22 21:24 UTC (permalink / raw)
  To: Sebastian Ott; +Cc: linux-pm, linux-kernel, Benjamin Herrenschmidt

On Friday 22 January 2010, Rafael J. Wysocki wrote:
> On Friday 22 January 2010, Sebastian Ott wrote:
> > hi Rafael,
> > 
> > on s390 we have a reproduceable testcase where, after all devices were
> > suspended, a memory allocation results in disk IO. I know that this is
> > similar to the current discussion about magically changing the gpf mask,
> > but in our case the related allocation is triggered not by a device driver
> > but directly by hibernation_snapshot. The call chain looks like this:
> > 
> > STACK:
> >  0 schedule+1796 [0x5a7af0]
> >  1 io_schedule+98 [0x5a82ce]
> >  2 sync_page_killable+4 [0x1ec424]
> >  3 __wait_on_bit+204 [0x5a8bc4]
> >  4 add_to_page_cache_locked+2 [0x1ec766]
> >  5 shrink_page_list+2372 [0x1fc5b0]
> >  6 shrink_list+2496 [0x1fd02c]
> >  7 shrink_zone+932 [0x1fd3e0]
> >  8 try_to_free_pages+668 [0x1fe4bc]
> >  9 __alloc_pages_nodemask+1346 [0x1f5056]
> > 10 __get_free_pages+76 [0x1f52dc]
> > 11 __build_sched_domains+60 [0x144f98]
> > 12 partition_sched_domains+696 [0x145dcc]
> > 13 update_sched_domains+100 [0x146104]
> > 14 notifier_call_chain+166 [0x5ae112]
> > 15 raw_notifier_call_chain+44 [0x1800c4]
> > 16 _cpu_down+586 [0x59f212]
> > 17 disable_nonboot_cpus+354 [0x155ad2]
> > 18 hibernation_snapshot+324 [0x1a7938]
> > 19 hibernate+304 [0x1a7bcc]
> > 20 state_store+130 [0x1a645e]
> > 21 sysfs_write_file+264 [0x2b551c]
> > 22 vfs_write+190 [0x23f98a]
> > 23 sys_write+100 [0x23fb50]
> > 24 sysc_noemu+16 [0x118ff6]
> > 
> > a possible fix would be to call disable_nonboot_cpus before suspending the
> > devices..
> 
> This is going against the changes attempting to speed-up suspend and resume,
> such as the asynchronous suspend/resume patchset, so I don't agree with it.

In fact there's more to it.

enable_nonboot_cpus() has to be called before suspend_ops->wake() (and
analogously for hibernation), because of some ACPI-related ordering constraints
(calling them in the reverse orther leads to _serious_ problems during resume).

In turn, suspend_ops->wake() should be called before we resume devices, for
similar reasons.

So, your patch really would break things.

Rafael

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-01-22 20:48 ` Rafael J. Wysocki
  2010-01-22 21:24   ` Rafael J. Wysocki
@ 2010-01-25 15:08   ` Sebastian Ott
  2010-01-25 21:37     ` Rafael J. Wysocki
  1 sibling, 1 reply; 17+ messages in thread
From: Sebastian Ott @ 2010-01-25 15:08 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, linux-kernel, Benjamin Herrenschmidt

Hi.

On Fri, 22 Jan 2010, Rafael J. Wysocki wrote:

> On Friday 22 January 2010, Sebastian Ott wrote:
> > 
> > a possible fix would be to call disable_nonboot_cpus before suspending the
> > devices..
> 
> This is going against the changes attempting to speed-up suspend and resume,
> such as the asynchronous suspend/resume patchset, so I don't agree with it.

Isn't the main benefit for this scenario that while a driver starts io and
waits for interrupts, the callback for the next device can be called? And
this can be done with one cpu as well.

> 
> The real solution would be to remove the memory allocations from the
> _cpu_down() call path.

So you have to also ban allocations from all registered notifiers at the
cpu_chain. And since enable_nonboot_cpus is called before the devices are
woken up, the same would be true for _cpu_up() which may not be done
easily.

> 
> BTW, this is one of the cases I and Ben are talking about where it's not
> practical to rework the code just to avoid memory allocation problems during
> suspend/resume.

Ok. All i'm saying is that in hibernation_snapshot/create_image memory
allocations are directely triggered after all devices were put to sleep /
before woken up - and this looks like a bug.

For the driver case - what about using your patch to not modify the gfp
mask but print a warning instead so that these drivers can be identified
and fixed.

Regards,
Sebastian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-01-25 15:08   ` Sebastian Ott
@ 2010-01-25 21:37     ` Rafael J. Wysocki
  2010-02-01 14:41       ` Sebastian Ott
  0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-01-25 21:37 UTC (permalink / raw)
  To: Sebastian Ott; +Cc: linux-pm, linux-kernel, Benjamin Herrenschmidt

On Monday 25 January 2010, Sebastian Ott wrote:
> Hi.
> 
> On Fri, 22 Jan 2010, Rafael J. Wysocki wrote:
> 
> > On Friday 22 January 2010, Sebastian Ott wrote:
> > > 
> > > a possible fix would be to call disable_nonboot_cpus before suspending the
> > > devices..
> > 
> > This is going against the changes attempting to speed-up suspend and resume,
> > such as the asynchronous suspend/resume patchset, so I don't agree with it.
> 
> Isn't the main benefit for this scenario that while a driver starts io and
> waits for interrupts, the callback for the next device can be called? And
> this can be done with one cpu as well.

That's the basic idea, but the additional CPUs help quite a bit.

> > The real solution would be to remove the memory allocations from the
> > _cpu_down() call path.
> 
> So you have to also ban allocations from all registered notifiers at the
> cpu_chain. And since enable_nonboot_cpus is called before the devices are
> woken up, the same would be true for _cpu_up() which may not be done
> easily.

That's correct.  BTW, that's what the CPU_TASKS_FROZEN bit is for among other
things (perhaps it may be used to fix this particular issue).

> > BTW, this is one of the cases I and Ben are talking about where it's not
> > practical to rework the code just to avoid memory allocation problems during
> > suspend/resume.
> 
> Ok. All i'm saying is that in hibernation_snapshot/create_image memory
> allocations are directely triggered after all devices were put to sleep /
> before woken up - and this looks like a bug.

I agree, but that's because people don't remember that CPU hotplug is also
used for suspend/hibernation.  I don't know at the moment how much effort
it would take to fix all of these problems appropriately, but I _guess_ that
would be quite some work.  That, among other things, is why I sent the patch
to modify gfp_allowed_mask before suspending devices.

> For the driver case - what about using your patch to not modify the gfp
> mask but print a warning instead so that these drivers can be identified
> and fixed.

We'd get a lot of warnings and there are cases where we know they would
trigger (eg. ACPI internals).  So, I'd rather like to reduce the users' pain
(by changing gfp_allowed_mask) than add to it (by adding a warning that's
guaranteed to show up).

Rafael

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-01-25 21:37     ` Rafael J. Wysocki
@ 2010-02-01 14:41       ` Sebastian Ott
  2010-02-01 15:30         ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Sebastian Ott @ 2010-02-01 14:41 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, linux-kernel, Benjamin Herrenschmidt

Hi Rafael,

since you didn't like the idea of calling the driver callbacks with just
one cpu enabled, we gave your patch: "MM / PM: Force GFP_NOIO during
suspend/hibernation and resume" a try and i can confirm that this
fixes the issue on s390. Will this go in 2.6.33/stable?

greetings,
Sebastian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-02-01 14:41       ` Sebastian Ott
@ 2010-02-01 15:30         ` Rafael J. Wysocki
  2010-02-01 15:43           ` Maxim Levitsky
  2010-02-01 15:57           ` Andrew Morton
  0 siblings, 2 replies; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-02-01 15:30 UTC (permalink / raw)
  To: Sebastian Ott, Andrew Morton
  Cc: linux-pm, linux-kernel, Benjamin Herrenschmidt, KOSAKI Motohiro

On Monday 01 February 2010, Sebastian Ott wrote:
> Hi Rafael,
> 
> since you didn't like the idea of calling the driver callbacks with just
> one cpu enabled, we gave your patch: "MM / PM: Force GFP_NOIO during
> suspend/hibernation and resume" a try and i can confirm that this
> fixes the issue on s390.

Great, thanks for testing!

> Will this go in 2.6.33/stable?

That depends on Andrew, actually.

Andrew, what do you think of the patch at:
http://patchwork.kernel.org/patch/74740/mbox/ ?

It helps people and I don't see any major drawbacks of it.

Rafael

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-02-01 15:30         ` Rafael J. Wysocki
@ 2010-02-01 15:43           ` Maxim Levitsky
  2010-02-01 15:57           ` Andrew Morton
  1 sibling, 0 replies; 17+ messages in thread
From: Maxim Levitsky @ 2010-02-01 15:43 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Sebastian Ott, Andrew Morton, linux-pm, linux-kernel,
	Benjamin Herrenschmidt, KOSAKI Motohiro

On Mon, 2010-02-01 at 16:30 +0100, Rafael J. Wysocki wrote: 
> On Monday 01 February 2010, Sebastian Ott wrote:
> > Hi Rafael,
> > 
> > since you didn't like the idea of calling the driver callbacks with just
> > one cpu enabled, we gave your patch: "MM / PM: Force GFP_NOIO during
> > suspend/hibernation and resume" a try and i can confirm that this
> > fixes the issue on s390.
> 
> Great, thanks for testing!
> 
> > Will this go in 2.6.33/stable?
> 
> That depends on Andrew, actually.
> 
> Andrew, what do you think of the patch at:
> http://patchwork.kernel.org/patch/74740/mbox/ ?
> 
> It helps people and I don't see any major drawbacks of it.

And it works too.
I will today continue the torture testing of hibernate mode, but so far
no hands.

Best regards,
Maxim Levitsky


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-02-01 15:30         ` Rafael J. Wysocki
  2010-02-01 15:43           ` Maxim Levitsky
@ 2010-02-01 15:57           ` Andrew Morton
  2010-02-03  1:44             ` Rafael J. Wysocki
  1 sibling, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-02-01 15:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
	KOSAKI Motohiro

On Mon, 1 Feb 2010 16:30:04 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Monday 01 February 2010, Sebastian Ott wrote:
> > Hi Rafael,
> > 
> > since you didn't like the idea of calling the driver callbacks with just
> > one cpu enabled, we gave your patch: "MM / PM: Force GFP_NOIO during
> > suspend/hibernation and resume" a try and i can confirm that this
> > fixes the issue on s390.
> 
> Great, thanks for testing!
> 
> > Will this go in 2.6.33/stable?
> 
> That depends on Andrew, actually.
> 
> Andrew, what do you think of the patch at:
> http://patchwork.kernel.org/patch/74740/mbox/ ?
> 
> It helps people and I don't see any major drawbacks of it.
> 

Seems sane.  A couple of minor things:

- the names mm_force_noio_allocations() and mm_allow_io_allocations()
  are a bit sucky.  Asymmetrical.

- the functions don't nest: if someone calls
  mm_force_noio_allocations() twice in succession then the kernel is
  all mucked up.  Why not:

gfp_t mm_set_gfp_mask(gfp_t mask)
{
	gfp_t ret = gfp_allowed_mask;

	gfp_allowed_mask = mask;
	return ret;
}

which is of course racy :)  Could add a local spinlock if really worried.

All your current callers can easily save the old value in a local.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-02-01 15:57           ` Andrew Morton
@ 2010-02-03  1:44             ` Rafael J. Wysocki
  2010-02-03  1:48               ` Andrew Morton
  0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-02-03  1:44 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
	KOSAKI Motohiro

On Monday 01 February 2010, Andrew Morton wrote:
> On Mon, 1 Feb 2010 16:30:04 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Monday 01 February 2010, Sebastian Ott wrote:
> > > Hi Rafael,
> > > 
> > > since you didn't like the idea of calling the driver callbacks with just
> > > one cpu enabled, we gave your patch: "MM / PM: Force GFP_NOIO during
> > > suspend/hibernation and resume" a try and i can confirm that this
> > > fixes the issue on s390.
> > 
> > Great, thanks for testing!
> > 
> > > Will this go in 2.6.33/stable?
> > 
> > That depends on Andrew, actually.
> > 
> > Andrew, what do you think of the patch at:
> > http://patchwork.kernel.org/patch/74740/mbox/ ?
> > 
> > It helps people and I don't see any major drawbacks of it.
> > 
> 
> Seems sane.  A couple of minor things:
> 
> - the names mm_force_noio_allocations() and mm_allow_io_allocations()
>   are a bit sucky.  Asymmetrical.

Yeah.  The lack of imagination.  Sigh.

> - the functions don't nest: if someone calls
>   mm_force_noio_allocations() twice in succession then the kernel is
>   all mucked up.  Why not:
> 
> gfp_t mm_set_gfp_mask(gfp_t mask)
> {
> 	gfp_t ret = gfp_allowed_mask;
> 
> 	gfp_allowed_mask = mask;
> 	return ret;
> }
> 
> which is of course racy :)  Could add a local spinlock if really worried.

I'm not sure how that helps.  I'd need to read gfp_allowed_mask to obtain the
new value anyway.

> All your current callers can easily save the old value in a local.

Indeed.

Well, does the appended one look better?

Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: MM / PM: Force GFP_NOIO during suspend/hibernation and resume (rev. 2)

There are quite a few GFP_KERNEL memory allocations made during
suspend/hibernation and resume that may cause the system to hang,
because the I/O operations they depend on cannot be completed due to
the underlying devices being suspended.

Avoid this problem by clearing the __GFP_IO and __GFP_FS bits in
gfp_allowed_mask before suspend/hibernation and restoring the
original values of these bits in gfp_allowed_mask durig the
subsequent resume.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Maxim Levitsky <maximlevitsky@gmail.com>
---
 include/linux/gfp.h      |    8 ++++++++
 kernel/power/hibernate.c |    9 +++++++++
 kernel/power/suspend.c   |    3 +++
 3 files changed, 20 insertions(+)

Index: linux-2.6/kernel/power/hibernate.c
===================================================================
--- linux-2.6.orig/kernel/power/hibernate.c
+++ linux-2.6/kernel/power/hibernate.c
@@ -323,6 +323,7 @@ static int create_image(int platform_mod
 int hibernation_snapshot(int platform_mode)
 {
 	int error;
+	gfp_t saved_mask;
 
 	error = platform_begin(platform_mode);
 	if (error)
@@ -334,6 +335,7 @@ int hibernation_snapshot(int platform_mo
 		goto Close;
 
 	suspend_console();
+	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	error = dpm_suspend_start(PMSG_FREEZE);
 	if (error)
 		goto Recover_platform;
@@ -351,6 +353,7 @@ int hibernation_snapshot(int platform_mo
 
 	dpm_resume_end(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
+	set_gfp_allowed_mask(saved_mask);
 	resume_console();
  Close:
 	platform_end(platform_mode);
@@ -445,14 +448,17 @@ static int resume_target_kernel(bool pla
 int hibernation_restore(int platform_mode)
 {
 	int error;
+	gfp_t saved_mask;
 
 	pm_prepare_console();
 	suspend_console();
+	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	error = dpm_suspend_start(PMSG_QUIESCE);
 	if (!error) {
 		error = resume_target_kernel(platform_mode);
 		dpm_resume_end(PMSG_RECOVER);
 	}
+	set_gfp_allowed_mask(saved_mask);
 	resume_console();
 	pm_restore_console();
 	return error;
@@ -466,6 +472,7 @@ int hibernation_restore(int platform_mod
 int hibernation_platform_enter(void)
 {
 	int error;
+	gfp_t saved_mask;
 
 	if (!hibernation_ops)
 		return -ENOSYS;
@@ -481,6 +488,7 @@ int hibernation_platform_enter(void)
 
 	entering_platform_hibernation = true;
 	suspend_console();
+	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	error = dpm_suspend_start(PMSG_HIBERNATE);
 	if (error) {
 		if (hibernation_ops->recover)
@@ -518,6 +526,7 @@ int hibernation_platform_enter(void)
  Resume_devices:
 	entering_platform_hibernation = false;
 	dpm_resume_end(PMSG_RESTORE);
+	set_gfp_allowed_mask(saved_mask);
 	resume_console();
 
  Close:
Index: linux-2.6/kernel/power/suspend.c
===================================================================
--- linux-2.6.orig/kernel/power/suspend.c
+++ linux-2.6/kernel/power/suspend.c
@@ -198,6 +198,7 @@ static int suspend_enter(suspend_state_t
 int suspend_devices_and_enter(suspend_state_t state)
 {
 	int error;
+	gfp_t saved_mask;
 
 	if (!suspend_ops)
 		return -ENOSYS;
@@ -208,6 +209,7 @@ int suspend_devices_and_enter(suspend_st
 			goto Close;
 	}
 	suspend_console();
+	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	suspend_test_start();
 	error = dpm_suspend_start(PMSG_SUSPEND);
 	if (error) {
@@ -224,6 +226,7 @@ int suspend_devices_and_enter(suspend_st
 	suspend_test_start();
 	dpm_resume_end(PMSG_RESUME);
 	suspend_test_finish("resume devices");
+	set_gfp_allowed_mask(saved_mask);
 	resume_console();
  Close:
 	if (suspend_ops->end)
Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -83,6 +83,7 @@ struct vm_area_struct;
 #define GFP_HIGHUSER_MOVABLE	(__GFP_WAIT | __GFP_IO | __GFP_FS | \
 				 __GFP_HARDWALL | __GFP_HIGHMEM | \
 				 __GFP_MOVABLE)
+#define GFP_IOFS	(__GFP_IO | __GFP_FS)
 
 #ifdef CONFIG_NUMA
 #define GFP_THISNODE	(__GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY)
@@ -342,4 +343,11 @@ static inline void set_gfp_allowed_mask(
 	gfp_allowed_mask = mask;
 }
 
+static inline gfp_t clear_gfp_allowed_mask(gfp_t mask)
+{
+	gfp_t ret = gfp_allowed_mask;
+	gfp_allowed_mask &= ~mask;
+	return ret;
+}
+
 #endif /* __LINUX_GFP_H */

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-02-03  1:44             ` Rafael J. Wysocki
@ 2010-02-03  1:48               ` Andrew Morton
  2010-02-03 22:34                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-02-03  1:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
	KOSAKI Motohiro

On Wed, 3 Feb 2010 02:44:23 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> +static inline gfp_t clear_gfp_allowed_mask(gfp_t mask)
> +{
> +	gfp_t ret = gfp_allowed_mask;
> +	gfp_allowed_mask &= ~mask;
> +	return ret;
> +}

Fair enuf.

Of course, this is all horridly racy/buggy without locking.  Would I be
correct in hoping that all the callers happen when the system is in
everyone-is-frozen mode?

Perhaps we should add some documentation (or even an assertion) to
prevent someone from using these interfaces from within normal code.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-02-03  1:48               ` Andrew Morton
@ 2010-02-03 22:34                 ` Rafael J. Wysocki
  2010-02-03 23:08                   ` Andrew Morton
  0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-02-03 22:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
	KOSAKI Motohiro

On Wednesday 03 February 2010, Andrew Morton wrote:
> On Wed, 3 Feb 2010 02:44:23 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > +static inline gfp_t clear_gfp_allowed_mask(gfp_t mask)
> > +{
> > +	gfp_t ret = gfp_allowed_mask;
> > +	gfp_allowed_mask &= ~mask;
> > +	return ret;
> > +}
> 
> Fair enuf.
> 
> Of course, this is all horridly racy/buggy without locking.  Would I be
> correct in hoping that all the callers happen when the system is in
> everyone-is-frozen mode?

As far as I can tell, gfp_allowed_mask is only touched during init apart from
this.

> Perhaps we should add some documentation (or even an assertion) to
> prevent someone from using these interfaces from within normal code.

I thought about that, but didn't invent anything smart enough.

Well, maybe except for a comment like "this must be called with pm_mutex held",
because that's the only case when it would be really safe.

Rafael

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-02-03 22:34                 ` Rafael J. Wysocki
@ 2010-02-03 23:08                   ` Andrew Morton
  2010-02-04  0:50                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-02-03 23:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
	KOSAKI Motohiro

On Wed, 3 Feb 2010 23:34:37 +0100
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Wednesday 03 February 2010, Andrew Morton wrote:
> > On Wed, 3 Feb 2010 02:44:23 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > 
> > > +static inline gfp_t clear_gfp_allowed_mask(gfp_t mask)
> > > +{
> > > +	gfp_t ret = gfp_allowed_mask;
> > > +	gfp_allowed_mask &= ~mask;
> > > +	return ret;
> > > +}
> > 
> > Fair enuf.
> > 
> > Of course, this is all horridly racy/buggy without locking.  Would I be
> > correct in hoping that all the callers happen when the system is in
> > everyone-is-frozen mode?
> 
> As far as I can tell, gfp_allowed_mask is only touched during init apart from
> this.

Well yes - the new interfaces are the problem - they're racy!

> > Perhaps we should add some documentation (or even an assertion) to
> > prevent someone from using these interfaces from within normal code.
> 
> I thought about that, but didn't invent anything smart enough.
> 
> Well, maybe except for a comment like "this must be called with pm_mutex held",
> because that's the only case when it would be really safe.

Is that the locking rule?  My above guess was incorrect?

Maybe slip a

	BUG_ON(!mutex_is_locked(&pm_mutex));

in there?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-02-03 23:08                   ` Andrew Morton
@ 2010-02-04  0:50                     ` Rafael J. Wysocki
  2010-02-04  1:21                       ` Andrew Morton
  0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-02-04  0:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
	KOSAKI Motohiro

On Thursday 04 February 2010, Andrew Morton wrote:
> On Wed, 3 Feb 2010 23:34:37 +0100
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Wednesday 03 February 2010, Andrew Morton wrote:
> > > On Wed, 3 Feb 2010 02:44:23 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > 
> > > > +static inline gfp_t clear_gfp_allowed_mask(gfp_t mask)
> > > > +{
> > > > +	gfp_t ret = gfp_allowed_mask;
> > > > +	gfp_allowed_mask &= ~mask;
> > > > +	return ret;
> > > > +}
> > > 
> > > Fair enuf.
> > > 
> > > Of course, this is all horridly racy/buggy without locking.  Would I be
> > > correct in hoping that all the callers happen when the system is in
> > > everyone-is-frozen mode?
> > 
> > As far as I can tell, gfp_allowed_mask is only touched during init apart from
> > this.
> 
> Well yes - the new interfaces are the problem - they're racy!
> 
> > > Perhaps we should add some documentation (or even an assertion) to
> > > prevent someone from using these interfaces from within normal code.
> > 
> > I thought about that, but didn't invent anything smart enough.
> > 
> > Well, maybe except for a comment like "this must be called with pm_mutex held",
> > because that's the only case when it would be really safe.
> 
> Is that the locking rule?  My above guess was incorrect?
> 
> Maybe slip a
> 
> 	BUG_ON(!mutex_is_locked(&pm_mutex));
> 
> in there?

That actually is a good idea, although I prefer WARN_ON (BUG_ON would be
rather blunt, so to speak, as it could kill setups not using suspend/hibernate
at all).

Updated patch follows.

Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: MM / PM: Force GFP_NOIO during suspend/hibernation and resume (rev. 3)

There are quite a few GFP_KERNEL memory allocations made during
suspend/hibernation and resume that may cause the system to hang,
because the I/O operations they depend on cannot be completed due to
the underlying devices being suspended.

Mitigate this problem by clearing the __GFP_IO and __GFP_FS bits in
gfp_allowed_mask before suspend/hibernation and restoring the
original values of these bits in gfp_allowed_mask durig the
subsequent resume.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Maxim Levitsky <maximlevitsky@gmail.com>
---
 include/linux/gfp.h      |    7 +++----
 init/main.c              |    2 +-
 kernel/power/hibernate.c |    9 +++++++++
 kernel/power/suspend.c   |    3 +++
 mm/page_alloc.c          |   24 ++++++++++++++++++++++++
 5 files changed, 40 insertions(+), 5 deletions(-)

Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -83,6 +83,7 @@ struct vm_area_struct;
 #define GFP_HIGHUSER_MOVABLE	(__GFP_WAIT | __GFP_IO | __GFP_FS | \
 				 __GFP_HARDWALL | __GFP_HIGHMEM | \
 				 __GFP_MOVABLE)
+#define GFP_IOFS	(__GFP_IO | __GFP_FS)
 
 #ifdef CONFIG_NUMA
 #define GFP_THISNODE	(__GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY)
@@ -337,9 +338,7 @@ void drain_local_pages(void *dummy);
 
 extern gfp_t gfp_allowed_mask;
 
-static inline void set_gfp_allowed_mask(gfp_t mask)
-{
-	gfp_allowed_mask = mask;
-}
+extern void set_gfp_allowed_mask(gfp_t mask);
+extern gfp_t clear_gfp_allowed_mask(gfp_t mask);
 
 #endif /* __LINUX_GFP_H */
Index: linux-2.6/init/main.c
===================================================================
--- linux-2.6.orig/init/main.c
+++ linux-2.6/init/main.c
@@ -601,7 +601,7 @@ asmlinkage void __init start_kernel(void
 	local_irq_enable();
 
 	/* Interrupts are enabled now so all GFP allocations are safe. */
-	set_gfp_allowed_mask(__GFP_BITS_MASK);
+	gfp_allowed_mask = __GFP_BITS_MASK;
 
 	kmem_cache_init_late();
 
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -76,6 +76,30 @@ unsigned long totalreserve_pages __read_
 int percpu_pagelist_fraction;
 gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
 
+/*
+ * The following functions are used by the suspend/hibernate code to temporarily
+ * change gfp_allowed_mask in order to avoid using I/O during memory allocations
+ * while devices are suspended.  To avoid races with the suspend/hibernate code,
+ * they should always be called with pm_mutex held (gfp_allowed_mask also should
+ * only be modified with pm_mutex held, unless the suspend/hibernate code is
+ * guaranteed not to run in parallel with that modification).
+ */
+
+void set_gfp_allowed_mask(gfp_t mask)
+{
+	WARN_ON(!mutex_is_locked(&pm_mutex));
+	gfp_allowed_mask = mask;
+}
+
+gfp_t clear_gfp_allowed_mask(gfp_t mask)
+{
+	gfp_t ret = gfp_allowed_mask;
+
+	WARN_ON(!mutex_is_locked(&pm_mutex));
+	gfp_allowed_mask &= ~mask;
+	return ret;
+}
+
 #ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE
 int pageblock_order __read_mostly;
 #endif
Index: linux-2.6/kernel/power/hibernate.c
===================================================================
--- linux-2.6.orig/kernel/power/hibernate.c
+++ linux-2.6/kernel/power/hibernate.c
@@ -323,6 +323,7 @@ static int create_image(int platform_mod
 int hibernation_snapshot(int platform_mode)
 {
 	int error;
+	gfp_t saved_mask;
 
 	error = platform_begin(platform_mode);
 	if (error)
@@ -334,6 +335,7 @@ int hibernation_snapshot(int platform_mo
 		goto Close;
 
 	suspend_console();
+	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	error = dpm_suspend_start(PMSG_FREEZE);
 	if (error)
 		goto Recover_platform;
@@ -351,6 +353,7 @@ int hibernation_snapshot(int platform_mo
 
 	dpm_resume_end(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
+	set_gfp_allowed_mask(saved_mask);
 	resume_console();
  Close:
 	platform_end(platform_mode);
@@ -445,14 +448,17 @@ static int resume_target_kernel(bool pla
 int hibernation_restore(int platform_mode)
 {
 	int error;
+	gfp_t saved_mask;
 
 	pm_prepare_console();
 	suspend_console();
+	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	error = dpm_suspend_start(PMSG_QUIESCE);
 	if (!error) {
 		error = resume_target_kernel(platform_mode);
 		dpm_resume_end(PMSG_RECOVER);
 	}
+	set_gfp_allowed_mask(saved_mask);
 	resume_console();
 	pm_restore_console();
 	return error;
@@ -466,6 +472,7 @@ int hibernation_restore(int platform_mod
 int hibernation_platform_enter(void)
 {
 	int error;
+	gfp_t saved_mask;
 
 	if (!hibernation_ops)
 		return -ENOSYS;
@@ -481,6 +488,7 @@ int hibernation_platform_enter(void)
 
 	entering_platform_hibernation = true;
 	suspend_console();
+	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	error = dpm_suspend_start(PMSG_HIBERNATE);
 	if (error) {
 		if (hibernation_ops->recover)
@@ -518,6 +526,7 @@ int hibernation_platform_enter(void)
  Resume_devices:
 	entering_platform_hibernation = false;
 	dpm_resume_end(PMSG_RESTORE);
+	set_gfp_allowed_mask(saved_mask);
 	resume_console();
 
  Close:
Index: linux-2.6/kernel/power/suspend.c
===================================================================
--- linux-2.6.orig/kernel/power/suspend.c
+++ linux-2.6/kernel/power/suspend.c
@@ -198,6 +198,7 @@ static int suspend_enter(suspend_state_t
 int suspend_devices_and_enter(suspend_state_t state)
 {
 	int error;
+	gfp_t saved_mask;
 
 	if (!suspend_ops)
 		return -ENOSYS;
@@ -208,6 +209,7 @@ int suspend_devices_and_enter(suspend_st
 			goto Close;
 	}
 	suspend_console();
+	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	suspend_test_start();
 	error = dpm_suspend_start(PMSG_SUSPEND);
 	if (error) {
@@ -224,6 +226,7 @@ int suspend_devices_and_enter(suspend_st
 	suspend_test_start();
 	dpm_resume_end(PMSG_RESUME);
 	suspend_test_finish("resume devices");
+	set_gfp_allowed_mask(saved_mask);
 	resume_console();
  Close:
 	if (suspend_ops->end)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-02-04  0:50                     ` Rafael J. Wysocki
@ 2010-02-04  1:21                       ` Andrew Morton
  2010-02-04  1:41                         ` Andrew Morton
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-02-04  1:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
	KOSAKI Motohiro

On Thu, 4 Feb 2010 01:50:26 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> --- linux-2.6.orig/mm/page_alloc.c
> +++ linux-2.6/mm/page_alloc.c
> @@ -76,6 +76,30 @@ unsigned long totalreserve_pages __read_
>  int percpu_pagelist_fraction;
>  gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
>  
> +/*
> + * The following functions are used by the suspend/hibernate code to temporarily
> + * change gfp_allowed_mask in order to avoid using I/O during memory allocations
> + * while devices are suspended.  To avoid races with the suspend/hibernate code,
> + * they should always be called with pm_mutex held (gfp_allowed_mask also should
> + * only be modified with pm_mutex held, unless the suspend/hibernate code is
> + * guaranteed not to run in parallel with that modification).
> + */
> +
> +void set_gfp_allowed_mask(gfp_t mask)
> +{
> +	WARN_ON(!mutex_is_locked(&pm_mutex));
> +	gfp_allowed_mask = mask;
> +}
> +
> +gfp_t clear_gfp_allowed_mask(gfp_t mask)
> +{
> +	gfp_t ret = gfp_allowed_mask;
> +
> +	WARN_ON(!mutex_is_locked(&pm_mutex));
> +	gfp_allowed_mask &= ~mask;
> +	return ret;
> +}

Maybe put an ifdef CONFIG_foo around these so they don't get included
when they're unneeded?


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-02-04  1:21                       ` Andrew Morton
@ 2010-02-04  1:41                         ` Andrew Morton
  2010-02-04 19:33                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-02-04  1:41 UTC (permalink / raw)
  To: Rafael J. Wysocki, Sebastian Ott, linux-pm, linux-kernel,
	Benjamin Herrenschmidt, KOSAKI Motohiro

On Wed, 3 Feb 2010 17:21:47 -0800 Andrew Morton <akpm@linux-foundation.org> wrote:

> On Thu, 4 Feb 2010 01:50:26 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > --- linux-2.6.orig/mm/page_alloc.c
> > +++ linux-2.6/mm/page_alloc.c
> > @@ -76,6 +76,30 @@ unsigned long totalreserve_pages __read_
> >  int percpu_pagelist_fraction;
> >  gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
> >  
> > +/*
> > + * The following functions are used by the suspend/hibernate code to temporarily
> > + * change gfp_allowed_mask in order to avoid using I/O during memory allocations
> > + * while devices are suspended.  To avoid races with the suspend/hibernate code,
> > + * they should always be called with pm_mutex held (gfp_allowed_mask also should
> > + * only be modified with pm_mutex held, unless the suspend/hibernate code is
> > + * guaranteed not to run in parallel with that modification).
> > + */
> > +
> > +void set_gfp_allowed_mask(gfp_t mask)
> > +{
> > +	WARN_ON(!mutex_is_locked(&pm_mutex));
> > +	gfp_allowed_mask = mask;
> > +}
> > +
> > +gfp_t clear_gfp_allowed_mask(gfp_t mask)
> > +{
> > +	gfp_t ret = gfp_allowed_mask;
> > +
> > +	WARN_ON(!mutex_is_locked(&pm_mutex));
> > +	gfp_allowed_mask &= ~mask;
> > +	return ret;
> > +}
> 
> Maybe put an ifdef CONFIG_foo around these so they don't get included
> when they're unneeded?
> 

I guess this:

mm/built-in.o: In function `clear_gfp_allowed_mask':
: undefined reference to `pm_mutex'
mm/built-in.o: In function `set_gfp_allowed_mask':
: undefined reference to `pm_mutex'

kinda answers my question.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
  2010-02-04  1:41                         ` Andrew Morton
@ 2010-02-04 19:33                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-02-04 19:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
	KOSAKI Motohiro

On Thursday 04 February 2010, Andrew Morton wrote:
> On Wed, 3 Feb 2010 17:21:47 -0800 Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > On Thu, 4 Feb 2010 01:50:26 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > 
> > > --- linux-2.6.orig/mm/page_alloc.c
> > > +++ linux-2.6/mm/page_alloc.c
> > > @@ -76,6 +76,30 @@ unsigned long totalreserve_pages __read_
> > >  int percpu_pagelist_fraction;
> > >  gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
> > >  
> > > +/*
> > > + * The following functions are used by the suspend/hibernate code to temporarily
> > > + * change gfp_allowed_mask in order to avoid using I/O during memory allocations
> > > + * while devices are suspended.  To avoid races with the suspend/hibernate code,
> > > + * they should always be called with pm_mutex held (gfp_allowed_mask also should
> > > + * only be modified with pm_mutex held, unless the suspend/hibernate code is
> > > + * guaranteed not to run in parallel with that modification).
> > > + */
> > > +
> > > +void set_gfp_allowed_mask(gfp_t mask)
> > > +{
> > > +	WARN_ON(!mutex_is_locked(&pm_mutex));
> > > +	gfp_allowed_mask = mask;
> > > +}
> > > +
> > > +gfp_t clear_gfp_allowed_mask(gfp_t mask)
> > > +{
> > > +	gfp_t ret = gfp_allowed_mask;
> > > +
> > > +	WARN_ON(!mutex_is_locked(&pm_mutex));
> > > +	gfp_allowed_mask &= ~mask;
> > > +	return ret;
> > > +}
> > 
> > Maybe put an ifdef CONFIG_foo around these so they don't get included
> > when they're unneeded?
> > 
> 
> I guess this:
> 
> mm/built-in.o: In function `clear_gfp_allowed_mask':
> : undefined reference to `pm_mutex'
> mm/built-in.o: In function `set_gfp_allowed_mask':
> : undefined reference to `pm_mutex'
> 
> kinda answers my question.

Sorry about that.  I tend to forget about test-compiling with CONFIG_PM unset.

The one below shouldn't have this issue.

Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: MM / PM: Force GFP_NOIO during suspend/hibernation and resume (rev. 3)

There are quite a few GFP_KERNEL memory allocations made during
suspend/hibernation and resume that may cause the system to hang,
because the I/O operations they depend on cannot be completed due to
the underlying devices being suspended.

Mitigate this problem by clearing the __GFP_IO and __GFP_FS bits in
gfp_allowed_mask before suspend/hibernation and restoring the
original values of these bits in gfp_allowed_mask durig the
subsequent resume.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Maxim Levitsky <maximlevitsky@gmail.com>
---
 include/linux/gfp.h      |    7 +++----
 init/main.c              |    2 +-
 kernel/power/hibernate.c |    9 +++++++++
 kernel/power/suspend.c   |    3 +++
 mm/page_alloc.c          |   26 ++++++++++++++++++++++++++
 5 files changed, 42 insertions(+), 5 deletions(-)

Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -83,6 +83,7 @@ struct vm_area_struct;
 #define GFP_HIGHUSER_MOVABLE	(__GFP_WAIT | __GFP_IO | __GFP_FS | \
 				 __GFP_HARDWALL | __GFP_HIGHMEM | \
 				 __GFP_MOVABLE)
+#define GFP_IOFS	(__GFP_IO | __GFP_FS)
 
 #ifdef CONFIG_NUMA
 #define GFP_THISNODE	(__GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY)
@@ -337,9 +338,7 @@ void drain_local_pages(void *dummy);
 
 extern gfp_t gfp_allowed_mask;
 
-static inline void set_gfp_allowed_mask(gfp_t mask)
-{
-	gfp_allowed_mask = mask;
-}
+extern void set_gfp_allowed_mask(gfp_t mask);
+extern gfp_t clear_gfp_allowed_mask(gfp_t mask);
 
 #endif /* __LINUX_GFP_H */
Index: linux-2.6/init/main.c
===================================================================
--- linux-2.6.orig/init/main.c
+++ linux-2.6/init/main.c
@@ -601,7 +601,7 @@ asmlinkage void __init start_kernel(void
 	local_irq_enable();
 
 	/* Interrupts are enabled now so all GFP allocations are safe. */
-	set_gfp_allowed_mask(__GFP_BITS_MASK);
+	gfp_allowed_mask = __GFP_BITS_MASK;
 
 	kmem_cache_init_late();
 
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -76,6 +76,32 @@ unsigned long totalreserve_pages __read_
 int percpu_pagelist_fraction;
 gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
 
+#ifdef CONFIG_PM_SLEEP
+/*
+ * The following functions are used by the suspend/hibernate code to temporarily
+ * change gfp_allowed_mask in order to avoid using I/O during memory allocations
+ * while devices are suspended.  To avoid races with the suspend/hibernate code,
+ * they should always be called with pm_mutex held (gfp_allowed_mask also should
+ * only be modified with pm_mutex held, unless the suspend/hibernate code is
+ * guaranteed not to run in parallel with that modification).
+ */
+
+void set_gfp_allowed_mask(gfp_t mask)
+{
+	WARN_ON(!mutex_is_locked(&pm_mutex));
+	gfp_allowed_mask = mask;
+}
+
+gfp_t clear_gfp_allowed_mask(gfp_t mask)
+{
+	gfp_t ret = gfp_allowed_mask;
+
+	WARN_ON(!mutex_is_locked(&pm_mutex));
+	gfp_allowed_mask &= ~mask;
+	return ret;
+}
+#endif /* CONFIG_PM_SLEEP */
+
 #ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE
 int pageblock_order __read_mostly;
 #endif
Index: linux-2.6/kernel/power/hibernate.c
===================================================================
--- linux-2.6.orig/kernel/power/hibernate.c
+++ linux-2.6/kernel/power/hibernate.c
@@ -323,6 +323,7 @@ static int create_image(int platform_mod
 int hibernation_snapshot(int platform_mode)
 {
 	int error;
+	gfp_t saved_mask;
 
 	error = platform_begin(platform_mode);
 	if (error)
@@ -334,6 +335,7 @@ int hibernation_snapshot(int platform_mo
 		goto Close;
 
 	suspend_console();
+	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	error = dpm_suspend_start(PMSG_FREEZE);
 	if (error)
 		goto Recover_platform;
@@ -351,6 +353,7 @@ int hibernation_snapshot(int platform_mo
 
 	dpm_resume_end(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
+	set_gfp_allowed_mask(saved_mask);
 	resume_console();
  Close:
 	platform_end(platform_mode);
@@ -445,14 +448,17 @@ static int resume_target_kernel(bool pla
 int hibernation_restore(int platform_mode)
 {
 	int error;
+	gfp_t saved_mask;
 
 	pm_prepare_console();
 	suspend_console();
+	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	error = dpm_suspend_start(PMSG_QUIESCE);
 	if (!error) {
 		error = resume_target_kernel(platform_mode);
 		dpm_resume_end(PMSG_RECOVER);
 	}
+	set_gfp_allowed_mask(saved_mask);
 	resume_console();
 	pm_restore_console();
 	return error;
@@ -466,6 +472,7 @@ int hibernation_restore(int platform_mod
 int hibernation_platform_enter(void)
 {
 	int error;
+	gfp_t saved_mask;
 
 	if (!hibernation_ops)
 		return -ENOSYS;
@@ -481,6 +488,7 @@ int hibernation_platform_enter(void)
 
 	entering_platform_hibernation = true;
 	suspend_console();
+	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	error = dpm_suspend_start(PMSG_HIBERNATE);
 	if (error) {
 		if (hibernation_ops->recover)
@@ -518,6 +526,7 @@ int hibernation_platform_enter(void)
  Resume_devices:
 	entering_platform_hibernation = false;
 	dpm_resume_end(PMSG_RESTORE);
+	set_gfp_allowed_mask(saved_mask);
 	resume_console();
 
  Close:
Index: linux-2.6/kernel/power/suspend.c
===================================================================
--- linux-2.6.orig/kernel/power/suspend.c
+++ linux-2.6/kernel/power/suspend.c
@@ -198,6 +198,7 @@ static int suspend_enter(suspend_state_t
 int suspend_devices_and_enter(suspend_state_t state)
 {
 	int error;
+	gfp_t saved_mask;
 
 	if (!suspend_ops)
 		return -ENOSYS;
@@ -208,6 +209,7 @@ int suspend_devices_and_enter(suspend_st
 			goto Close;
 	}
 	suspend_console();
+	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	suspend_test_start();
 	error = dpm_suspend_start(PMSG_SUSPEND);
 	if (error) {
@@ -224,6 +226,7 @@ int suspend_devices_and_enter(suspend_st
 	suspend_test_start();
 	dpm_resume_end(PMSG_RESUME);
 	suspend_test_finish("resume devices");
+	set_gfp_allowed_mask(saved_mask);
 	resume_console();
  Close:
 	if (suspend_ops->end)

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2010-02-04 19:32 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-22 18:00 [RFC][PATCH] PM: disable nonboot cpus before suspending devices Sebastian Ott
2010-01-22 20:48 ` Rafael J. Wysocki
2010-01-22 21:24   ` Rafael J. Wysocki
2010-01-25 15:08   ` Sebastian Ott
2010-01-25 21:37     ` Rafael J. Wysocki
2010-02-01 14:41       ` Sebastian Ott
2010-02-01 15:30         ` Rafael J. Wysocki
2010-02-01 15:43           ` Maxim Levitsky
2010-02-01 15:57           ` Andrew Morton
2010-02-03  1:44             ` Rafael J. Wysocki
2010-02-03  1:48               ` Andrew Morton
2010-02-03 22:34                 ` Rafael J. Wysocki
2010-02-03 23:08                   ` Andrew Morton
2010-02-04  0:50                     ` Rafael J. Wysocki
2010-02-04  1:21                       ` Andrew Morton
2010-02-04  1:41                         ` Andrew Morton
2010-02-04 19:33                           ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).