* [RFC][PATCH] PM: disable nonboot cpus before suspending devices
@ 2010-01-22 18:00 Sebastian Ott
2010-01-22 20:48 ` Rafael J. Wysocki
0 siblings, 1 reply; 17+ messages in thread
From: Sebastian Ott @ 2010-01-22 18:00 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: linux-pm, linux-kernel
hi Rafael,
on s390 we have a reproduceable testcase where, after all devices were
suspended, a memory allocation results in disk IO. I know that this is
similar to the current discussion about magically changing the gpf mask,
but in our case the related allocation is triggered not by a device driver
but directly by hibernation_snapshot. The call chain looks like this:
STACK:
0 schedule+1796 [0x5a7af0]
1 io_schedule+98 [0x5a82ce]
2 sync_page_killable+4 [0x1ec424]
3 __wait_on_bit+204 [0x5a8bc4]
4 add_to_page_cache_locked+2 [0x1ec766]
5 shrink_page_list+2372 [0x1fc5b0]
6 shrink_list+2496 [0x1fd02c]
7 shrink_zone+932 [0x1fd3e0]
8 try_to_free_pages+668 [0x1fe4bc]
9 __alloc_pages_nodemask+1346 [0x1f5056]
10 __get_free_pages+76 [0x1f52dc]
11 __build_sched_domains+60 [0x144f98]
12 partition_sched_domains+696 [0x145dcc]
13 update_sched_domains+100 [0x146104]
14 notifier_call_chain+166 [0x5ae112]
15 raw_notifier_call_chain+44 [0x1800c4]
16 _cpu_down+586 [0x59f212]
17 disable_nonboot_cpus+354 [0x155ad2]
18 hibernation_snapshot+324 [0x1a7938]
19 hibernate+304 [0x1a7bcc]
20 state_store+130 [0x1a645e]
21 sysfs_write_file+264 [0x2b551c]
22 vfs_write+190 [0x23f98a]
23 sys_write+100 [0x23fb50]
24 sysc_noemu+16 [0x118ff6]
a possible fix would be to call disable_nonboot_cpus before suspending the
devices..
NOTE: this affects pm debug modes
---
Since disable_nonboot_cpus triggers memory allocations we should call
it before suspending the devices in hibernation_snapshot.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
---
kernel/power/hibernate.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -229,13 +229,9 @@ static int create_image(int platform_mod
}
error = platform_pre_snapshot(platform_mode);
- if (error || hibernation_test(TEST_PLATFORM))
- goto Platform_finish;
-
- error = disable_nonboot_cpus();
- if (error || hibernation_test(TEST_CPUS)
+ if (error || hibernation_test(TEST_PLATFORM)
|| hibernation_testmode(HIBERNATION_TEST))
- goto Enable_cpus;
+ goto Platform_finish;
local_irq_disable();
@@ -269,9 +265,6 @@ static int create_image(int platform_mod
Enable_irqs:
local_irq_enable();
- Enable_cpus:
- enable_nonboot_cpus();
-
Platform_finish:
platform_finish(platform_mode);
@@ -303,6 +296,10 @@ int hibernation_snapshot(int platform_mo
if (error)
goto Close;
+ error = disable_nonboot_cpus();
+ if (error || hibernation_test(TEST_CPUS))
+ goto Enable_cpus;
+
suspend_console();
error = dpm_suspend_start(PMSG_FREEZE);
if (error)
@@ -322,6 +319,8 @@ int hibernation_snapshot(int platform_mo
dpm_resume_end(in_suspend ?
(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
resume_console();
+ Enable_cpus:
+ enable_nonboot_cpus();
Close:
platform_end(platform_mode);
return error;
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-01-22 18:00 [RFC][PATCH] PM: disable nonboot cpus before suspending devices Sebastian Ott
@ 2010-01-22 20:48 ` Rafael J. Wysocki
2010-01-22 21:24 ` Rafael J. Wysocki
2010-01-25 15:08 ` Sebastian Ott
0 siblings, 2 replies; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-01-22 20:48 UTC (permalink / raw)
To: Sebastian Ott; +Cc: linux-pm, linux-kernel, Benjamin Herrenschmidt
On Friday 22 January 2010, Sebastian Ott wrote:
> hi Rafael,
>
> on s390 we have a reproduceable testcase where, after all devices were
> suspended, a memory allocation results in disk IO. I know that this is
> similar to the current discussion about magically changing the gpf mask,
> but in our case the related allocation is triggered not by a device driver
> but directly by hibernation_snapshot. The call chain looks like this:
>
> STACK:
> 0 schedule+1796 [0x5a7af0]
> 1 io_schedule+98 [0x5a82ce]
> 2 sync_page_killable+4 [0x1ec424]
> 3 __wait_on_bit+204 [0x5a8bc4]
> 4 add_to_page_cache_locked+2 [0x1ec766]
> 5 shrink_page_list+2372 [0x1fc5b0]
> 6 shrink_list+2496 [0x1fd02c]
> 7 shrink_zone+932 [0x1fd3e0]
> 8 try_to_free_pages+668 [0x1fe4bc]
> 9 __alloc_pages_nodemask+1346 [0x1f5056]
> 10 __get_free_pages+76 [0x1f52dc]
> 11 __build_sched_domains+60 [0x144f98]
> 12 partition_sched_domains+696 [0x145dcc]
> 13 update_sched_domains+100 [0x146104]
> 14 notifier_call_chain+166 [0x5ae112]
> 15 raw_notifier_call_chain+44 [0x1800c4]
> 16 _cpu_down+586 [0x59f212]
> 17 disable_nonboot_cpus+354 [0x155ad2]
> 18 hibernation_snapshot+324 [0x1a7938]
> 19 hibernate+304 [0x1a7bcc]
> 20 state_store+130 [0x1a645e]
> 21 sysfs_write_file+264 [0x2b551c]
> 22 vfs_write+190 [0x23f98a]
> 23 sys_write+100 [0x23fb50]
> 24 sysc_noemu+16 [0x118ff6]
>
> a possible fix would be to call disable_nonboot_cpus before suspending the
> devices..
This is going against the changes attempting to speed-up suspend and resume,
such as the asynchronous suspend/resume patchset, so I don't agree with it.
The real solution would be to remove the memory allocations from the
_cpu_down() call path.
BTW, this is one of the cases I and Ben are talking about where it's not
practical to rework the code just to avoid memory allocation problems during
suspend/resume.
Rafael
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-01-22 20:48 ` Rafael J. Wysocki
@ 2010-01-22 21:24 ` Rafael J. Wysocki
2010-01-25 15:08 ` Sebastian Ott
1 sibling, 0 replies; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-01-22 21:24 UTC (permalink / raw)
To: Sebastian Ott; +Cc: linux-pm, linux-kernel, Benjamin Herrenschmidt
On Friday 22 January 2010, Rafael J. Wysocki wrote:
> On Friday 22 January 2010, Sebastian Ott wrote:
> > hi Rafael,
> >
> > on s390 we have a reproduceable testcase where, after all devices were
> > suspended, a memory allocation results in disk IO. I know that this is
> > similar to the current discussion about magically changing the gpf mask,
> > but in our case the related allocation is triggered not by a device driver
> > but directly by hibernation_snapshot. The call chain looks like this:
> >
> > STACK:
> > 0 schedule+1796 [0x5a7af0]
> > 1 io_schedule+98 [0x5a82ce]
> > 2 sync_page_killable+4 [0x1ec424]
> > 3 __wait_on_bit+204 [0x5a8bc4]
> > 4 add_to_page_cache_locked+2 [0x1ec766]
> > 5 shrink_page_list+2372 [0x1fc5b0]
> > 6 shrink_list+2496 [0x1fd02c]
> > 7 shrink_zone+932 [0x1fd3e0]
> > 8 try_to_free_pages+668 [0x1fe4bc]
> > 9 __alloc_pages_nodemask+1346 [0x1f5056]
> > 10 __get_free_pages+76 [0x1f52dc]
> > 11 __build_sched_domains+60 [0x144f98]
> > 12 partition_sched_domains+696 [0x145dcc]
> > 13 update_sched_domains+100 [0x146104]
> > 14 notifier_call_chain+166 [0x5ae112]
> > 15 raw_notifier_call_chain+44 [0x1800c4]
> > 16 _cpu_down+586 [0x59f212]
> > 17 disable_nonboot_cpus+354 [0x155ad2]
> > 18 hibernation_snapshot+324 [0x1a7938]
> > 19 hibernate+304 [0x1a7bcc]
> > 20 state_store+130 [0x1a645e]
> > 21 sysfs_write_file+264 [0x2b551c]
> > 22 vfs_write+190 [0x23f98a]
> > 23 sys_write+100 [0x23fb50]
> > 24 sysc_noemu+16 [0x118ff6]
> >
> > a possible fix would be to call disable_nonboot_cpus before suspending the
> > devices..
>
> This is going against the changes attempting to speed-up suspend and resume,
> such as the asynchronous suspend/resume patchset, so I don't agree with it.
In fact there's more to it.
enable_nonboot_cpus() has to be called before suspend_ops->wake() (and
analogously for hibernation), because of some ACPI-related ordering constraints
(calling them in the reverse orther leads to _serious_ problems during resume).
In turn, suspend_ops->wake() should be called before we resume devices, for
similar reasons.
So, your patch really would break things.
Rafael
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-01-22 20:48 ` Rafael J. Wysocki
2010-01-22 21:24 ` Rafael J. Wysocki
@ 2010-01-25 15:08 ` Sebastian Ott
2010-01-25 21:37 ` Rafael J. Wysocki
1 sibling, 1 reply; 17+ messages in thread
From: Sebastian Ott @ 2010-01-25 15:08 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: linux-pm, linux-kernel, Benjamin Herrenschmidt
Hi.
On Fri, 22 Jan 2010, Rafael J. Wysocki wrote:
> On Friday 22 January 2010, Sebastian Ott wrote:
> >
> > a possible fix would be to call disable_nonboot_cpus before suspending the
> > devices..
>
> This is going against the changes attempting to speed-up suspend and resume,
> such as the asynchronous suspend/resume patchset, so I don't agree with it.
Isn't the main benefit for this scenario that while a driver starts io and
waits for interrupts, the callback for the next device can be called? And
this can be done with one cpu as well.
>
> The real solution would be to remove the memory allocations from the
> _cpu_down() call path.
So you have to also ban allocations from all registered notifiers at the
cpu_chain. And since enable_nonboot_cpus is called before the devices are
woken up, the same would be true for _cpu_up() which may not be done
easily.
>
> BTW, this is one of the cases I and Ben are talking about where it's not
> practical to rework the code just to avoid memory allocation problems during
> suspend/resume.
Ok. All i'm saying is that in hibernation_snapshot/create_image memory
allocations are directely triggered after all devices were put to sleep /
before woken up - and this looks like a bug.
For the driver case - what about using your patch to not modify the gfp
mask but print a warning instead so that these drivers can be identified
and fixed.
Regards,
Sebastian
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-01-25 15:08 ` Sebastian Ott
@ 2010-01-25 21:37 ` Rafael J. Wysocki
2010-02-01 14:41 ` Sebastian Ott
0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-01-25 21:37 UTC (permalink / raw)
To: Sebastian Ott; +Cc: linux-pm, linux-kernel, Benjamin Herrenschmidt
On Monday 25 January 2010, Sebastian Ott wrote:
> Hi.
>
> On Fri, 22 Jan 2010, Rafael J. Wysocki wrote:
>
> > On Friday 22 January 2010, Sebastian Ott wrote:
> > >
> > > a possible fix would be to call disable_nonboot_cpus before suspending the
> > > devices..
> >
> > This is going against the changes attempting to speed-up suspend and resume,
> > such as the asynchronous suspend/resume patchset, so I don't agree with it.
>
> Isn't the main benefit for this scenario that while a driver starts io and
> waits for interrupts, the callback for the next device can be called? And
> this can be done with one cpu as well.
That's the basic idea, but the additional CPUs help quite a bit.
> > The real solution would be to remove the memory allocations from the
> > _cpu_down() call path.
>
> So you have to also ban allocations from all registered notifiers at the
> cpu_chain. And since enable_nonboot_cpus is called before the devices are
> woken up, the same would be true for _cpu_up() which may not be done
> easily.
That's correct. BTW, that's what the CPU_TASKS_FROZEN bit is for among other
things (perhaps it may be used to fix this particular issue).
> > BTW, this is one of the cases I and Ben are talking about where it's not
> > practical to rework the code just to avoid memory allocation problems during
> > suspend/resume.
>
> Ok. All i'm saying is that in hibernation_snapshot/create_image memory
> allocations are directely triggered after all devices were put to sleep /
> before woken up - and this looks like a bug.
I agree, but that's because people don't remember that CPU hotplug is also
used for suspend/hibernation. I don't know at the moment how much effort
it would take to fix all of these problems appropriately, but I _guess_ that
would be quite some work. That, among other things, is why I sent the patch
to modify gfp_allowed_mask before suspending devices.
> For the driver case - what about using your patch to not modify the gfp
> mask but print a warning instead so that these drivers can be identified
> and fixed.
We'd get a lot of warnings and there are cases where we know they would
trigger (eg. ACPI internals). So, I'd rather like to reduce the users' pain
(by changing gfp_allowed_mask) than add to it (by adding a warning that's
guaranteed to show up).
Rafael
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-01-25 21:37 ` Rafael J. Wysocki
@ 2010-02-01 14:41 ` Sebastian Ott
2010-02-01 15:30 ` Rafael J. Wysocki
0 siblings, 1 reply; 17+ messages in thread
From: Sebastian Ott @ 2010-02-01 14:41 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: linux-pm, linux-kernel, Benjamin Herrenschmidt
Hi Rafael,
since you didn't like the idea of calling the driver callbacks with just
one cpu enabled, we gave your patch: "MM / PM: Force GFP_NOIO during
suspend/hibernation and resume" a try and i can confirm that this
fixes the issue on s390. Will this go in 2.6.33/stable?
greetings,
Sebastian
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-02-01 14:41 ` Sebastian Ott
@ 2010-02-01 15:30 ` Rafael J. Wysocki
2010-02-01 15:43 ` Maxim Levitsky
2010-02-01 15:57 ` Andrew Morton
0 siblings, 2 replies; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-02-01 15:30 UTC (permalink / raw)
To: Sebastian Ott, Andrew Morton
Cc: linux-pm, linux-kernel, Benjamin Herrenschmidt, KOSAKI Motohiro
On Monday 01 February 2010, Sebastian Ott wrote:
> Hi Rafael,
>
> since you didn't like the idea of calling the driver callbacks with just
> one cpu enabled, we gave your patch: "MM / PM: Force GFP_NOIO during
> suspend/hibernation and resume" a try and i can confirm that this
> fixes the issue on s390.
Great, thanks for testing!
> Will this go in 2.6.33/stable?
That depends on Andrew, actually.
Andrew, what do you think of the patch at:
http://patchwork.kernel.org/patch/74740/mbox/ ?
It helps people and I don't see any major drawbacks of it.
Rafael
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-02-01 15:30 ` Rafael J. Wysocki
@ 2010-02-01 15:43 ` Maxim Levitsky
2010-02-01 15:57 ` Andrew Morton
1 sibling, 0 replies; 17+ messages in thread
From: Maxim Levitsky @ 2010-02-01 15:43 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Sebastian Ott, Andrew Morton, linux-pm, linux-kernel,
Benjamin Herrenschmidt, KOSAKI Motohiro
On Mon, 2010-02-01 at 16:30 +0100, Rafael J. Wysocki wrote:
> On Monday 01 February 2010, Sebastian Ott wrote:
> > Hi Rafael,
> >
> > since you didn't like the idea of calling the driver callbacks with just
> > one cpu enabled, we gave your patch: "MM / PM: Force GFP_NOIO during
> > suspend/hibernation and resume" a try and i can confirm that this
> > fixes the issue on s390.
>
> Great, thanks for testing!
>
> > Will this go in 2.6.33/stable?
>
> That depends on Andrew, actually.
>
> Andrew, what do you think of the patch at:
> http://patchwork.kernel.org/patch/74740/mbox/ ?
>
> It helps people and I don't see any major drawbacks of it.
And it works too.
I will today continue the torture testing of hibernate mode, but so far
no hands.
Best regards,
Maxim Levitsky
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-02-01 15:30 ` Rafael J. Wysocki
2010-02-01 15:43 ` Maxim Levitsky
@ 2010-02-01 15:57 ` Andrew Morton
2010-02-03 1:44 ` Rafael J. Wysocki
1 sibling, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-02-01 15:57 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
KOSAKI Motohiro
On Mon, 1 Feb 2010 16:30:04 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> On Monday 01 February 2010, Sebastian Ott wrote:
> > Hi Rafael,
> >
> > since you didn't like the idea of calling the driver callbacks with just
> > one cpu enabled, we gave your patch: "MM / PM: Force GFP_NOIO during
> > suspend/hibernation and resume" a try and i can confirm that this
> > fixes the issue on s390.
>
> Great, thanks for testing!
>
> > Will this go in 2.6.33/stable?
>
> That depends on Andrew, actually.
>
> Andrew, what do you think of the patch at:
> http://patchwork.kernel.org/patch/74740/mbox/ ?
>
> It helps people and I don't see any major drawbacks of it.
>
Seems sane. A couple of minor things:
- the names mm_force_noio_allocations() and mm_allow_io_allocations()
are a bit sucky. Asymmetrical.
- the functions don't nest: if someone calls
mm_force_noio_allocations() twice in succession then the kernel is
all mucked up. Why not:
gfp_t mm_set_gfp_mask(gfp_t mask)
{
gfp_t ret = gfp_allowed_mask;
gfp_allowed_mask = mask;
return ret;
}
which is of course racy :) Could add a local spinlock if really worried.
All your current callers can easily save the old value in a local.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-02-01 15:57 ` Andrew Morton
@ 2010-02-03 1:44 ` Rafael J. Wysocki
2010-02-03 1:48 ` Andrew Morton
0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-02-03 1:44 UTC (permalink / raw)
To: Andrew Morton
Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
KOSAKI Motohiro
On Monday 01 February 2010, Andrew Morton wrote:
> On Mon, 1 Feb 2010 16:30:04 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>
> > On Monday 01 February 2010, Sebastian Ott wrote:
> > > Hi Rafael,
> > >
> > > since you didn't like the idea of calling the driver callbacks with just
> > > one cpu enabled, we gave your patch: "MM / PM: Force GFP_NOIO during
> > > suspend/hibernation and resume" a try and i can confirm that this
> > > fixes the issue on s390.
> >
> > Great, thanks for testing!
> >
> > > Will this go in 2.6.33/stable?
> >
> > That depends on Andrew, actually.
> >
> > Andrew, what do you think of the patch at:
> > http://patchwork.kernel.org/patch/74740/mbox/ ?
> >
> > It helps people and I don't see any major drawbacks of it.
> >
>
> Seems sane. A couple of minor things:
>
> - the names mm_force_noio_allocations() and mm_allow_io_allocations()
> are a bit sucky. Asymmetrical.
Yeah. The lack of imagination. Sigh.
> - the functions don't nest: if someone calls
> mm_force_noio_allocations() twice in succession then the kernel is
> all mucked up. Why not:
>
> gfp_t mm_set_gfp_mask(gfp_t mask)
> {
> gfp_t ret = gfp_allowed_mask;
>
> gfp_allowed_mask = mask;
> return ret;
> }
>
> which is of course racy :) Could add a local spinlock if really worried.
I'm not sure how that helps. I'd need to read gfp_allowed_mask to obtain the
new value anyway.
> All your current callers can easily save the old value in a local.
Indeed.
Well, does the appended one look better?
Rafael
---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: MM / PM: Force GFP_NOIO during suspend/hibernation and resume (rev. 2)
There are quite a few GFP_KERNEL memory allocations made during
suspend/hibernation and resume that may cause the system to hang,
because the I/O operations they depend on cannot be completed due to
the underlying devices being suspended.
Avoid this problem by clearing the __GFP_IO and __GFP_FS bits in
gfp_allowed_mask before suspend/hibernation and restoring the
original values of these bits in gfp_allowed_mask durig the
subsequent resume.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Maxim Levitsky <maximlevitsky@gmail.com>
---
include/linux/gfp.h | 8 ++++++++
kernel/power/hibernate.c | 9 +++++++++
kernel/power/suspend.c | 3 +++
3 files changed, 20 insertions(+)
Index: linux-2.6/kernel/power/hibernate.c
===================================================================
--- linux-2.6.orig/kernel/power/hibernate.c
+++ linux-2.6/kernel/power/hibernate.c
@@ -323,6 +323,7 @@ static int create_image(int platform_mod
int hibernation_snapshot(int platform_mode)
{
int error;
+ gfp_t saved_mask;
error = platform_begin(platform_mode);
if (error)
@@ -334,6 +335,7 @@ int hibernation_snapshot(int platform_mo
goto Close;
suspend_console();
+ saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
error = dpm_suspend_start(PMSG_FREEZE);
if (error)
goto Recover_platform;
@@ -351,6 +353,7 @@ int hibernation_snapshot(int platform_mo
dpm_resume_end(in_suspend ?
(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
+ set_gfp_allowed_mask(saved_mask);
resume_console();
Close:
platform_end(platform_mode);
@@ -445,14 +448,17 @@ static int resume_target_kernel(bool pla
int hibernation_restore(int platform_mode)
{
int error;
+ gfp_t saved_mask;
pm_prepare_console();
suspend_console();
+ saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
error = dpm_suspend_start(PMSG_QUIESCE);
if (!error) {
error = resume_target_kernel(platform_mode);
dpm_resume_end(PMSG_RECOVER);
}
+ set_gfp_allowed_mask(saved_mask);
resume_console();
pm_restore_console();
return error;
@@ -466,6 +472,7 @@ int hibernation_restore(int platform_mod
int hibernation_platform_enter(void)
{
int error;
+ gfp_t saved_mask;
if (!hibernation_ops)
return -ENOSYS;
@@ -481,6 +488,7 @@ int hibernation_platform_enter(void)
entering_platform_hibernation = true;
suspend_console();
+ saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
error = dpm_suspend_start(PMSG_HIBERNATE);
if (error) {
if (hibernation_ops->recover)
@@ -518,6 +526,7 @@ int hibernation_platform_enter(void)
Resume_devices:
entering_platform_hibernation = false;
dpm_resume_end(PMSG_RESTORE);
+ set_gfp_allowed_mask(saved_mask);
resume_console();
Close:
Index: linux-2.6/kernel/power/suspend.c
===================================================================
--- linux-2.6.orig/kernel/power/suspend.c
+++ linux-2.6/kernel/power/suspend.c
@@ -198,6 +198,7 @@ static int suspend_enter(suspend_state_t
int suspend_devices_and_enter(suspend_state_t state)
{
int error;
+ gfp_t saved_mask;
if (!suspend_ops)
return -ENOSYS;
@@ -208,6 +209,7 @@ int suspend_devices_and_enter(suspend_st
goto Close;
}
suspend_console();
+ saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
suspend_test_start();
error = dpm_suspend_start(PMSG_SUSPEND);
if (error) {
@@ -224,6 +226,7 @@ int suspend_devices_and_enter(suspend_st
suspend_test_start();
dpm_resume_end(PMSG_RESUME);
suspend_test_finish("resume devices");
+ set_gfp_allowed_mask(saved_mask);
resume_console();
Close:
if (suspend_ops->end)
Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -83,6 +83,7 @@ struct vm_area_struct;
#define GFP_HIGHUSER_MOVABLE (__GFP_WAIT | __GFP_IO | __GFP_FS | \
__GFP_HARDWALL | __GFP_HIGHMEM | \
__GFP_MOVABLE)
+#define GFP_IOFS (__GFP_IO | __GFP_FS)
#ifdef CONFIG_NUMA
#define GFP_THISNODE (__GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY)
@@ -342,4 +343,11 @@ static inline void set_gfp_allowed_mask(
gfp_allowed_mask = mask;
}
+static inline gfp_t clear_gfp_allowed_mask(gfp_t mask)
+{
+ gfp_t ret = gfp_allowed_mask;
+ gfp_allowed_mask &= ~mask;
+ return ret;
+}
+
#endif /* __LINUX_GFP_H */
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-02-03 1:44 ` Rafael J. Wysocki
@ 2010-02-03 1:48 ` Andrew Morton
2010-02-03 22:34 ` Rafael J. Wysocki
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-02-03 1:48 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
KOSAKI Motohiro
On Wed, 3 Feb 2010 02:44:23 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> +static inline gfp_t clear_gfp_allowed_mask(gfp_t mask)
> +{
> + gfp_t ret = gfp_allowed_mask;
> + gfp_allowed_mask &= ~mask;
> + return ret;
> +}
Fair enuf.
Of course, this is all horridly racy/buggy without locking. Would I be
correct in hoping that all the callers happen when the system is in
everyone-is-frozen mode?
Perhaps we should add some documentation (or even an assertion) to
prevent someone from using these interfaces from within normal code.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-02-03 1:48 ` Andrew Morton
@ 2010-02-03 22:34 ` Rafael J. Wysocki
2010-02-03 23:08 ` Andrew Morton
0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-02-03 22:34 UTC (permalink / raw)
To: Andrew Morton
Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
KOSAKI Motohiro
On Wednesday 03 February 2010, Andrew Morton wrote:
> On Wed, 3 Feb 2010 02:44:23 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>
> > +static inline gfp_t clear_gfp_allowed_mask(gfp_t mask)
> > +{
> > + gfp_t ret = gfp_allowed_mask;
> > + gfp_allowed_mask &= ~mask;
> > + return ret;
> > +}
>
> Fair enuf.
>
> Of course, this is all horridly racy/buggy without locking. Would I be
> correct in hoping that all the callers happen when the system is in
> everyone-is-frozen mode?
As far as I can tell, gfp_allowed_mask is only touched during init apart from
this.
> Perhaps we should add some documentation (or even an assertion) to
> prevent someone from using these interfaces from within normal code.
I thought about that, but didn't invent anything smart enough.
Well, maybe except for a comment like "this must be called with pm_mutex held",
because that's the only case when it would be really safe.
Rafael
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-02-03 22:34 ` Rafael J. Wysocki
@ 2010-02-03 23:08 ` Andrew Morton
2010-02-04 0:50 ` Rafael J. Wysocki
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-02-03 23:08 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
KOSAKI Motohiro
On Wed, 3 Feb 2010 23:34:37 +0100
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> On Wednesday 03 February 2010, Andrew Morton wrote:
> > On Wed, 3 Feb 2010 02:44:23 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> >
> > > +static inline gfp_t clear_gfp_allowed_mask(gfp_t mask)
> > > +{
> > > + gfp_t ret = gfp_allowed_mask;
> > > + gfp_allowed_mask &= ~mask;
> > > + return ret;
> > > +}
> >
> > Fair enuf.
> >
> > Of course, this is all horridly racy/buggy without locking. Would I be
> > correct in hoping that all the callers happen when the system is in
> > everyone-is-frozen mode?
>
> As far as I can tell, gfp_allowed_mask is only touched during init apart from
> this.
Well yes - the new interfaces are the problem - they're racy!
> > Perhaps we should add some documentation (or even an assertion) to
> > prevent someone from using these interfaces from within normal code.
>
> I thought about that, but didn't invent anything smart enough.
>
> Well, maybe except for a comment like "this must be called with pm_mutex held",
> because that's the only case when it would be really safe.
Is that the locking rule? My above guess was incorrect?
Maybe slip a
BUG_ON(!mutex_is_locked(&pm_mutex));
in there?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-02-03 23:08 ` Andrew Morton
@ 2010-02-04 0:50 ` Rafael J. Wysocki
2010-02-04 1:21 ` Andrew Morton
0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-02-04 0:50 UTC (permalink / raw)
To: Andrew Morton
Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
KOSAKI Motohiro
On Thursday 04 February 2010, Andrew Morton wrote:
> On Wed, 3 Feb 2010 23:34:37 +0100
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>
> > On Wednesday 03 February 2010, Andrew Morton wrote:
> > > On Wed, 3 Feb 2010 02:44:23 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > >
> > > > +static inline gfp_t clear_gfp_allowed_mask(gfp_t mask)
> > > > +{
> > > > + gfp_t ret = gfp_allowed_mask;
> > > > + gfp_allowed_mask &= ~mask;
> > > > + return ret;
> > > > +}
> > >
> > > Fair enuf.
> > >
> > > Of course, this is all horridly racy/buggy without locking. Would I be
> > > correct in hoping that all the callers happen when the system is in
> > > everyone-is-frozen mode?
> >
> > As far as I can tell, gfp_allowed_mask is only touched during init apart from
> > this.
>
> Well yes - the new interfaces are the problem - they're racy!
>
> > > Perhaps we should add some documentation (or even an assertion) to
> > > prevent someone from using these interfaces from within normal code.
> >
> > I thought about that, but didn't invent anything smart enough.
> >
> > Well, maybe except for a comment like "this must be called with pm_mutex held",
> > because that's the only case when it would be really safe.
>
> Is that the locking rule? My above guess was incorrect?
>
> Maybe slip a
>
> BUG_ON(!mutex_is_locked(&pm_mutex));
>
> in there?
That actually is a good idea, although I prefer WARN_ON (BUG_ON would be
rather blunt, so to speak, as it could kill setups not using suspend/hibernate
at all).
Updated patch follows.
Rafael
---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: MM / PM: Force GFP_NOIO during suspend/hibernation and resume (rev. 3)
There are quite a few GFP_KERNEL memory allocations made during
suspend/hibernation and resume that may cause the system to hang,
because the I/O operations they depend on cannot be completed due to
the underlying devices being suspended.
Mitigate this problem by clearing the __GFP_IO and __GFP_FS bits in
gfp_allowed_mask before suspend/hibernation and restoring the
original values of these bits in gfp_allowed_mask durig the
subsequent resume.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Maxim Levitsky <maximlevitsky@gmail.com>
---
include/linux/gfp.h | 7 +++----
init/main.c | 2 +-
kernel/power/hibernate.c | 9 +++++++++
kernel/power/suspend.c | 3 +++
mm/page_alloc.c | 24 ++++++++++++++++++++++++
5 files changed, 40 insertions(+), 5 deletions(-)
Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -83,6 +83,7 @@ struct vm_area_struct;
#define GFP_HIGHUSER_MOVABLE (__GFP_WAIT | __GFP_IO | __GFP_FS | \
__GFP_HARDWALL | __GFP_HIGHMEM | \
__GFP_MOVABLE)
+#define GFP_IOFS (__GFP_IO | __GFP_FS)
#ifdef CONFIG_NUMA
#define GFP_THISNODE (__GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY)
@@ -337,9 +338,7 @@ void drain_local_pages(void *dummy);
extern gfp_t gfp_allowed_mask;
-static inline void set_gfp_allowed_mask(gfp_t mask)
-{
- gfp_allowed_mask = mask;
-}
+extern void set_gfp_allowed_mask(gfp_t mask);
+extern gfp_t clear_gfp_allowed_mask(gfp_t mask);
#endif /* __LINUX_GFP_H */
Index: linux-2.6/init/main.c
===================================================================
--- linux-2.6.orig/init/main.c
+++ linux-2.6/init/main.c
@@ -601,7 +601,7 @@ asmlinkage void __init start_kernel(void
local_irq_enable();
/* Interrupts are enabled now so all GFP allocations are safe. */
- set_gfp_allowed_mask(__GFP_BITS_MASK);
+ gfp_allowed_mask = __GFP_BITS_MASK;
kmem_cache_init_late();
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -76,6 +76,30 @@ unsigned long totalreserve_pages __read_
int percpu_pagelist_fraction;
gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
+/*
+ * The following functions are used by the suspend/hibernate code to temporarily
+ * change gfp_allowed_mask in order to avoid using I/O during memory allocations
+ * while devices are suspended. To avoid races with the suspend/hibernate code,
+ * they should always be called with pm_mutex held (gfp_allowed_mask also should
+ * only be modified with pm_mutex held, unless the suspend/hibernate code is
+ * guaranteed not to run in parallel with that modification).
+ */
+
+void set_gfp_allowed_mask(gfp_t mask)
+{
+ WARN_ON(!mutex_is_locked(&pm_mutex));
+ gfp_allowed_mask = mask;
+}
+
+gfp_t clear_gfp_allowed_mask(gfp_t mask)
+{
+ gfp_t ret = gfp_allowed_mask;
+
+ WARN_ON(!mutex_is_locked(&pm_mutex));
+ gfp_allowed_mask &= ~mask;
+ return ret;
+}
+
#ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE
int pageblock_order __read_mostly;
#endif
Index: linux-2.6/kernel/power/hibernate.c
===================================================================
--- linux-2.6.orig/kernel/power/hibernate.c
+++ linux-2.6/kernel/power/hibernate.c
@@ -323,6 +323,7 @@ static int create_image(int platform_mod
int hibernation_snapshot(int platform_mode)
{
int error;
+ gfp_t saved_mask;
error = platform_begin(platform_mode);
if (error)
@@ -334,6 +335,7 @@ int hibernation_snapshot(int platform_mo
goto Close;
suspend_console();
+ saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
error = dpm_suspend_start(PMSG_FREEZE);
if (error)
goto Recover_platform;
@@ -351,6 +353,7 @@ int hibernation_snapshot(int platform_mo
dpm_resume_end(in_suspend ?
(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
+ set_gfp_allowed_mask(saved_mask);
resume_console();
Close:
platform_end(platform_mode);
@@ -445,14 +448,17 @@ static int resume_target_kernel(bool pla
int hibernation_restore(int platform_mode)
{
int error;
+ gfp_t saved_mask;
pm_prepare_console();
suspend_console();
+ saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
error = dpm_suspend_start(PMSG_QUIESCE);
if (!error) {
error = resume_target_kernel(platform_mode);
dpm_resume_end(PMSG_RECOVER);
}
+ set_gfp_allowed_mask(saved_mask);
resume_console();
pm_restore_console();
return error;
@@ -466,6 +472,7 @@ int hibernation_restore(int platform_mod
int hibernation_platform_enter(void)
{
int error;
+ gfp_t saved_mask;
if (!hibernation_ops)
return -ENOSYS;
@@ -481,6 +488,7 @@ int hibernation_platform_enter(void)
entering_platform_hibernation = true;
suspend_console();
+ saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
error = dpm_suspend_start(PMSG_HIBERNATE);
if (error) {
if (hibernation_ops->recover)
@@ -518,6 +526,7 @@ int hibernation_platform_enter(void)
Resume_devices:
entering_platform_hibernation = false;
dpm_resume_end(PMSG_RESTORE);
+ set_gfp_allowed_mask(saved_mask);
resume_console();
Close:
Index: linux-2.6/kernel/power/suspend.c
===================================================================
--- linux-2.6.orig/kernel/power/suspend.c
+++ linux-2.6/kernel/power/suspend.c
@@ -198,6 +198,7 @@ static int suspend_enter(suspend_state_t
int suspend_devices_and_enter(suspend_state_t state)
{
int error;
+ gfp_t saved_mask;
if (!suspend_ops)
return -ENOSYS;
@@ -208,6 +209,7 @@ int suspend_devices_and_enter(suspend_st
goto Close;
}
suspend_console();
+ saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
suspend_test_start();
error = dpm_suspend_start(PMSG_SUSPEND);
if (error) {
@@ -224,6 +226,7 @@ int suspend_devices_and_enter(suspend_st
suspend_test_start();
dpm_resume_end(PMSG_RESUME);
suspend_test_finish("resume devices");
+ set_gfp_allowed_mask(saved_mask);
resume_console();
Close:
if (suspend_ops->end)
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-02-04 0:50 ` Rafael J. Wysocki
@ 2010-02-04 1:21 ` Andrew Morton
2010-02-04 1:41 ` Andrew Morton
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-02-04 1:21 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
KOSAKI Motohiro
On Thu, 4 Feb 2010 01:50:26 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> --- linux-2.6.orig/mm/page_alloc.c
> +++ linux-2.6/mm/page_alloc.c
> @@ -76,6 +76,30 @@ unsigned long totalreserve_pages __read_
> int percpu_pagelist_fraction;
> gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
>
> +/*
> + * The following functions are used by the suspend/hibernate code to temporarily
> + * change gfp_allowed_mask in order to avoid using I/O during memory allocations
> + * while devices are suspended. To avoid races with the suspend/hibernate code,
> + * they should always be called with pm_mutex held (gfp_allowed_mask also should
> + * only be modified with pm_mutex held, unless the suspend/hibernate code is
> + * guaranteed not to run in parallel with that modification).
> + */
> +
> +void set_gfp_allowed_mask(gfp_t mask)
> +{
> + WARN_ON(!mutex_is_locked(&pm_mutex));
> + gfp_allowed_mask = mask;
> +}
> +
> +gfp_t clear_gfp_allowed_mask(gfp_t mask)
> +{
> + gfp_t ret = gfp_allowed_mask;
> +
> + WARN_ON(!mutex_is_locked(&pm_mutex));
> + gfp_allowed_mask &= ~mask;
> + return ret;
> +}
Maybe put an ifdef CONFIG_foo around these so they don't get included
when they're unneeded?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-02-04 1:21 ` Andrew Morton
@ 2010-02-04 1:41 ` Andrew Morton
2010-02-04 19:33 ` Rafael J. Wysocki
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-02-04 1:41 UTC (permalink / raw)
To: Rafael J. Wysocki, Sebastian Ott, linux-pm, linux-kernel,
Benjamin Herrenschmidt, KOSAKI Motohiro
On Wed, 3 Feb 2010 17:21:47 -0800 Andrew Morton <akpm@linux-foundation.org> wrote:
> On Thu, 4 Feb 2010 01:50:26 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>
> > --- linux-2.6.orig/mm/page_alloc.c
> > +++ linux-2.6/mm/page_alloc.c
> > @@ -76,6 +76,30 @@ unsigned long totalreserve_pages __read_
> > int percpu_pagelist_fraction;
> > gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
> >
> > +/*
> > + * The following functions are used by the suspend/hibernate code to temporarily
> > + * change gfp_allowed_mask in order to avoid using I/O during memory allocations
> > + * while devices are suspended. To avoid races with the suspend/hibernate code,
> > + * they should always be called with pm_mutex held (gfp_allowed_mask also should
> > + * only be modified with pm_mutex held, unless the suspend/hibernate code is
> > + * guaranteed not to run in parallel with that modification).
> > + */
> > +
> > +void set_gfp_allowed_mask(gfp_t mask)
> > +{
> > + WARN_ON(!mutex_is_locked(&pm_mutex));
> > + gfp_allowed_mask = mask;
> > +}
> > +
> > +gfp_t clear_gfp_allowed_mask(gfp_t mask)
> > +{
> > + gfp_t ret = gfp_allowed_mask;
> > +
> > + WARN_ON(!mutex_is_locked(&pm_mutex));
> > + gfp_allowed_mask &= ~mask;
> > + return ret;
> > +}
>
> Maybe put an ifdef CONFIG_foo around these so they don't get included
> when they're unneeded?
>
I guess this:
mm/built-in.o: In function `clear_gfp_allowed_mask':
: undefined reference to `pm_mutex'
mm/built-in.o: In function `set_gfp_allowed_mask':
: undefined reference to `pm_mutex'
kinda answers my question.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC][PATCH] PM: disable nonboot cpus before suspending devices
2010-02-04 1:41 ` Andrew Morton
@ 2010-02-04 19:33 ` Rafael J. Wysocki
0 siblings, 0 replies; 17+ messages in thread
From: Rafael J. Wysocki @ 2010-02-04 19:33 UTC (permalink / raw)
To: Andrew Morton
Cc: Sebastian Ott, linux-pm, linux-kernel, Benjamin Herrenschmidt,
KOSAKI Motohiro
On Thursday 04 February 2010, Andrew Morton wrote:
> On Wed, 3 Feb 2010 17:21:47 -0800 Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > On Thu, 4 Feb 2010 01:50:26 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> >
> > > --- linux-2.6.orig/mm/page_alloc.c
> > > +++ linux-2.6/mm/page_alloc.c
> > > @@ -76,6 +76,30 @@ unsigned long totalreserve_pages __read_
> > > int percpu_pagelist_fraction;
> > > gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
> > >
> > > +/*
> > > + * The following functions are used by the suspend/hibernate code to temporarily
> > > + * change gfp_allowed_mask in order to avoid using I/O during memory allocations
> > > + * while devices are suspended. To avoid races with the suspend/hibernate code,
> > > + * they should always be called with pm_mutex held (gfp_allowed_mask also should
> > > + * only be modified with pm_mutex held, unless the suspend/hibernate code is
> > > + * guaranteed not to run in parallel with that modification).
> > > + */
> > > +
> > > +void set_gfp_allowed_mask(gfp_t mask)
> > > +{
> > > + WARN_ON(!mutex_is_locked(&pm_mutex));
> > > + gfp_allowed_mask = mask;
> > > +}
> > > +
> > > +gfp_t clear_gfp_allowed_mask(gfp_t mask)
> > > +{
> > > + gfp_t ret = gfp_allowed_mask;
> > > +
> > > + WARN_ON(!mutex_is_locked(&pm_mutex));
> > > + gfp_allowed_mask &= ~mask;
> > > + return ret;
> > > +}
> >
> > Maybe put an ifdef CONFIG_foo around these so they don't get included
> > when they're unneeded?
> >
>
> I guess this:
>
> mm/built-in.o: In function `clear_gfp_allowed_mask':
> : undefined reference to `pm_mutex'
> mm/built-in.o: In function `set_gfp_allowed_mask':
> : undefined reference to `pm_mutex'
>
> kinda answers my question.
Sorry about that. I tend to forget about test-compiling with CONFIG_PM unset.
The one below shouldn't have this issue.
Rafael
---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: MM / PM: Force GFP_NOIO during suspend/hibernation and resume (rev. 3)
There are quite a few GFP_KERNEL memory allocations made during
suspend/hibernation and resume that may cause the system to hang,
because the I/O operations they depend on cannot be completed due to
the underlying devices being suspended.
Mitigate this problem by clearing the __GFP_IO and __GFP_FS bits in
gfp_allowed_mask before suspend/hibernation and restoring the
original values of these bits in gfp_allowed_mask durig the
subsequent resume.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Maxim Levitsky <maximlevitsky@gmail.com>
---
include/linux/gfp.h | 7 +++----
init/main.c | 2 +-
kernel/power/hibernate.c | 9 +++++++++
kernel/power/suspend.c | 3 +++
mm/page_alloc.c | 26 ++++++++++++++++++++++++++
5 files changed, 42 insertions(+), 5 deletions(-)
Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -83,6 +83,7 @@ struct vm_area_struct;
#define GFP_HIGHUSER_MOVABLE (__GFP_WAIT | __GFP_IO | __GFP_FS | \
__GFP_HARDWALL | __GFP_HIGHMEM | \
__GFP_MOVABLE)
+#define GFP_IOFS (__GFP_IO | __GFP_FS)
#ifdef CONFIG_NUMA
#define GFP_THISNODE (__GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY)
@@ -337,9 +338,7 @@ void drain_local_pages(void *dummy);
extern gfp_t gfp_allowed_mask;
-static inline void set_gfp_allowed_mask(gfp_t mask)
-{
- gfp_allowed_mask = mask;
-}
+extern void set_gfp_allowed_mask(gfp_t mask);
+extern gfp_t clear_gfp_allowed_mask(gfp_t mask);
#endif /* __LINUX_GFP_H */
Index: linux-2.6/init/main.c
===================================================================
--- linux-2.6.orig/init/main.c
+++ linux-2.6/init/main.c
@@ -601,7 +601,7 @@ asmlinkage void __init start_kernel(void
local_irq_enable();
/* Interrupts are enabled now so all GFP allocations are safe. */
- set_gfp_allowed_mask(__GFP_BITS_MASK);
+ gfp_allowed_mask = __GFP_BITS_MASK;
kmem_cache_init_late();
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -76,6 +76,32 @@ unsigned long totalreserve_pages __read_
int percpu_pagelist_fraction;
gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
+#ifdef CONFIG_PM_SLEEP
+/*
+ * The following functions are used by the suspend/hibernate code to temporarily
+ * change gfp_allowed_mask in order to avoid using I/O during memory allocations
+ * while devices are suspended. To avoid races with the suspend/hibernate code,
+ * they should always be called with pm_mutex held (gfp_allowed_mask also should
+ * only be modified with pm_mutex held, unless the suspend/hibernate code is
+ * guaranteed not to run in parallel with that modification).
+ */
+
+void set_gfp_allowed_mask(gfp_t mask)
+{
+ WARN_ON(!mutex_is_locked(&pm_mutex));
+ gfp_allowed_mask = mask;
+}
+
+gfp_t clear_gfp_allowed_mask(gfp_t mask)
+{
+ gfp_t ret = gfp_allowed_mask;
+
+ WARN_ON(!mutex_is_locked(&pm_mutex));
+ gfp_allowed_mask &= ~mask;
+ return ret;
+}
+#endif /* CONFIG_PM_SLEEP */
+
#ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE
int pageblock_order __read_mostly;
#endif
Index: linux-2.6/kernel/power/hibernate.c
===================================================================
--- linux-2.6.orig/kernel/power/hibernate.c
+++ linux-2.6/kernel/power/hibernate.c
@@ -323,6 +323,7 @@ static int create_image(int platform_mod
int hibernation_snapshot(int platform_mode)
{
int error;
+ gfp_t saved_mask;
error = platform_begin(platform_mode);
if (error)
@@ -334,6 +335,7 @@ int hibernation_snapshot(int platform_mo
goto Close;
suspend_console();
+ saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
error = dpm_suspend_start(PMSG_FREEZE);
if (error)
goto Recover_platform;
@@ -351,6 +353,7 @@ int hibernation_snapshot(int platform_mo
dpm_resume_end(in_suspend ?
(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
+ set_gfp_allowed_mask(saved_mask);
resume_console();
Close:
platform_end(platform_mode);
@@ -445,14 +448,17 @@ static int resume_target_kernel(bool pla
int hibernation_restore(int platform_mode)
{
int error;
+ gfp_t saved_mask;
pm_prepare_console();
suspend_console();
+ saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
error = dpm_suspend_start(PMSG_QUIESCE);
if (!error) {
error = resume_target_kernel(platform_mode);
dpm_resume_end(PMSG_RECOVER);
}
+ set_gfp_allowed_mask(saved_mask);
resume_console();
pm_restore_console();
return error;
@@ -466,6 +472,7 @@ int hibernation_restore(int platform_mod
int hibernation_platform_enter(void)
{
int error;
+ gfp_t saved_mask;
if (!hibernation_ops)
return -ENOSYS;
@@ -481,6 +488,7 @@ int hibernation_platform_enter(void)
entering_platform_hibernation = true;
suspend_console();
+ saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
error = dpm_suspend_start(PMSG_HIBERNATE);
if (error) {
if (hibernation_ops->recover)
@@ -518,6 +526,7 @@ int hibernation_platform_enter(void)
Resume_devices:
entering_platform_hibernation = false;
dpm_resume_end(PMSG_RESTORE);
+ set_gfp_allowed_mask(saved_mask);
resume_console();
Close:
Index: linux-2.6/kernel/power/suspend.c
===================================================================
--- linux-2.6.orig/kernel/power/suspend.c
+++ linux-2.6/kernel/power/suspend.c
@@ -198,6 +198,7 @@ static int suspend_enter(suspend_state_t
int suspend_devices_and_enter(suspend_state_t state)
{
int error;
+ gfp_t saved_mask;
if (!suspend_ops)
return -ENOSYS;
@@ -208,6 +209,7 @@ int suspend_devices_and_enter(suspend_st
goto Close;
}
suspend_console();
+ saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
suspend_test_start();
error = dpm_suspend_start(PMSG_SUSPEND);
if (error) {
@@ -224,6 +226,7 @@ int suspend_devices_and_enter(suspend_st
suspend_test_start();
dpm_resume_end(PMSG_RESUME);
suspend_test_finish("resume devices");
+ set_gfp_allowed_mask(saved_mask);
resume_console();
Close:
if (suspend_ops->end)
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2010-02-04 19:32 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-22 18:00 [RFC][PATCH] PM: disable nonboot cpus before suspending devices Sebastian Ott
2010-01-22 20:48 ` Rafael J. Wysocki
2010-01-22 21:24 ` Rafael J. Wysocki
2010-01-25 15:08 ` Sebastian Ott
2010-01-25 21:37 ` Rafael J. Wysocki
2010-02-01 14:41 ` Sebastian Ott
2010-02-01 15:30 ` Rafael J. Wysocki
2010-02-01 15:43 ` Maxim Levitsky
2010-02-01 15:57 ` Andrew Morton
2010-02-03 1:44 ` Rafael J. Wysocki
2010-02-03 1:48 ` Andrew Morton
2010-02-03 22:34 ` Rafael J. Wysocki
2010-02-03 23:08 ` Andrew Morton
2010-02-04 0:50 ` Rafael J. Wysocki
2010-02-04 1:21 ` Andrew Morton
2010-02-04 1:41 ` Andrew Morton
2010-02-04 19:33 ` Rafael J. Wysocki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).