public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed
* Re: [Bisected Regression in 2.6.35] A full tmpfs filesystem causes hibernation to hang
       [not found] ` <4C7BE25B.1020504@superonline.com>
@ 2010-09-01  0:48   ` KOSAKI Motohiro
       [not found]   ` <20100901093219.9744.A69D9226@jp.fujitsu.com>
  1 sibling, 0 replies; 24+ messages in thread
From: KOSAKI Motohiro @ 2010-09-01  0:48 UTC (permalink / raw)
  To: M. Vefa Bicakci, Rafael J. Wysocki
  Cc: linux-pm, Linux Kernel Mailing List, kosaki.motohiro

> === 8< ===
> PM: Marking nosave pages: ...0009f000 - ...000100000
> PM: basic memory bitmaps created
> PM: Syncing filesystems ... done
> Freezing user space processes ... (elapsed 0.01 seconds) done.
> Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
> PM: Preallocating image memory...
> shrink_all_memory start
> PM: shrink memory: pass=1, req:310171 reclaimed:15492 free:360936
> PM: shrink memory: pass=2, req:294679 reclaimed:28864 free:373981
> PM: shrink memory: pass=3, req:265815 reclaimed:60311 free:405374
> PM: shrink memory: pass=4, req:205504 reclaimed:97870 free:443024
> PM: shrink memory: pass=5, req:107634 reclaimed:146948 free:492141
> shrink_all_memory: req:107634 reclaimed:146948 free:492141
> PM: preallocate_image_highmem 556658 278329
> PM: preallocate_image_memory 103139 103139
> PM: preallocate_highmem_fraction 183908 556658 760831 -> 183908
> === >8 ===

Rafael, this log means hibernate_preallocate_memory() has a bug.
It allocates memory in the following order:
 1. preallocate_image_highmem()  (i.e. __GFP_HIGHMEM)
 2. preallocate_image_memory()   (i.e. GFP_KERNEL)
 3. preallocate_highmem_fraction (i.e. __GFP_HIGHMEM)
 4. preallocate_image_memory()   (i.e. GFP_KERNEL)

But please imagine the following scenario (Vefa's scenario):
 - the system has 3GB of memory: 1GB normal, 2GB highmem.
 - all normal memory is free.
 - 1.5GB of highmem is used by tmpfs; the remaining 500MB is free.

At that point, hibernate_preallocate_memory() works as follows:

1. call preallocate_image_highmem(1GB)
2. call preallocate_image_memory(500M)		total 1.5GB allocated
3. call preallocate_highmem_fraction(660M)	total 2.2GB allocated

Then all of the normal zone's memory is exhausted. The next
preallocate_image_memory() triggers an OOM, and oom_killer_disabled turns
that into an infinite loop. (The careless handling of oom_killer_disabled
is a vmscan bug; I'll fix it soon.)

The problem is that alloc_pages(__GFP_HIGHMEM) -> alloc_pages(GFP_KERNEL) is
the wrong order. alloc_pages(__GFP_HIGHMEM) may allocate pages from a lower
zone, so the subsequent alloc_pages(GFP_KERNEL) leads to OOM.

Please consider the alloc_pages(GFP_KERNEL) -> alloc_pages(__GFP_HIGHMEM)
order instead. Even though the vmscan fix can avoid the infinite loop, the
OOM situation might cause a big slowdown on highmem machines. That seems
no good.


Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Bisected Regression in 2.6.35] A full tmpfs filesystem causes hibernation to hang
       [not found]   ` <20100901093219.9744.A69D9226@jp.fujitsu.com>
@ 2010-09-01 22:02     ` Rafael J. Wysocki
       [not found]     ` <201009020002.25113.rjw@sisk.pl>
  1 sibling, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-09-01 22:02 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-pm, M. Vefa Bicakci, Linux Kernel Mailing List

On Wednesday, September 01, 2010, KOSAKI Motohiro wrote:
> > === 8< ===
> > PM: Marking nosave pages: ...0009f000 - ...000100000
> > PM: basic memory bitmaps created
> > PM: Syncing filesystems ... done
> > Freezing user space processes ... (elapsed 0.01 seconds) done.
> > Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
> > PM: Preallocating image memory...
> > shrink_all_memory start
> > PM: shrink memory: pass=1, req:310171 reclaimed:15492 free:360936
> > PM: shrink memory: pass=2, req:294679 reclaimed:28864 free:373981
> > PM: shrink memory: pass=3, req:265815 reclaimed:60311 free:405374
> > PM: shrink memory: pass=4, req:205504 reclaimed:97870 free:443024
> > PM: shrink memory: pass=5, req:107634 reclaimed:146948 free:492141
> > shrink_all_memory: req:107634 reclaimed:146948 free:492141
> > PM: preallocate_image_highmem 556658 278329
> > PM: preallocate_image_memory 103139 103139
> > PM: preallocate_highmem_fraction 183908 556658 760831 -> 183908
> > === >8 ===
> 
> Rafael, this log means hibernate_preallocate_memory() has a bug.

Well, it works as designed ...

> It allocates memory in the following order:
>  1. preallocate_image_highmem()  (i.e. __GFP_HIGHMEM)
>  2. preallocate_image_memory()   (i.e. GFP_KERNEL)
>  3. preallocate_highmem_fraction (i.e. __GFP_HIGHMEM)
>  4. preallocate_image_memory()   (i.e. GFP_KERNEL)
> 
> But please imagine the following scenario (Vefa's scenario):
>  - the system has 3GB of memory: 1GB normal, 2GB highmem.
>  - all normal memory is free.
>  - 1.5GB of highmem is used by tmpfs; the remaining 500MB is free.

Indeed, that's a memory allocation pattern I didn't anticipate.

> At that point, hibernate_preallocate_memory() works as follows:
> 
> 1. call preallocate_image_highmem(1GB)
> 2. call preallocate_image_memory(500M)		total 1.5GB allocated
> 3. call preallocate_highmem_fraction(660M)	total 2.2GB allocated
> 
> Then all of the normal zone's memory is exhausted. The next
> preallocate_image_memory() triggers an OOM, and oom_killer_disabled turns
> that into an infinite loop. (The careless handling of oom_killer_disabled
> is a vmscan bug; I'll fix it soon.)

So, it looks like the problem will go away if we check if there are any normal
pages to allocate from before calling the last preallocate_image_memory()?

Like in the patch below, perhaps?

> The problem is that alloc_pages(__GFP_HIGHMEM) -> alloc_pages(GFP_KERNEL) is
> the wrong order. alloc_pages(__GFP_HIGHMEM) may allocate pages from a lower
> zone, so the subsequent alloc_pages(GFP_KERNEL) leads to OOM.
> 
> Please consider the alloc_pages(GFP_KERNEL) -> alloc_pages(__GFP_HIGHMEM)
> order instead. Even though the vmscan fix can avoid the infinite loop, the
> OOM situation might cause a big slowdown on highmem machines. That seems
> no good.

There's a problem with the ordering change: it wouldn't be clear how many
pages to request from the normal zone in steps 1 and 3.

Thanks,
Rafael 

---
 kernel/power/snapshot.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1259,7 +1259,7 @@ int hibernate_preallocate_memory(void)
 {
 	struct zone *zone;
 	unsigned long saveable, size, max_size, count, highmem, pages = 0;
-	unsigned long alloc, save_highmem, pages_highmem;
+	unsigned long alloc, save_highmem, pages_highmem, size_normal;
 	struct timeval start, stop;
 	int error;
 
@@ -1296,6 +1296,7 @@ int hibernate_preallocate_memory(void)
 		else
 			count += zone_page_state(zone, NR_FREE_PAGES);
 	}
+	size_normal = count;
 	count += highmem;
 	count -= totalreserve_pages;
 
@@ -1344,7 +1345,13 @@ int hibernate_preallocate_memory(void)
 	size = preallocate_highmem_fraction(size, highmem, count);
 	pages_highmem += size;
 	alloc -= size;
-	pages += preallocate_image_memory(alloc);
+	/* Check if there are any non-highmem pages to allocate from. */
+	if (alloc_normal < size_normal) {
+		size_normal -= alloc_normal;
+		if (alloc > size_normal)
+			alloc = size_normal;
+		pages += preallocate_image_memory(alloc);
+	}
 	pages += pages_highmem;
 
 	/*

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Bisected Regression in 2.6.35] A full tmpfs filesystem causes hibernation to hang
       [not found]     ` <201009020002.25113.rjw@sisk.pl>
@ 2010-09-02  0:31       ` KOSAKI Motohiro
       [not found]       ` <20100902091010.D050.A69D9226@jp.fujitsu.com>
  1 sibling, 0 replies; 24+ messages in thread
From: KOSAKI Motohiro @ 2010-09-02  0:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, M. Vefa Bicakci, Linux Kernel Mailing List,
	kosaki.motohiro

> On Wednesday, September 01, 2010, KOSAKI Motohiro wrote:
> > > === 8< ===
> > > PM: Marking nosave pages: ...0009f000 - ...000100000
> > > PM: basic memory bitmaps created
> > > PM: Syncing filesystems ... done
> > > Freezing user space processes ... (elapsed 0.01 seconds) done.
> > > Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
> > > PM: Preallocating image memory...
> > > shrink_all_memory start
> > > PM: shrink memory: pass=1, req:310171 reclaimed:15492 free:360936
> > > PM: shrink memory: pass=2, req:294679 reclaimed:28864 free:373981
> > > PM: shrink memory: pass=3, req:265815 reclaimed:60311 free:405374
> > > PM: shrink memory: pass=4, req:205504 reclaimed:97870 free:443024
> > > PM: shrink memory: pass=5, req:107634 reclaimed:146948 free:492141
> > > shrink_all_memory: req:107634 reclaimed:146948 free:492141
> > > PM: preallocate_image_highmem 556658 278329
> > > PM: preallocate_image_memory 103139 103139
> > > PM: preallocate_highmem_fraction 183908 556658 760831 -> 183908
> > > === >8 ===
> > 
> > Rafael, this log means hibernate_preallocate_memory() has a bug.
> 
> Well, it works as designed ...
> 
> > It allocates memory in the following order:
> >  1. preallocate_image_highmem()  (i.e. __GFP_HIGHMEM)
> >  2. preallocate_image_memory()   (i.e. GFP_KERNEL)
> >  3. preallocate_highmem_fraction (i.e. __GFP_HIGHMEM)
> >  4. preallocate_image_memory()   (i.e. GFP_KERNEL)
> > 
> > But please imagine the following scenario (Vefa's scenario):
> >  - the system has 3GB of memory: 1GB normal, 2GB highmem.
> >  - all normal memory is free.
> >  - 1.5GB of highmem is used by tmpfs; the remaining 500MB is free.
> 
> Indeed, that's a memory allocation pattern I didn't anticipate.
> 
> > At that point, hibernate_preallocate_memory() works as follows:
> > 
> > 1. call preallocate_image_highmem(1GB)
> > 2. call preallocate_image_memory(500M)		total 1.5GB allocated
> > 3. call preallocate_highmem_fraction(660M)	total 2.2GB allocated
> > 
> > Then all of the normal zone's memory is exhausted. The next
> > preallocate_image_memory() triggers an OOM, and oom_killer_disabled turns
> > that into an infinite loop. (The careless handling of oom_killer_disabled
> > is a vmscan bug; I'll fix it soon.)
> 
> So, it looks like the problem will go away if we check if there are any normal
> pages to allocate from before calling the last preallocate_image_memory()?
> 
> Like in the patch below, perhaps?

Looks fine to me, but I have one question: hibernate_preallocate_memory()
calls preallocate_image_memory() twice. Why do you only take care of the
latter call? The former one seems to carry a similar risk.


> > The problem is that alloc_pages(__GFP_HIGHMEM) -> alloc_pages(GFP_KERNEL) is
> > the wrong order. alloc_pages(__GFP_HIGHMEM) may allocate pages from a lower
> > zone, so the subsequent alloc_pages(GFP_KERNEL) leads to OOM.
> > 
> > Please consider the alloc_pages(GFP_KERNEL) -> alloc_pages(__GFP_HIGHMEM)
> > order instead. Even though the vmscan fix can avoid the infinite loop, the
> > OOM situation might cause a big slowdown on highmem machines. That seems
> > no good.
> 
> There's a problem with the ordering change: it wouldn't be clear how many
> pages to request from the normal zone in steps 1 and 3.

OK, I see. Thanks for correcting my mistake.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Bisected Regression in 2.6.35] A full tmpfs filesystem causes hibernation to hang
       [not found]       ` <20100902091010.D050.A69D9226@jp.fujitsu.com>
@ 2010-09-02 19:57         ` Rafael J. Wysocki
       [not found]         ` <201009022157.18561.rjw@sisk.pl>
  1 sibling, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-09-02 19:57 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-pm, M. Vefa Bicakci, Linux Kernel Mailing List

On Thursday, September 02, 2010, KOSAKI Motohiro wrote:
> > On Wednesday, September 01, 2010, KOSAKI Motohiro wrote:
> > > > === 8< ===
> > > > PM: Marking nosave pages: ...0009f000 - ...000100000
> > > > PM: basic memory bitmaps created
> > > > PM: Syncing filesystems ... done
> > > > Freezing user space processes ... (elapsed 0.01 seconds) done.
> > > > Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
> > > > PM: Preallocating image memory...
> > > > shrink_all_memory start
> > > > PM: shrink memory: pass=1, req:310171 reclaimed:15492 free:360936
> > > > PM: shrink memory: pass=2, req:294679 reclaimed:28864 free:373981
> > > > PM: shrink memory: pass=3, req:265815 reclaimed:60311 free:405374
> > > > PM: shrink memory: pass=4, req:205504 reclaimed:97870 free:443024
> > > > PM: shrink memory: pass=5, req:107634 reclaimed:146948 free:492141
> > > > shrink_all_memory: req:107634 reclaimed:146948 free:492141
> > > > PM: preallocate_image_highmem 556658 278329
> > > > PM: preallocate_image_memory 103139 103139
> > > > PM: preallocate_highmem_fraction 183908 556658 760831 -> 183908
> > > > === >8 ===
> > > 
> > > Rafael, this log means hibernate_preallocate_memory() has a bug.
> > 
> > Well, it works as designed ...
> > 
> > > It allocates memory in the following order:
> > >  1. preallocate_image_highmem()  (i.e. __GFP_HIGHMEM)
> > >  2. preallocate_image_memory()   (i.e. GFP_KERNEL)
> > >  3. preallocate_highmem_fraction (i.e. __GFP_HIGHMEM)
> > >  4. preallocate_image_memory()   (i.e. GFP_KERNEL)
> > > 
> > > But please imagine the following scenario (Vefa's scenario):
> > >  - the system has 3GB of memory: 1GB normal, 2GB highmem.
> > >  - all normal memory is free.
> > >  - 1.5GB of highmem is used by tmpfs; the remaining 500MB is free.
> > 
> > Indeed, that's a memory allocation pattern I didn't anticipate.
> > 
> > > At that point, hibernate_preallocate_memory() works as follows:
> > > 
> > > 1. call preallocate_image_highmem(1GB)
> > > 2. call preallocate_image_memory(500M)		total 1.5GB allocated
> > > 3. call preallocate_highmem_fraction(660M)	total 2.2GB allocated
> > > 
> > > Then all of the normal zone's memory is exhausted. The next
> > > preallocate_image_memory() triggers an OOM, and oom_killer_disabled turns
> > > that into an infinite loop. (The careless handling of oom_killer_disabled
> > > is a vmscan bug; I'll fix it soon.)
> > 
> > So, it looks like the problem will go away if we check if there are any normal
> > pages to allocate from before calling the last preallocate_image_memory()?
> > 
> > Like in the patch below, perhaps?
> 
> Looks fine to me, but I have one question: hibernate_preallocate_memory()
> calls preallocate_image_memory() twice. Why do you only take care of the
> latter call? The former one seems to carry a similar risk.

The first one is mandatory, ie. if we can't allocate the requested number of
pages at this point, we fail the entire hibernation.  In that case the
performance hit doesn't matter.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Bisected Regression in 2.6.35] A full tmpfs filesystem causes hibernation to hang
       [not found]         ` <201009022157.18561.rjw@sisk.pl>
@ 2010-09-02 20:24           ` Rafael J. Wysocki
  2010-09-03  0:13             ` KOSAKI Motohiro
       [not found]             ` <20100903085756.B657.A69D9226@jp.fujitsu.com>
  0 siblings, 2 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-09-02 20:24 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-pm, M. Vefa Bicakci, Linux Kernel Mailing List

On Thursday, September 02, 2010, Rafael J. Wysocki wrote:
> On Thursday, September 02, 2010, KOSAKI Motohiro wrote:
> > > On Wednesday, September 01, 2010, KOSAKI Motohiro wrote:
> > > > > === 8< ===
> > > > > PM: Marking nosave pages: ...0009f000 - ...000100000
> > > > > PM: basic memory bitmaps created
> > > > > PM: Syncing filesystems ... done
> > > > > Freezing user space processes ... (elapsed 0.01 seconds) done.
> > > > > Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
> > > > > PM: Preallocating image memory...
> > > > > shrink_all_memory start
> > > > > PM: shrink memory: pass=1, req:310171 reclaimed:15492 free:360936
> > > > > PM: shrink memory: pass=2, req:294679 reclaimed:28864 free:373981
> > > > > PM: shrink memory: pass=3, req:265815 reclaimed:60311 free:405374
> > > > > PM: shrink memory: pass=4, req:205504 reclaimed:97870 free:443024
> > > > > PM: shrink memory: pass=5, req:107634 reclaimed:146948 free:492141
> > > > > shrink_all_memory: req:107634 reclaimed:146948 free:492141
> > > > > PM: preallocate_image_highmem 556658 278329
> > > > > PM: preallocate_image_memory 103139 103139
> > > > > PM: preallocate_highmem_fraction 183908 556658 760831 -> 183908
> > > > > === >8 ===
> > > > 
> > > > Rafael, this log means hibernate_preallocate_memory() has a bug.
> > > 
> > > Well, it works as designed ...
> > > 
> > > > It allocates memory in the following order:
> > > >  1. preallocate_image_highmem()  (i.e. __GFP_HIGHMEM)
> > > >  2. preallocate_image_memory()   (i.e. GFP_KERNEL)
> > > >  3. preallocate_highmem_fraction (i.e. __GFP_HIGHMEM)
> > > >  4. preallocate_image_memory()   (i.e. GFP_KERNEL)
> > > > 
> > > > But please imagine the following scenario (Vefa's scenario):
> > > >  - the system has 3GB of memory: 1GB normal, 2GB highmem.
> > > >  - all normal memory is free.
> > > >  - 1.5GB of highmem is used by tmpfs; the remaining 500MB is free.
> > > 
> > > Indeed, that's a memory allocation pattern I didn't anticipate.
> > > 
> > > > At that point, hibernate_preallocate_memory() works as follows:
> > > > 
> > > > 1. call preallocate_image_highmem(1GB)
> > > > 2. call preallocate_image_memory(500M)		total 1.5GB allocated
> > > > 3. call preallocate_highmem_fraction(660M)	total 2.2GB allocated
> > > > 
> > > > Then all of the normal zone's memory is exhausted. The next
> > > > preallocate_image_memory() triggers an OOM, and oom_killer_disabled turns
> > > > that into an infinite loop. (The careless handling of oom_killer_disabled
> > > > is a vmscan bug; I'll fix it soon.)
> > > 
> > > So, it looks like the problem will go away if we check if there are any normal
> > > pages to allocate from before calling the last preallocate_image_memory()?
> > > 
> > > Like in the patch below, perhaps?
> > 
> > Looks fine to me, but I have one question: hibernate_preallocate_memory()
> > calls preallocate_image_memory() twice. Why do you only take care of the
> > latter call? The former one seems to carry a similar risk.
> 
> The first one is mandatory, ie. if we can't allocate the requested number of
> pages at this point, we fail the entire hibernation.  In that case the
> performance hit doesn't matter.

IOW, your patch at http://lkml.org/lkml/2010/9/2/262 is still necessary to
protect against the infinite loop in that case.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Bisected Regression in 2.6.35] A full tmpfs filesystem causes hibernation to hang
  2010-09-02 20:24           ` Rafael J. Wysocki
@ 2010-09-03  0:13             ` KOSAKI Motohiro
       [not found]             ` <20100903085756.B657.A69D9226@jp.fujitsu.com>
  1 sibling, 0 replies; 24+ messages in thread
From: KOSAKI Motohiro @ 2010-09-03  0:13 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, M. Vefa Bicakci, Linux Kernel Mailing List,
	kosaki.motohiro

Hello,

> > > > Like in the patch below, perhaps?
> > > 
> > > Looks fine to me, but I have one question: hibernate_preallocate_memory()
> > > calls preallocate_image_memory() twice. Why do you only take care of the
> > > latter call? The former one seems to carry a similar risk.
> > 
> > The first one is mandatory, ie. if we can't allocate the requested number of
> > pages at this point, we fail the entire hibernation.  In that case the
> > performance hit doesn't matter.
> 
> IOW, your patch at http://lkml.org/lkml/2010/9/2/262 is still necessary to
> protect against the infinite loop in that case.

As far as I understand, we need to distinguish two kinds of allocation failure:
  1) failure because there is not enough memory
	-> yes, hibernation should fail
  2) failure because we have already allocated enough lower-zone memory
	-> why should we fail?

If the system has a lot of memory, scenario (2) happens more frequently than
(1). I think we need to check the alloc_highmem and alloc_normal variables
and call preallocate_image_highmem() again instead of preallocate_image_memory()
if we've already allocated enough normal memory.

nit?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Bisected Regression in 2.6.35] A full tmpfs filesystem causes hibernation to hang
       [not found]             ` <20100903085756.B657.A69D9226@jp.fujitsu.com>
@ 2010-09-03  1:07               ` Rafael J. Wysocki
  2010-09-03  1:53                 ` KOSAKI Motohiro
       [not found]                 ` <20100903105216.B65C.A69D9226@jp.fujitsu.com>
  0 siblings, 2 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-09-03  1:07 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-pm, M. Vefa Bicakci, Linux Kernel Mailing List

On Friday, September 03, 2010, KOSAKI Motohiro wrote:
> Hello,
> 
> > > > > Like in the patch below, perhaps?
> > > > 
> > > > Looks fine to me, but I have one question: hibernate_preallocate_memory()
> > > > calls preallocate_image_memory() twice. Why do you only take care of the
> > > > latter call? The former one seems to carry a similar risk.
> > > 
> > > The first one is mandatory, ie. if we can't allocate the requested number of
> > > pages at this point, we fail the entire hibernation.  In that case the
> > > performance hit doesn't matter.
> > 
> > IOW, your patch at http://lkml.org/lkml/2010/9/2/262 is still necessary to
> > protect against the infinite loop in that case.
> 
> As far as I understand, we need to distinguish two kinds of allocation failure:
>   1) failure because there is not enough memory
> 	-> yes, hibernation should fail
>   2) failure because we have already allocated enough lower-zone memory
> 	-> why should we fail?
> 
> If the system has a lot of memory, scenario (2) happens more frequently than
> (1). I think we need to check the alloc_highmem and alloc_normal variables
> and call preallocate_image_highmem() again instead of preallocate_image_memory()
> if we've already allocated enough normal memory.
> 
> nit?

Actually I thought about that, but we don't really see hibernation fail for
this reason.  In all of the tests I carried out the requested 50% of highmem
had been allocated before allocations from the normal zone started to be
made, even if highmem was 100% full at that point.  So this appears to be
a theoretical issue and covering it would require us to change the algorithm
entirely (eg. it doesn't make sense to call preallocate_highmem_fraction() down
the road if that happens).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Bisected Regression in 2.6.35] A full tmpfs filesystem causes hibernation to hang
  2010-09-03  1:07               ` Rafael J. Wysocki
@ 2010-09-03  1:53                 ` KOSAKI Motohiro
       [not found]                 ` <20100903105216.B65C.A69D9226@jp.fujitsu.com>
  1 sibling, 0 replies; 24+ messages in thread
From: KOSAKI Motohiro @ 2010-09-03  1:53 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, M. Vefa Bicakci, Linux Kernel Mailing List,
	kosaki.motohiro

> On Friday, September 03, 2010, KOSAKI Motohiro wrote:
> > Hello,
> > 
> > > > > > Like in the patch below, perhaps?
> > > > > 
> > > > > Looks fine to me, but I have one question: hibernate_preallocate_memory()
> > > > > calls preallocate_image_memory() twice. Why do you only take care of the
> > > > > latter call? The former one seems to carry a similar risk.
> > > > 
> > > > The first one is mandatory, ie. if we can't allocate the requested number of
> > > > pages at this point, we fail the entire hibernation.  In that case the
> > > > performance hit doesn't matter.
> > > 
> > > IOW, your patch at http://lkml.org/lkml/2010/9/2/262 is still necessary to
> > > protect against the infinite loop in that case.
> > 
> > As far as I understand, we need to distinguish two kinds of allocation failure:
> >   1) failure because there is not enough memory
> > 	-> yes, hibernation should fail
> >   2) failure because we have already allocated enough lower-zone memory
> > 	-> why should we fail?
> > 
> > If the system has a lot of memory, scenario (2) happens more frequently than
> > (1). I think we need to check the alloc_highmem and alloc_normal variables
> > and call preallocate_image_highmem() again instead of preallocate_image_memory()
> > if we've already allocated enough normal memory.
> > 
> > nit?
> 
> Actually I thought about that, but we don't really see hibernation fail for
> this reason.  In all of the tests I carried out the requested 50% of highmem
> had been allocated before allocations from the normal zone started to be
> made, even if highmem was 100% full at that point.  So this appears to be
> a theoretical issue and covering it would require us to change the algorithm
> entirely (eg. it doesn't make sense to call preallocate_highmem_fraction() down
> the road if that happens).

OK, thanks. I think I've caught your point now. Please feel free to add my
Reviewed-by to your fix.

thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Bisected Regression in 2.6.35] A full tmpfs filesystem causes hibernation to hang
       [not found]                 ` <20100903105216.B65C.A69D9226@jp.fujitsu.com>
@ 2010-09-04  1:44                   ` Rafael J. Wysocki
       [not found]                   ` <201009040344.42342.rjw@sisk.pl>
  1 sibling, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-09-04  1:44 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-pm, M. Vefa Bicakci, Linux Kernel Mailing List

On Friday, September 03, 2010, KOSAKI Motohiro wrote:
> > On Friday, September 03, 2010, KOSAKI Motohiro wrote:
> > > Hello,
> > > 
> > > > > > > Like in the patch below, perhaps?
> > > > > > 
> > > > > > Looks fine to me, but I have one question: hibernate_preallocate_memory()
> > > > > > calls preallocate_image_memory() twice. Why do you only take care of the
> > > > > > latter call? The former one seems to carry a similar risk.
> > > > > 
> > > > > The first one is mandatory, ie. if we can't allocate the requested number of
> > > > > pages at this point, we fail the entire hibernation.  In that case the
> > > > > performance hit doesn't matter.
> > > > 
> > > > IOW, your patch at http://lkml.org/lkml/2010/9/2/262 is still necessary to
> > > > protect against the infinite loop in that case.
> > > 
> > > As far as I understand, we need to distinguish two kinds of allocation failure:
> > >   1) failure because there is not enough memory
> > > 	-> yes, hibernation should fail
> > >   2) failure because we have already allocated enough lower-zone memory
> > > 	-> why should we fail?
> > > 
> > > If the system has a lot of memory, scenario (2) happens more frequently than
> > > (1). I think we need to check the alloc_highmem and alloc_normal variables
> > > and call preallocate_image_highmem() again instead of preallocate_image_memory()
> > > if we've already allocated enough normal memory.
> > > 
> > > nit?
> > 
> > Actually I thought about that, but we don't really see hibernation fail for
> > this reason.  In all of the tests I carried out the requested 50% of highmem
> > had been allocated before allocations from the normal zone started to be
> > made, even if highmem was 100% full at that point.  So this appears to be
> > a theoretical issue and covering it would require us to change the algorithm
> > entirely (eg. it doesn't make sense to call preallocate_highmem_fraction() down
> > the road if that happens).
> 
> OK, thanks. I think I've caught your point now. Please feel free to add my
> Reviewed-by to your fix.

Thanks.

In the meantime, though, I prepared a patch that should address the issue
entirely.  The patch is appended and if it looks good to you, I'd rather use it
instead of the previous one (it is still untested).

Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM / Hibernate: Avoid hitting OOM during preallocation of memory

There is a problem in hibernate_preallocate_memory() that it calls
preallocate_image_memory() with an argument that may be greater than
the number of available non-highmem memory pages.  This may trigger
the OOM condition which in turn can cause significant slowdown to
occur.

To avoid that, modify preallocate_image_memory() so that it checks
if there is a sufficient number of non-highmem pages to allocate from
before calling preallocate_image_pages() and change
hibernate_preallocate_memory() to try to allocate from highmem if
the number of pages allocated by preallocate_image_memory() is too
low.

Adjust free_unnecessary_pages() to take all possible memory
allocation patterns into account.

Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/snapshot.c |   66 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 46 insertions(+), 20 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1122,9 +1122,19 @@ static unsigned long preallocate_image_p
 	return nr_alloc;
 }
 
-static unsigned long preallocate_image_memory(unsigned long nr_pages)
+static unsigned long preallocate_image_memory(unsigned long nr_pages,
+					      unsigned long size_normal)
 {
-	return preallocate_image_pages(nr_pages, GFP_IMAGE);
+	unsigned long alloc;
+
+	if (size_normal <= alloc_normal)
+		return 0;
+
+	alloc = size_normal - alloc_normal;
+	if (nr_pages < alloc)
+		alloc = nr_pages;
+
+	return preallocate_image_pages(alloc, GFP_IMAGE);
 }
 
 #ifdef CONFIG_HIGHMEM
@@ -1170,15 +1180,22 @@ static inline unsigned long preallocate_
  */
 static void free_unnecessary_pages(void)
 {
-	unsigned long save_highmem, to_free_normal, to_free_highmem;
+	unsigned long save, to_free_normal, to_free_highmem;
 
-	to_free_normal = alloc_normal - count_data_pages();
-	save_highmem = count_highmem_pages();
-	if (alloc_highmem > save_highmem) {
-		to_free_highmem = alloc_highmem - save_highmem;
+	save = count_data_pages();
+	if (alloc_normal >= save) {
+		to_free_normal = alloc_normal - save;
+		save = 0;
+	} else {
+		to_free_normal = 0;
+		save -= alloc_normal;
+	}
+	save += count_highmem_pages();
+	if (alloc_highmem >= save) {
+		to_free_highmem = alloc_highmem - save;
 	} else {
 		to_free_highmem = 0;
-		to_free_normal -= save_highmem - alloc_highmem;
+		to_free_normal -= save - alloc_highmem;
 	}
 
 	memory_bm_position_reset(&copy_bm);
@@ -1259,7 +1276,7 @@ int hibernate_preallocate_memory(void)
 {
 	struct zone *zone;
 	unsigned long saveable, size, max_size, count, highmem, pages = 0;
-	unsigned long alloc, save_highmem, pages_highmem;
+	unsigned long alloc, save_highmem, pages_highmem, size_normal;
 	struct timeval start, stop;
 	int error;
 
@@ -1296,6 +1313,7 @@ int hibernate_preallocate_memory(void)
 		else
 			count += zone_page_state(zone, NR_FREE_PAGES);
 	}
+	size_normal = count;
 	count += highmem;
 	count -= totalreserve_pages;
 
@@ -1310,7 +1328,7 @@ int hibernate_preallocate_memory(void)
 	 */
 	if (size >= saveable) {
 		pages = preallocate_image_highmem(save_highmem);
-		pages += preallocate_image_memory(saveable - pages);
+		pages += preallocate_image_memory(saveable - pages, size_normal);
 		goto out;
 	}
 
@@ -1336,16 +1354,24 @@ int hibernate_preallocate_memory(void)
 	 */
 	pages_highmem = preallocate_image_highmem(highmem / 2);
 	alloc = (count - max_size) - pages_highmem;
-	pages = preallocate_image_memory(alloc);
-	if (pages < alloc)
-		goto err_out;
-	size = max_size - size;
-	alloc = size;
-	size = preallocate_highmem_fraction(size, highmem, count);
-	pages_highmem += size;
-	alloc -= size;
-	pages += preallocate_image_memory(alloc);
-	pages += pages_highmem;
+	pages = preallocate_image_memory(alloc, size_normal);
+	if (pages < alloc) {
+		/* We have exhausted non-highmem pages, try highmem. */
+		alloc -= pages;
+		pages = preallocate_image_highmem(alloc);
+		if (pages < alloc)
+			goto err_out;
+		pages += preallocate_image_highmem(max_size - size);
+	} else {
+		size = max_size - size;
+		alloc = size;
+		size = preallocate_highmem_fraction(size, highmem, count);
+		pages_highmem += size;
+		alloc -= size;
+		size = preallocate_image_memory(alloc, size_normal);
+		pages_highmem += preallocate_image_highmem(alloc - size);
+		pages += pages_highmem + size;
+	}
 
 	/*
 	 * We only need as many page frames for the image as there are saveable

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Bisected Regression in 2.6.35] A full tmpfs filesystem causes hibernation to hang
       [not found]                   ` <201009040344.42342.rjw@sisk.pl>
@ 2010-09-06  2:08                     ` KOSAKI Motohiro
       [not found]                     ` <20100906090528.C8A9.A69D9226@jp.fujitsu.com>
  1 sibling, 0 replies; 24+ messages in thread
From: KOSAKI Motohiro @ 2010-09-06  2:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, M. Vefa Bicakci, Linux Kernel Mailing List,
	kosaki.motohiro

> > ok, thanks. probably I've caught your point. please feel free to use my reviewed-by
> > for your fix.
> 
> Thanks.
> 
> In the meantime, though, I prepared a patch that should address the issue
> entirely.  The patch is appended and if it looks good to you, I'd rather use it
> instead of the previous one (it is still untested).

Yeah, this one looks nicer to me :)

Thanks, rafael!


* Important news regarding the two different patches
       [not found]                     ` <20100906090528.C8A9.A69D9226@jp.fujitsu.com>
@ 2010-09-06 11:27                       ` M. Vefa Bicakci
       [not found]                       ` <4C84D02A.7070508@superonline.com>
                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: M. Vefa Bicakci @ 2010-09-06 11:27 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-pm, Linux Kernel Mailing List

On 05/09/10 10:08 PM, KOSAKI Motohiro wrote:
>>> ok, thanks. probably I've caught your point. please feel free to use my reviewed-by
>>> for your fix.
>>
>> Thanks.
>>
>> In the meantime, though, I prepared a patch that should address the issue
>> entirely.  The patch is appended and if it looks good to you, I'd rather use it
>> instead of the previous one (it is still untested).
> 
> Yeah, this one looks nicer to me :)
> 
> Thanks, rafael!

Dear Rafael Wysocki, Kosaki Motohiro and Minchan Kim,

Upon Kosaki Motohiro's kind request via an off-list e-mail,
I tested the following two patches separately with a vanilla
2.6.35.4 kernel:

Patch 1:
	http://lkml.org/lkml/2010/9/5/86

Patch 2:
	http://kerneltrap.org/mailarchive/linux-kernel/2010/9/4/4615426

The first of these was prepared by Minchan Kim, and it fixes
the issue; i.e. no hangs during hibernation with a full tmpfs.

However, the second patch, prepared by Rafael Wysocki, does *not*
fix the problem. I still experience hangs with a full tmpfs upon
hibernation.

As always, I am willing to test newer patches and help in debugging
this issue.

I really appreciate all of your help,

M. Vefa Bicakci


* Re: Important news regarding the two different patches
       [not found]                       ` <4C84D02A.7070508@superonline.com>
@ 2010-09-06 18:43                         ` Rafael J. Wysocki
       [not found]                         ` <201009062043.37819.rjw@sisk.pl>
  1 sibling, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-09-06 18:43 UTC (permalink / raw)
  To: M. Vefa Bicakci; +Cc: linux-pm, Linux Kernel Mailing List, KOSAKI Motohiro

On Monday, September 06, 2010, M. Vefa Bicakci wrote:
> On 05/09/10 10:08 PM, KOSAKI Motohiro wrote:
> >>> ok, thanks. probably I've caught your point. please feel free to use my reviewed-by
> >>> for your fix.
> >>
> >> Thanks.
> >>
> >> In the meantime, though, I prepared a patch that should address the issue
> >> entirely.  The patch is appended and if it looks good to you, I'd rather use it
> >> instead of the previous one (it is still untested).
> > 
> > Yeah, this one looks nicer to me :)
> > 
> > Thanks, rafael!
> 
> Dear Rafael Wysocki, Kosaki Motohiro and Minchan Kim,
> 
> Upon Kosaki Motohiro's kind request via an off-list e-mail,
> I tested the following two patches separately with a vanilla
> 2.6.35.4 kernel:
> 
> Patch 1:
> 	http://lkml.org/lkml/2010/9/5/86
> 
> Patch 2:
> 	http://kerneltrap.org/mailarchive/linux-kernel/2010/9/4/4615426
> 
> The first of these was prepared by Minchan Kim, and it fixes
> the issue; i.e. no hangs during hibernation with a full tmpfs.

We've already agreed that this patch should be applied.

> However, the second patch, prepared by Rafael Wysocki, does *not*
> fix the problem. I still experience hangs with a full tmpfs upon
> hibernation.

So we still manage to trigger OOM with this approach.

> As always, I am willing to test newer patches and help in debugging
> this issue.

What happens if you apply them both at the same time?

Thanks,
Rafael


* Re: [Bisected Regression in 2.6.35] A full tmpfs filesystem causes hibernation to hang
       [not found]                     ` <20100906090528.C8A9.A69D9226@jp.fujitsu.com>
  2010-09-06 11:27                       ` Important news regarding the two different patches M. Vefa Bicakci
       [not found]                       ` <4C84D02A.7070508@superonline.com>
@ 2010-09-06 18:46                       ` Rafael J. Wysocki
       [not found]                       ` <201009062046.43513.rjw@sisk.pl>
  3 siblings, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-09-06 18:46 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-pm, M. Vefa Bicakci, Linux Kernel Mailing List

On Monday, September 06, 2010, KOSAKI Motohiro wrote:
> > > ok, thanks. probably I've caught your point. please feel free to use my reviewed-by
> > > for your fix.
> > 
> > Thanks.
> > 
> > In the meantime, though, I prepared a patch that should address the issue
> > entirely.  The patch is appended and if it looks good to you, I'd rather use it
> > instead of the previous one (it is still untested).
> 
> Yeah, this one looks nicer to me :)
> 
> Thanks, rafael!

OK, I'll put it into my linux-next branch, then.

Probably, though, I should modify the changelog, because what it really does
is to check if it makes sense to try to allocate from non-highmem pages, but it
doesn't really prevent the OOM from occurring.

Thanks,
Rafael


* Re: [Bisected Regression in 2.6.35] A full tmpfs filesystem causes hibernation to hang
       [not found]                       ` <201009062046.43513.rjw@sisk.pl>
@ 2010-09-06 19:54                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-09-06 19:54 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-pm, M. Vefa Bicakci, Linux Kernel Mailing List

On Monday, September 06, 2010, Rafael J. Wysocki wrote:
> On Monday, September 06, 2010, KOSAKI Motohiro wrote:
> > > > ok, thanks. probably I've caught your point. please feel free to use my reviewed-by
> > > > for your fix.
> > > 
> > > Thanks.
> > > 
> > > In the meantime, though, I prepared a patch that should address the issue
> > > entirely.  The patch is appended and if it looks good to you, I'd rather use it
> > > instead of the previous one (it is still untested).
> > 
> > Yeah, this one looks nicer to me :)
> > 
> > Thanks, rafael!
> 
> OK, I'll put it into my linux-next branch, then.
> 
> Probably, though, I should modify the changelog, because what it really does
> is to check if it makes sense to try to allocate from non-highmem pages, but it
> doesn't really prevent the OOM from occurring.

For completeness, below is the patch with the new changelog.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM / Hibernate: Avoid hitting OOM during preallocation of memory

There is a problem in hibernate_preallocate_memory(): it may call
preallocate_image_memory() with an argument greater than the total
number of non-highmem memory pages that haven't already been
preallocated.  If that's the case, the OOM condition is guaranteed
to trigger, which in turn can cause a significant slowdown to occur.

To avoid that, make preallocate_image_memory() adjust its argument
before calling preallocate_image_pages(), so that it doesn't exceed
the number of non-highmem pages that weren't preallocated previously.
Change hibernate_preallocate_memory() to try to allocate from highmem
if the number of pages allocated by preallocate_image_memory() is too
low.  Modify free_unnecessary_pages() to take all possible memory
allocation patterns into account.

Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/snapshot.c |   66 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 46 insertions(+), 20 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1122,9 +1122,19 @@ static unsigned long preallocate_image_p
 	return nr_alloc;
 }
 
-static unsigned long preallocate_image_memory(unsigned long nr_pages)
+static unsigned long preallocate_image_memory(unsigned long nr_pages,
+					      unsigned long size_normal)
 {
-	return preallocate_image_pages(nr_pages, GFP_IMAGE);
+	unsigned long alloc;
+
+	if (size_normal <= alloc_normal)
+		return 0;
+
+	alloc = size_normal - alloc_normal;
+	if (nr_pages < alloc)
+		alloc = nr_pages;
+
+	return preallocate_image_pages(alloc, GFP_IMAGE);
 }
 
 #ifdef CONFIG_HIGHMEM
@@ -1170,15 +1180,22 @@ static inline unsigned long preallocate_
  */
 static void free_unnecessary_pages(void)
 {
-	unsigned long save_highmem, to_free_normal, to_free_highmem;
+	unsigned long save, to_free_normal, to_free_highmem;
 
-	to_free_normal = alloc_normal - count_data_pages();
-	save_highmem = count_highmem_pages();
-	if (alloc_highmem > save_highmem) {
-		to_free_highmem = alloc_highmem - save_highmem;
+	save = count_data_pages();
+	if (alloc_normal >= save) {
+		to_free_normal = alloc_normal - save;
+		save = 0;
+	} else {
+		to_free_normal = 0;
+		save -= alloc_normal;
+	}
+	save += count_highmem_pages();
+	if (alloc_highmem >= save) {
+		to_free_highmem = alloc_highmem - save;
 	} else {
 		to_free_highmem = 0;
-		to_free_normal -= save_highmem - alloc_highmem;
+		to_free_normal -= save - alloc_highmem;
 	}
 
 	memory_bm_position_reset(&copy_bm);
@@ -1259,7 +1276,7 @@ int hibernate_preallocate_memory(void)
 {
 	struct zone *zone;
 	unsigned long saveable, size, max_size, count, highmem, pages = 0;
-	unsigned long alloc, save_highmem, pages_highmem;
+	unsigned long alloc, save_highmem, pages_highmem, size_normal;
 	struct timeval start, stop;
 	int error;
 
@@ -1296,6 +1313,7 @@ int hibernate_preallocate_memory(void)
 		else
 			count += zone_page_state(zone, NR_FREE_PAGES);
 	}
+	size_normal = count;
 	count += highmem;
 	count -= totalreserve_pages;
 
@@ -1310,7 +1328,7 @@ int hibernate_preallocate_memory(void)
 	 */
 	if (size >= saveable) {
 		pages = preallocate_image_highmem(save_highmem);
-		pages += preallocate_image_memory(saveable - pages);
+		pages += preallocate_image_memory(saveable - pages, size_normal);
 		goto out;
 	}
 
@@ -1336,16 +1354,24 @@ int hibernate_preallocate_memory(void)
 	 */
 	pages_highmem = preallocate_image_highmem(highmem / 2);
 	alloc = (count - max_size) - pages_highmem;
-	pages = preallocate_image_memory(alloc);
-	if (pages < alloc)
-		goto err_out;
-	size = max_size - size;
-	alloc = size;
-	size = preallocate_highmem_fraction(size, highmem, count);
-	pages_highmem += size;
-	alloc -= size;
-	pages += preallocate_image_memory(alloc);
-	pages += pages_highmem;
+	pages = preallocate_image_memory(alloc, size_normal);
+	if (pages < alloc) {
+		/* We have exhausted non-highmem pages, try highmem. */
+		alloc -= pages;
+		pages = preallocate_image_highmem(alloc);
+		if (pages < alloc)
+			goto err_out;
+		pages += preallocate_image_highmem(max_size - size);
+	} else {
+		size = max_size - size;
+		alloc = size;
+		size = preallocate_highmem_fraction(size, highmem, count);
+		pages_highmem += size;
+		alloc -= size;
+		size = preallocate_image_memory(alloc, size_normal);
+		pages_highmem += preallocate_image_highmem(alloc - size);
+		pages += pages_highmem + size;
+	}
 
 	/*
 	 * We only need as many page frames for the image as there are saveable


* Re: Important news regarding the two different patches
       [not found]                         ` <201009062043.37819.rjw@sisk.pl>
@ 2010-09-07  1:34                           ` M. Vefa Bicakci
       [not found]                           ` <4C8596B1.6010005@superonline.com>
  1 sibling, 0 replies; 24+ messages in thread
From: M. Vefa Bicakci @ 2010-09-07  1:34 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Linux Kernel Mailing List, KOSAKI Motohiro

On 06/09/10 02:43 PM, Rafael J. Wysocki wrote:
> On Monday, September 06, 2010, M. Vefa Bicakci wrote:
>> [snip]
>>
>> Patch 1:
>> 	http://lkml.org/lkml/2010/9/5/86
>>
>> Patch 2:
>> 	http://kerneltrap.org/mailarchive/linux-kernel/2010/9/4/4615426
>>
>> The first of these was prepared by Minchan Kim, and it fixes
>> the issue; i.e. no hangs during hibernation with a full tmpfs.
> 
> We've already agreed that this patch should be applied.
> 
>> However, the second patch, prepared by Rafael Wysocki, does *not*
>> fix the problem. I still experience hangs with a full tmpfs upon
>> hibernation.
> 
> So we still manage to trigger OOM with this approach.
> 
>> As always, I am willing to test newer patches and help in debugging
>> this issue.
> 
> What happens if you apply them both at the same time?

Hello,

When I apply both of the patches, I don't get any hangs with
hibernation. However, I do get another problem, which I am not sure
is related. I should note that I haven't experienced this with only
the vmscan.c patch, but maybe I haven't repeated my test enough
times.

One test consists of an automated run of 7 hibernate/thaw cycles. 

Here's what I got in dmesg in two of the iterations in one test.
Sorry for the long e-mail and the long lines.

=== 8< ===
[  166.512085] PM: Hibernation mode set to 'reboot'
[  166.516503] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[  166.517654] PM: Basic memory bitmaps created
[  166.518781] PM: Syncing filesystems ... done.
[  166.546308] Freezing user space processes ... (elapsed 0.01 seconds) done.
[  166.559596] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[  166.571649] PM: Preallocating image memory... 
[  185.712457] iwl3945: page allocation failure. order:0, mode:0xd0
[  185.714564] Pid: 1225, comm: iwl3945 Not tainted 2.6.35.4-test-mm5v2-vmscan+snapshot-dirty #7
[  185.715741] Call Trace:
[  185.716853]  [<c019aa67>] ? __alloc_pages_nodemask+0x577/0x630
[  185.718126]  [<f8a562c5>] ? iwl3945_rx_allocate+0x75/0x240 [iwl3945]
[  185.719379]  [<c03f0516>] ? schedule+0x356/0x730
[  185.720556]  [<f8a56d50>] ? iwl3945_rx_replenish+0x20/0x50 [iwl3945]
[  185.721914]  [<f8a56dbc>] ? iwl3945_bg_rx_replenish+0x3c/0x50 [iwl3945]
[  185.723929]  [<c014b167>] ? worker_thread+0x117/0x1f0
[  185.725745]  [<f8a56d80>] ? iwl3945_bg_rx_replenish+0x0/0x50 [iwl3945]
[  185.727097]  [<c014ebd0>] ? autoremove_wake_function+0x0/0x40
[  185.728468]  [<c014b050>] ? worker_thread+0x0/0x1f0
[  185.730235]  [<c014e854>] ? kthread+0x74/0x80
[  185.731601]  [<c014e7e0>] ? kthread+0x0/0x80
[  185.732919]  [<c0103cb6>] ? kernel_thread_helper+0x6/0x10
[  185.734851] Mem-Info:
[  185.736144] DMA per-cpu:
[  185.737439] CPU    0: hi:    0, btch:   1 usd:   0
[  185.738635] CPU    1: hi:    0, btch:   1 usd:   0
[  185.740296] Normal per-cpu:
[  185.741560] CPU    0: hi:  186, btch:  31 usd: 173
[  185.742752] CPU    1: hi:  186, btch:  31 usd:  49
[  185.744092] HighMem per-cpu:
[  185.745897] CPU    0: hi:  186, btch:  31 usd: 159
[  185.747239] CPU    1: hi:  186, btch:  31 usd: 170
[  185.748547] active_anon:136410 inactive_anon:34149 isolated_anon:0
[  185.748549]  active_file:689 inactive_file:344 isolated_file:0
[  185.748550]  unevictable:0 dirty:0 writeback:392 unstable:0
[  185.748551]  free:1928 slab_reclaimable:933 slab_unreclaimable:1680
[  185.748552]  mapped:952 shmem:168120 pagetables:304 bounce:0
[  185.755444] DMA free:3516kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15804kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  185.759708] lowmem_reserve[]: 0 865 3022 3022
[  185.761135] Normal free:3444kB min:3728kB low:4660kB high:5592kB active_anon:0kB inactive_anon:0kB active_file:52kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:885944kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:3732kB slab_unreclaimable:6712kB kernel_stack:1480kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:96 all_unreclaimable? no
[  185.765935] lowmem_reserve[]: 0 0 17259 17259
[  185.767626] HighMem free:628kB min:512kB low:2836kB high:5160kB active_anon:545256kB inactive_anon:136444kB active_file:2704kB inactive_file:1376kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2209236kB mlocked:0kB dirty:0kB writeback:1416kB mapped:3804kB shmem:671948kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1216kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:32 all_unreclaimable? no
[  185.773232] lowmem_reserve[]: 0 0 0 0
[  185.775142] DMA: 1*4kB 1*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3516kB
[  185.777152] Normal: 1*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3444kB
[  185.779420] HighMem: 151*4kB 14*8kB 6*16kB 2*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 876kB
[  185.781487] 170219 total pagecache pages
[  185.783662] 1273 pages in swap cache
[  185.785764] Swap cache stats: add 295103, delete 293899, find 177724/191474
[  185.787782] Free swap  = 8072684kB
[  185.789940] Total swap = 8385376kB
[  185.802225] 783968 pages RAM
[  185.804258] 556658 pages HighMem
[  185.806241] 7853 pages reserved
[  185.808263] 14784 pages shared
[  185.810237] 764178 pages non-shared
[  185.812216] iwl3945 0000:05:00.0: Failed to allocate SKB buffer with GFP_KERNEL. Only 5 free buffers remaining.
[  185.871861] iwl3945: page allocation failure. order:0, mode:0xd0
[  185.873847] Pid: 1225, comm: iwl3945 Not tainted 2.6.35.4-test-mm5v2-vmscan+snapshot-dirty #7
[  185.876345] Call Trace:
[  185.878416]  [<c019aa67>] ? __alloc_pages_nodemask+0x577/0x630
[  185.880493]  [<f8a562c5>] ? iwl3945_rx_allocate+0x75/0x240 [iwl3945]
[  185.882558]  [<f8a56d50>] ? iwl3945_rx_replenish+0x20/0x50 [iwl3945]
[  185.884651]  [<f8a56dbc>] ? iwl3945_bg_rx_replenish+0x3c/0x50 [iwl3945]
[  185.886753]  [<c014b167>] ? worker_thread+0x117/0x1f0
[  185.888831]  [<f8a56d80>] ? iwl3945_bg_rx_replenish+0x0/0x50 [iwl3945]
[  185.890856]  [<c014ebd0>] ? autoremove_wake_function+0x0/0x40
[  185.892984]  [<c014b050>] ? worker_thread+0x0/0x1f0
[  185.895116]  [<c014e854>] ? kthread+0x74/0x80
[  185.897249]  [<c014e7e0>] ? kthread+0x0/0x80
[  185.899367]  [<c0103cb6>] ? kernel_thread_helper+0x6/0x10
[  185.901465] Mem-Info:
[  185.903580] DMA per-cpu:
[  185.905697] CPU    0: hi:    0, btch:   1 usd:   0
[  185.907801] CPU    1: hi:    0, btch:   1 usd:   0
[  185.909915] Normal per-cpu:
[  185.912014] CPU    0: hi:  186, btch:  31 usd: 184
[  185.914144] CPU    1: hi:  186, btch:  31 usd:  71
[  185.916228] HighMem per-cpu:
[  185.918285] CPU    0: hi:  186, btch:  31 usd: 164
[  185.920370] CPU    1: hi:  186, btch:  31 usd: 156
[  185.922390] active_anon:134202 inactive_anon:33609 isolated_anon:0
[  185.922391]  active_file:689 inactive_file:344 isolated_file:0
[  185.922393]  unevictable:0 dirty:0 writeback:409 unstable:0
[  185.922394]  free:1897 slab_reclaimable:933 slab_unreclaimable:1678
[  185.922395]  mapped:952 shmem:165367 pagetables:304 bounce:0
[  185.931250] DMA free:3516kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15804kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  185.936496] lowmem_reserve[]: 0 865 3022 3022
[  185.938184] Normal free:3568kB min:3728kB low:4660kB high:5592kB active_anon:0kB inactive_anon:0kB active_file:52kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:885944kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:3732kB slab_unreclaimable:6772kB kernel_stack:1480kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  185.943403] lowmem_reserve[]: 0 0 17259 17259
[  185.945089] HighMem free:1000kB min:512kB low:2836kB high:5160kB active_anon:536552kB inactive_anon:134300kB active_file:2704kB inactive_file:1376kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2209236kB mlocked:0kB dirty:0kB writeback:1620kB mapped:3804kB shmem:660876kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1216kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  185.950324] lowmem_reserve[]: 0 0 0 0
[  185.952105] DMA: 1*4kB 1*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3516kB
[  185.953979] Normal: 20*4kB 4*8kB 2*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3568kB
[  185.955929] HighMem: 176*4kB 23*8kB 3*16kB 2*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1000kB
[  185.957877] 167673 total pagecache pages
[  185.959787] 1391 pages in swap cache
[  185.961686] Swap cache stats: add 297661, delete 296270, find 177724/191474
[  185.963640] Free swap  = 8062924kB
[  185.965591] Total swap = 8385376kB
[  185.975363] 783968 pages RAM
[  185.977245] 556658 pages HighMem
[  185.979082] 7853 pages reserved
[  185.980889] 14760 pages shared
[  185.982676] 764093 pages non-shared
[  185.984478] iwl3945 0000:05:00.0: Failed to allocate SKB buffer with GFP_KERNEL. Only 4 free buffers remaining.
...
...
[  245.712056] usb usb3: __pm_runtime_suspend() from workqueue!
[  245.712088] usb usb3: __pm_runtime_suspend() returns 0!
[  245.712093] usb usb5: __pm_runtime_suspend() from workqueue!
[  245.712114] usb usb5: __pm_runtime_suspend() returns 0!
[  245.712119] usb usb1: __pm_runtime_suspend() from workqueue!
[  245.712165] usb usb1: __pm_runtime_suspend() returns 0!
[  265.583404] PM: Hibernation mode set to 'reboot'
[  265.584286] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[  265.584291] PM: Basic memory bitmaps created
[  265.584292] PM: Syncing filesystems ... done.
[  265.630328] Freezing user space processes ... (elapsed 0.01 seconds) done.
[  265.644327] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[  265.657301] PM: Preallocating image memory... 
[  277.545121] iwl3945: page allocation failure. order:0, mode:0xd0
[  277.547985] Pid: 1144, comm: iwl3945 Not tainted 2.6.35.4-test-mm5v2-vmscan+snapshot-dirty #7
[  277.547988] Call Trace:
[  277.548001]  [<c019aa67>] ? __alloc_pages_nodemask+0x577/0x630
[  277.548015]  [<f897d2c5>] ? iwl3945_rx_allocate+0x75/0x240 [iwl3945]
[  277.548019]  [<c03f0516>] ? schedule+0x356/0x730
[  277.548024]  [<f897dd50>] ? iwl3945_rx_replenish+0x20/0x50 [iwl3945]
[  277.548029]  [<f897ddbc>] ? iwl3945_bg_rx_replenish+0x3c/0x50 [iwl3945]
[  277.548033]  [<c014b167>] ? worker_thread+0x117/0x1f0
[  277.548038]  [<f897dd80>] ? iwl3945_bg_rx_replenish+0x0/0x50 [iwl3945]
[  277.548042]  [<c014ebd0>] ? autoremove_wake_function+0x0/0x40
[  277.548045]  [<c014b050>] ? worker_thread+0x0/0x1f0
[  277.548048]  [<c014e854>] ? kthread+0x74/0x80
[  277.548050]  [<c014e7e0>] ? kthread+0x0/0x80
[  277.548054]  [<c0103cb6>] ? kernel_thread_helper+0x6/0x10
[  277.548056] Mem-Info:
[  277.548059] DMA per-cpu:
[  277.548061] CPU    0: hi:    0, btch:   1 usd:   0
[  277.548063] CPU    1: hi:    0, btch:   1 usd:   0
[  277.548065] Normal per-cpu:
[  277.548067] CPU    0: hi:  186, btch:  31 usd: 158
[  277.548069] CPU    1: hi:  186, btch:  31 usd:  81
[  277.548070] HighMem per-cpu:
[  277.548072] CPU    0: hi:  186, btch:  31 usd: 194
[  277.548074] CPU    1: hi:  186, btch:  31 usd: 185
[  277.548079] active_anon:132061 inactive_anon:33051 isolated_anon:0
[  277.548080]  active_file:890 inactive_file:320 isolated_file:0
[  277.548081]  unevictable:0 dirty:0 writeback:282 unstable:0
[  277.548082]  free:1884 slab_reclaimable:880 slab_unreclaimable:1633
[  277.548084]  mapped:991 shmem:164563 pagetables:306 bounce:0
[  277.548090] DMA free:3520kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15804kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  277.548094] lowmem_reserve[]: 0 865 3022 3022
[  277.548102] Normal free:3608kB min:3728kB low:4660kB high:5592kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:76kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:885944kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:3520kB slab_unreclaimable:6524kB kernel_stack:1480kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:122 all_unreclaimable? no
[  277.548106] lowmem_reserve[]: 0 0 17259 17259
[  277.548114] HighMem free:408kB min:512kB low:2836kB high:5160kB active_anon:528244kB inactive_anon:132204kB active_file:3560kB inactive_file:1204kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2209236kB mlocked:0kB dirty:0kB writeback:1128kB mapped:3960kB shmem:658252kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1224kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:4384 all_unreclaimable? no
[  277.548118] lowmem_reserve[]: 0 0 0 0
[  277.548121] DMA: 0*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3520kB
[  277.548129] Normal: 0*4kB 1*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3608kB
[  277.548136] HighMem: 100*4kB 1*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 408kB
[  277.548143] 166232 total pagecache pages
[  277.548145] 460 pages in swap cache
[  277.548147] Swap cache stats: add 947081, delete 946621, find 437341/491917
[  277.548148] Free swap  = 2665828kB
[  277.548150] Total swap = 2996084kB
[  277.560847] 783968 pages RAM
[  277.560849] 556658 pages HighMem
[  277.560851] 7853 pages reserved
[  277.560852] 14928 pages shared
[  277.560854] 764721 pages non-shared
[  277.560860] iwl3945 0000:05:00.0: Failed to allocate SKB buffer with GFP_KERNEL. Only 8 free buffers remaining.
[  277.560897] iwl3945: page allocation failure. order:0, mode:0xd0
[  277.560901] Pid: 1144, comm: iwl3945 Not tainted 2.6.35.4-test-mm5v2-vmscan+snapshot-dirty #7
[  277.560903] Call Trace:
[  277.560912]  [<c019aa67>] ? __alloc_pages_nodemask+0x577/0x630
[  277.560925]  [<f897d2c5>] ? iwl3945_rx_allocate+0x75/0x240 [iwl3945]
[  277.560930]  [<c014b815>] ? __queue_work+0x25/0x40
[  277.560933]  [<c014b908>] ? queue_work_on+0x38/0x60
[  277.560937]  [<f897dd50>] ? iwl3945_rx_replenish+0x20/0x50 [iwl3945]
[  277.560942]  [<f897ddbc>] ? iwl3945_bg_rx_replenish+0x3c/0x50 [iwl3945]
[  277.560945]  [<c014b167>] ? worker_thread+0x117/0x1f0
[  277.560949]  [<f897dd80>] ? iwl3945_bg_rx_replenish+0x0/0x50 [iwl3945]
[  277.560953]  [<c014ebd0>] ? autoremove_wake_function+0x0/0x40
[  277.560956]  [<c014b050>] ? worker_thread+0x0/0x1f0
[  277.560959]  [<c014e854>] ? kthread+0x74/0x80
[  277.560962]  [<c014e7e0>] ? kthread+0x0/0x80
[  277.560966]  [<c0103cb6>] ? kernel_thread_helper+0x6/0x10
[  277.560968] Mem-Info:
[  277.560970] DMA per-cpu:
[  277.560972] CPU    0: hi:    0, btch:   1 usd:   0
[  277.560975] CPU    1: hi:    0, btch:   1 usd:   0
[  277.560976] Normal per-cpu:
[  277.560978] CPU    0: hi:  186, btch:  31 usd: 159
[  277.560980] CPU    1: hi:  186, btch:  31 usd:  81
[  277.560982] HighMem per-cpu:
[  277.560984] CPU    0: hi:  186, btch:  31 usd: 194
[  277.560986] CPU    1: hi:  186, btch:  31 usd: 185
[  277.560991] active_anon:132061 inactive_anon:33051 isolated_anon:0
[  277.560992]  active_file:890 inactive_file:320 isolated_file:0
[  277.560994]  unevictable:0 dirty:0 writeback:168 unstable:0
[  277.560995]  free:1884 slab_reclaimable:880 slab_unreclaimable:1633
[  277.560996]  mapped:991 shmem:164563 pagetables:306 bounce:0
[  277.561003] DMA free:3520kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15804kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  277.561007] lowmem_reserve[]: 0 865 3022 3022
[  277.561014] Normal free:3608kB min:3728kB low:4660kB high:5592kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:76kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:885944kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:3520kB slab_unreclaimable:6524kB kernel_stack:1480kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:122 all_unreclaimable? no
[  277.561019] lowmem_reserve[]: 0 0 17259 17259
[  277.561027] HighMem free:408kB min:512kB low:2836kB high:5160kB active_anon:528244kB inactive_anon:132204kB active_file:3560kB inactive_file:1204kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2209236kB mlocked:0kB dirty:0kB writeback:672kB mapped:3960kB shmem:658252kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1224kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:4384 all_unreclaimable? no
[  277.561031] lowmem_reserve[]: 0 0 0 0
[  277.561034] DMA: 0*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3520kB
[  277.561042] Normal: 0*4kB 1*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3608kB
[  277.561049] HighMem: 100*4kB 1*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 408kB
[  277.561057] 166232 total pagecache pages
[  277.561059] 460 pages in swap cache
[  277.561061] Swap cache stats: add 947081, delete 946621, find 437341/491917
[  277.561063] Free swap  = 2665828kB
[  277.561065] Total swap = 2996084kB
[  277.573331] 783968 pages RAM
[  277.573333] 556658 pages HighMem
[  277.573334] 7853 pages reserved
[  277.573335] 14928 pages shared
[  277.573336] 764720 pages non-shared
[  277.573341] iwl3945 0000:05:00.0: Failed to allocate SKB buffer with GFP_KERNEL. Only 8 free buffers remaining.
=== >8 ===


* Re: Important news regarding the two different patches
       [not found]                           ` <4C8596B1.6010005@superonline.com>
@ 2010-09-07  1:58                             ` KOSAKI Motohiro
       [not found]                             ` <20100907104218.C8EF.A69D9226@jp.fujitsu.com>
  1 sibling, 0 replies; 24+ messages in thread
From: KOSAKI Motohiro @ 2010-09-07  1:58 UTC (permalink / raw)
  To: M. Vefa Bicakci; +Cc: linux-pm, Linux Kernel Mailing List, kosaki.motohiro

> Hello,
> 
> When I apply both of the patches, then I don't get any hangs with
> hibernation. However, I do get another problem, which I am not sure
> is related or not. I should note that I haven't experienced this
> with only the vmscan.c patch, but maybe I haven't repeated my test
> enough times.
> 
> One test consists of an automated run of 7 hibernate/thaw cycles. 
> 
> Here's what I got in dmesg in two of the iterations in one test.
> Sorry for the long e-mail and the long lines.
> 
> === 8< ===
> [  166.512085] PM: Hibernation mode set to 'reboot'
> [  166.516503] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
> [  166.517654] PM: Basic memory bitmaps created
> [  166.518781] PM: Syncing filesystems ... done.
> [  166.546308] Freezing user space processes ... (elapsed 0.01 seconds) done.
> [  166.559596] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
> [  166.571649] PM: Preallocating image memory... 
> [  185.712457] iwl3945: page allocation failure. order:0, mode:0xd0
> [  185.714564] Pid: 1225, comm: iwl3945 Not tainted 2.6.35.4-test-mm5v2-vmscan+snapshot-dirty #7
> [  185.715741] Call Trace:
> [  185.716853]  [<c019aa67>] ? __alloc_pages_nodemask+0x577/0x630
> [  185.718126]  [<f8a562c5>] ? iwl3945_rx_allocate+0x75/0x240 [iwl3945]
> [  185.719379]  [<c03f0516>] ? schedule+0x356/0x730
> [  185.720556]  [<f8a56d50>] ? iwl3945_rx_replenish+0x20/0x50 [iwl3945]
> [  185.721914]  [<f8a56dbc>] ? iwl3945_bg_rx_replenish+0x3c/0x50 [iwl3945]
> [  185.723929]  [<c014b167>] ? worker_thread+0x117/0x1f0
> [  185.725745]  [<f8a56d80>] ? iwl3945_bg_rx_replenish+0x0/0x50 [iwl3945]
> [  185.727097]  [<c014ebd0>] ? autoremove_wake_function+0x0/0x40
> [  185.728468]  [<c014b050>] ? worker_thread+0x0/0x1f0
> [  185.730235]  [<c014e854>] ? kthread+0x74/0x80
> [  185.731601]  [<c014e7e0>] ? kthread+0x0/0x80
> [  185.732919]  [<c0103cb6>] ? kernel_thread_helper+0x6/0x10

Hm, interesting.

Rafael's patch seems to work as intended: it preallocates much more memory
than needed and then releases the excess. But on your system, iwl3945
allocates memory concurrently. If it tries to allocate before the hibernation
code releases the extra memory, it may hit an allocation failure.

So, I'm not sure which behavior is desired:
  1) preallocate plenty of memory
	pros) hibernates faster
	cons) risk of network card allocation failures
  2) preallocate only a little memory
	pros) no network card allocation failures
	cons) hibernates slower

But I wonder why this kernel thread is not frozen. AFAIK, hibernation
doesn't need network capability. Is this really intentional?

Rafael, could you please explain the design of hibernation and your
intention?

Vefa, note: this allocation failure doesn't cause any real problem. It only
means the network card couldn't receive one packet, and during hibernation
we can't receive network packets anyway. So, no problem.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Important news regarding the two different patches
       [not found]                             ` <20100907104218.C8EF.A69D9226@jp.fujitsu.com>
@ 2010-09-07 21:44                               ` Rafael J. Wysocki
       [not found]                               ` <201009072344.53905.rjw@sisk.pl>
  1 sibling, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-09-07 21:44 UTC (permalink / raw)
  To: KOSAKI Motohiro, M. Vefa Bicakci; +Cc: linux-pm, Linux Kernel Mailing List

On Tuesday, September 07, 2010, KOSAKI Motohiro wrote:
> > Hello,
> > 
> > When I apply both of the patches, then I don't get any hangs with
> > hibernation. However, I do get another problem, which I am not sure
> > is related or not. I should note that I haven't experienced this
> > with only the vmscan.c patch, but maybe I haven't repeated my test
> > enough times.
> > 
> > One test consists of an automated run of 7 hibernate/thaw cycles. 
> > 
> > Here's what I got in dmesg in two of the iterations in one test.
> > Sorry for the long e-mail and the long lines.
> > 
> > === 8< ===
> > [  166.512085] PM: Hibernation mode set to 'reboot'
> > [  166.516503] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
> > [  166.517654] PM: Basic memory bitmaps created
> > [  166.518781] PM: Syncing filesystems ... done.
> > [  166.546308] Freezing user space processes ... (elapsed 0.01 seconds) done.
> > [  166.559596] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
> > [  166.571649] PM: Preallocating image memory... 
> > [  185.712457] iwl3945: page allocation failure. order:0, mode:0xd0
> > [  185.714564] Pid: 1225, comm: iwl3945 Not tainted 2.6.35.4-test-mm5v2-vmscan+snapshot-dirty #7
> > [  185.715741] Call Trace:
> > [  185.716853]  [<c019aa67>] ? __alloc_pages_nodemask+0x577/0x630
> > [  185.718126]  [<f8a562c5>] ? iwl3945_rx_allocate+0x75/0x240 [iwl3945]
> > [  185.719379]  [<c03f0516>] ? schedule+0x356/0x730
> > [  185.720556]  [<f8a56d50>] ? iwl3945_rx_replenish+0x20/0x50 [iwl3945]
> > [  185.721914]  [<f8a56dbc>] ? iwl3945_bg_rx_replenish+0x3c/0x50 [iwl3945]
> > [  185.723929]  [<c014b167>] ? worker_thread+0x117/0x1f0
> > [  185.725745]  [<f8a56d80>] ? iwl3945_bg_rx_replenish+0x0/0x50 [iwl3945]
> > [  185.727097]  [<c014ebd0>] ? autoremove_wake_function+0x0/0x40
> > [  185.728468]  [<c014b050>] ? worker_thread+0x0/0x1f0
> > [  185.730235]  [<c014e854>] ? kthread+0x74/0x80
> > [  185.731601]  [<c014e7e0>] ? kthread+0x0/0x80
> > [  185.732919]  [<c0103cb6>] ? kernel_thread_helper+0x6/0x10
> 
> Hm, interesting.
> 
> Rafael's patch seems to work as intended: it preallocates much more memory
> than needed and then releases the excess. But on your system, iwl3945
> allocates memory concurrently. If it tries to allocate before the hibernation
> code releases the extra memory, it may hit an allocation failure.
> 
> So, I'm not sure which behavior is desired:
>   1) preallocate plenty of memory
> 	pros) hibernates faster
> 	cons) risk of network card allocation failures
>   2) preallocate only a little memory
> 	pros) no network card allocation failures
> 	cons) hibernates slower
> 
> But I wonder why this kernel thread is not frozen. AFAIK, hibernation
> doesn't need network capability. Is this really intentional?

It's a kernel thread; we don't freeze kernel threads by default, only the
ones that explicitly ask to be frozen.

BTW, please note that the card probably allocates from the normal zone, and
that may be the reason for the failure.

> Rafael, could you please explain the design of hibernation and your
> intention?

The design of the preallocator is pretty straightforward.

First, if there's already enough free memory to make a copy of all memory in
use, we simply allocate as much memory as needed for that copy and return
(the size >= saveable condition).

Next, we preallocate enough memory to accommodate the largest possible
image.  A little more than 50% of RAM is preallocated in this step (this causes
some pages that were in use before to be freed, so the resulting image size is
a little below 50% of RAM).

Next, there is the sysfs file /sys/power/image_size that represents the user's
desired size of the image.  If this number is much less than 50% of RAM,
we do our best to force the mm subsystem to free more pages so that the
resulting image size is possibly close to the desired one.  So, I guess, if
Vefa writes a greater number into /sys/power/image_size (this is in bytes),
the problems should go away. :-)

Still, I see a way to improve things in my patch.  Namely, I guess the number
returned by minimum_image_size() may also be regarded as the number of
non-highmem pages we can't free with good approximation.  Thus the
second argument of preallocate_image_memory() should be
size_normal - "the number returned by minimum_image_size()".

[BTW, there seems to be a bug in minimum_image_size(), because if
saveable < size, this means that the minimum image size is equal to saveable
rather than 0.  This shouldn't happen, though.]

Vefa, can you please test the patch below with and without the
patch at http://lkml.org/lkml/2010/9/5/86 (please don't try to change
/sys/power/image_size yet)?

Thanks,
Rafael


---
 kernel/power/snapshot.c |   75 +++++++++++++++++++++++++++++++++++-------------
 1 file changed, 55 insertions(+), 20 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1122,9 +1122,19 @@ static unsigned long preallocate_image_p
 	return nr_alloc;
 }
 
-static unsigned long preallocate_image_memory(unsigned long nr_pages)
+static unsigned long preallocate_image_memory(unsigned long nr_pages,
+					      unsigned long avail_normal)
 {
-	return preallocate_image_pages(nr_pages, GFP_IMAGE);
+	unsigned long alloc;
+
+	if (avail_normal <= alloc_normal)
+		return 0;
+
+	alloc = avail_normal - alloc_normal;
+	if (nr_pages < alloc)
+		alloc = nr_pages;
+
+	return preallocate_image_pages(alloc, GFP_IMAGE);
 }
 
 #ifdef CONFIG_HIGHMEM
@@ -1170,15 +1180,22 @@ static inline unsigned long preallocate_
  */
 static void free_unnecessary_pages(void)
 {
-	unsigned long save_highmem, to_free_normal, to_free_highmem;
+	unsigned long save, to_free_normal, to_free_highmem;
 
-	to_free_normal = alloc_normal - count_data_pages();
-	save_highmem = count_highmem_pages();
-	if (alloc_highmem > save_highmem) {
-		to_free_highmem = alloc_highmem - save_highmem;
+	save = count_data_pages();
+	if (alloc_normal >= save) {
+		to_free_normal = alloc_normal - save;
+		save = 0;
+	} else {
+		to_free_normal = 0;
+		save -= alloc_normal;
+	}
+	save += count_highmem_pages();
+	if (alloc_highmem >= save) {
+		to_free_highmem = alloc_highmem - save;
 	} else {
 		to_free_highmem = 0;
-		to_free_normal -= save_highmem - alloc_highmem;
+		to_free_normal -= save - alloc_highmem;
 	}
 
 	memory_bm_position_reset(&copy_bm);
@@ -1259,7 +1276,7 @@ int hibernate_preallocate_memory(void)
 {
 	struct zone *zone;
 	unsigned long saveable, size, max_size, count, highmem, pages = 0;
-	unsigned long alloc, save_highmem, pages_highmem;
+	unsigned long alloc, save_highmem, pages_highmem, avail_normal;
 	struct timeval start, stop;
 	int error;
 
@@ -1296,6 +1313,7 @@ int hibernate_preallocate_memory(void)
 		else
 			count += zone_page_state(zone, NR_FREE_PAGES);
 	}
+	avail_normal = count;
 	count += highmem;
 	count -= totalreserve_pages;
 
@@ -1310,12 +1328,21 @@ int hibernate_preallocate_memory(void)
 	 */
 	if (size >= saveable) {
 		pages = preallocate_image_highmem(save_highmem);
-		pages += preallocate_image_memory(saveable - pages);
+		pages += preallocate_image_memory(saveable - pages, avail_normal);
 		goto out;
 	}
 
 	/* Estimate the minimum size of the image. */
 	pages = minimum_image_size(saveable);
+	/*
+	 * To avoid excessive pressure on the normal zone, leave room in it to
+	 * accommodate the image of the minimum size (unless it's already too
+	 * small, in which case don't preallocate pages from it at all).
+	 */
+	if (avail_normal > pages)
+		avail_normal -= pages;
+	else
+		avail_normal = 0;
 	if (size < pages)
 		size = min_t(unsigned long, pages, max_size);
 
@@ -1336,16 +1363,24 @@ int hibernate_preallocate_memory(void)
 	 */
 	pages_highmem = preallocate_image_highmem(highmem / 2);
 	alloc = (count - max_size) - pages_highmem;
-	pages = preallocate_image_memory(alloc);
-	if (pages < alloc)
-		goto err_out;
-	size = max_size - size;
-	alloc = size;
-	size = preallocate_highmem_fraction(size, highmem, count);
-	pages_highmem += size;
-	alloc -= size;
-	pages += preallocate_image_memory(alloc);
-	pages += pages_highmem;
+	pages = preallocate_image_memory(alloc, avail_normal);
+	if (pages < alloc) {
+		/* We have exhausted non-highmem pages, try highmem. */
+		alloc -= pages;
+		pages = preallocate_image_highmem(alloc);
+		if (pages < alloc)
+			goto err_out;
+		pages += preallocate_image_highmem(max_size - size);
+	} else {
+		size = max_size - size;
+		alloc = size;
+		size = preallocate_highmem_fraction(size, highmem, count);
+		pages_highmem += size;
+		alloc -= size;
+		size = preallocate_image_memory(alloc, avail_normal);
+		pages_highmem += preallocate_image_highmem(alloc - size);
+		pages += pages_highmem + size;
+	}
 
 	/*
 	 * We only need as many page frames for the image as there are saveable

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Important news regarding the two different patches
       [not found]                               ` <201009072344.53905.rjw@sisk.pl>
@ 2010-09-08 12:56                                 ` M. Vefa Bicakci
       [not found]                                 ` <4C878818.1070705@superonline.com>
  1 sibling, 0 replies; 24+ messages in thread
From: M. Vefa Bicakci @ 2010-09-08 12:56 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Linux Kernel Mailing List, KOSAKI Motohiro

On 07/09/10 05:44 PM, Rafael J. Wysocki wrote:
> On Tuesday, September 07, 2010, KOSAKI Motohiro wrote:
>>> [snip - M. Vefa Bicakci's last e-mail]
>>
>> Hm, interesting.
>>
>> Rafael's patch seems to work as intended: it preallocates much more memory
>> than needed and then releases the excess. But on your system, iwl3945
>> allocates memory concurrently. If it tries to allocate before the hibernation
>> code releases the extra memory, it may hit an allocation failure.
>>
>> So, I'm not sure which behavior is desired:
>>   1) preallocate plenty of memory
>> 	pros) hibernates faster
>> 	cons) risk of network card allocation failures
>>   2) preallocate only a little memory
>> 	pros) no network card allocation failures
>> 	cons) hibernates slower
>>
>> But I wonder why this kernel thread is not frozen. AFAIK, hibernation
>> doesn't need network capability. Is this really intentional?
> 
> It's a kernel thread; we don't freeze kernel threads by default, only the
> ones that explicitly ask to be frozen.
> 
> BTW, please note that the card probably allocates from the normal zone, and
> that may be the reason for the failure.
> 
>> Rafael, could you please explain the design of hibernation and your
>> intention?
> 
> The design of the preallocator is pretty straightforward.
> 
> First, if there's already enough free memory to make a copy of all memory in
> use, we simply allocate as much memory as needed for that copy and return
> (the size >= saveable condition).
> 
> Next, we preallocate enough memory to accommodate the largest possible
> image.  A little more than 50% of RAM is preallocated in this step (this causes
> some pages that were in use before to be freed, so the resulting image size is
> a little below 50% of RAM).
> 
> Next, there is the sysfs file /sys/power/image_size that represents the user's
> desired size of the image.  If this number is much less than 50% of RAM,
> we do our best to force the mm subsystem to free more pages so that the
> resulting image size is possibly close to the desired one.  So, I guess, if
> Vefa writes a greater number into /sys/power/image_size (this is in bytes),
> the problems should go away. :-)
> 
> Still, I see a way to improve things in my patch.  Namely, I guess the number
> returned by minimum_image_size() may also be regarded as the number of
> non-highmem pages we can't free with good approximation.  Thus the
> second argument of preallocate_image_memory() should be
> size_normal - "the number returned by minimum_image_size()".
> 
> [BTW, there seems to be a bug in minimum_image_size(), because if
> saveable < size, this means that the minimum image size is equal to saveable
> rather than 0.  This shouldn't happen, though.]
> 
> Vefa, can you please test the patch below with and without the
> patch at http://lkml.org/lkml/2010/9/5/86 (please don't try to change
> /sys/power/image_size yet)?
> 
> Thanks,
> Rafael

Dear Rafael Wysocki,

I applied the patch below to a clean 2.6.35.4 tree and tested 6 hibernate/thaw
cycles consecutively. I am happy to report that it works properly.

Then I applied the patch at http://lkml.org/lkml/2010/9/5/86 (the "vmscan.c
patch") on top of the tree I used above, and I also ran 6 hibernate/thaw
cycles. Again, I am happy to report that this combination of patches also
works properly.

I should note a few things though,

1) I don't think I ever changed /sys/power/image_size, so we can rule out the
possibility of that option changing the results.

2) With the patch below, for the *first* hibernation operation, the computer
enters a "thoughtful" state without any disk activity for 6-8 (maybe 10)
seconds after printing "Preallocating image memory". It works properly after
the wait however.

3) For some reason, with the patch below by itself, or in combination with the
above-mentioned vmscan.c patch, I haven't seen any page allocation errors
regarding the iwl3945 driver. To be honest I am not sure why this change
occurred, but I think you might know.

4) I made sure that I was not being impatient with the previous snapshot.c
patch, so I tested that on its own once again, and I confirmed that hibernation
hangs with the older version of the snapshot.c patch.

I am very happy that we are getting closer to a solution. Please let me know
if there is anything I need to test further.

Regards,

M. Vefa Bicakci

> ---
>  kernel/power/snapshot.c |   75 +++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 55 insertions(+), 20 deletions(-)
> 
> Index: linux-2.6/kernel/power/snapshot.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/snapshot.c
> +++ linux-2.6/kernel/power/snapshot.c
> @@ -1122,9 +1122,19 @@ static unsigned long preallocate_image_p
>  	return nr_alloc;
>  }
>  
> -static unsigned long preallocate_image_memory(unsigned long nr_pages)
> +static unsigned long preallocate_image_memory(unsigned long nr_pages,
> +					      unsigned long avail_normal)
>  {
> -	return preallocate_image_pages(nr_pages, GFP_IMAGE);
> +	unsigned long alloc;
> +
> +	if (avail_normal <= alloc_normal)
> +		return 0;
> +
> +	alloc = avail_normal - alloc_normal;
> +	if (nr_pages < alloc)
> +		alloc = nr_pages;
> +
> +	return preallocate_image_pages(alloc, GFP_IMAGE);
>  }
>  
>  #ifdef CONFIG_HIGHMEM
> @@ -1170,15 +1180,22 @@ static inline unsigned long preallocate_
>   */
>  static void free_unnecessary_pages(void)
>  {
> -	unsigned long save_highmem, to_free_normal, to_free_highmem;
> +	unsigned long save, to_free_normal, to_free_highmem;
>  
> -	to_free_normal = alloc_normal - count_data_pages();
> -	save_highmem = count_highmem_pages();
> -	if (alloc_highmem > save_highmem) {
> -		to_free_highmem = alloc_highmem - save_highmem;
> +	save = count_data_pages();
> +	if (alloc_normal >= save) {
> +		to_free_normal = alloc_normal - save;
> +		save = 0;
> +	} else {
> +		to_free_normal = 0;
> +		save -= alloc_normal;
> +	}
> +	save += count_highmem_pages();
> +	if (alloc_highmem >= save) {
> +		to_free_highmem = alloc_highmem - save;
>  	} else {
>  		to_free_highmem = 0;
> -		to_free_normal -= save_highmem - alloc_highmem;
> +		to_free_normal -= save - alloc_highmem;
>  	}
>  
>  	memory_bm_position_reset(&copy_bm);
> @@ -1259,7 +1276,7 @@ int hibernate_preallocate_memory(void)
>  {
>  	struct zone *zone;
>  	unsigned long saveable, size, max_size, count, highmem, pages = 0;
> -	unsigned long alloc, save_highmem, pages_highmem;
> +	unsigned long alloc, save_highmem, pages_highmem, avail_normal;
>  	struct timeval start, stop;
>  	int error;
>  
> @@ -1296,6 +1313,7 @@ int hibernate_preallocate_memory(void)
>  		else
>  			count += zone_page_state(zone, NR_FREE_PAGES);
>  	}
> +	avail_normal = count;
>  	count += highmem;
>  	count -= totalreserve_pages;
>  
> @@ -1310,12 +1328,21 @@ int hibernate_preallocate_memory(void)
>  	 */
>  	if (size >= saveable) {
>  		pages = preallocate_image_highmem(save_highmem);
> -		pages += preallocate_image_memory(saveable - pages);
> +		pages += preallocate_image_memory(saveable - pages, avail_normal);
>  		goto out;
>  	}
>  
>  	/* Estimate the minimum size of the image. */
>  	pages = minimum_image_size(saveable);
> +	/*
> +	 * To avoid excessive pressure on the normal zone, leave room in it to
> +	 * accommodate the image of the minimum size (unless it's already too
> +	 * small, in which case don't preallocate pages from it at all).
> +	 */
> +	if (avail_normal > pages)
> +		avail_normal -= pages;
> +	else
> +		avail_normal = 0;
>  	if (size < pages)
>  		size = min_t(unsigned long, pages, max_size);
>  
> @@ -1336,16 +1363,24 @@ int hibernate_preallocate_memory(void)
>  	 */
>  	pages_highmem = preallocate_image_highmem(highmem / 2);
>  	alloc = (count - max_size) - pages_highmem;
> -	pages = preallocate_image_memory(alloc);
> -	if (pages < alloc)
> -		goto err_out;
> -	size = max_size - size;
> -	alloc = size;
> -	size = preallocate_highmem_fraction(size, highmem, count);
> -	pages_highmem += size;
> -	alloc -= size;
> -	pages += preallocate_image_memory(alloc);
> -	pages += pages_highmem;
> +	pages = preallocate_image_memory(alloc, avail_normal);
> +	if (pages < alloc) {
> +		/* We have exhausted non-highmem pages, try highmem. */
> +		alloc -= pages;
> +		pages = preallocate_image_highmem(alloc);
> +		if (pages < alloc)
> +			goto err_out;
> +		pages += preallocate_image_highmem(max_size - size);
> +	} else {
> +		size = max_size - size;
> +		alloc = size;
> +		size = preallocate_highmem_fraction(size, highmem, count);
> +		pages_highmem += size;
> +		alloc -= size;
> +		size = preallocate_image_memory(alloc, avail_normal);
> +		pages_highmem += preallocate_image_highmem(alloc - size);
> +		pages += pages_highmem + size;
> +	}
>  
>  	/*
>  	 * We only need as many page frames for the image as there are saveable
> 
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] PM / Hibernate: Avoid hitting OOM during preallocation of memory (was: Re: Important news ...)
       [not found]                                 ` <4C878818.1070705@superonline.com>
@ 2010-09-08 21:34                                   ` Rafael J. Wysocki
       [not found]                                   ` <201009082334.01255.rjw@sisk.pl>
  1 sibling, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-09-08 21:34 UTC (permalink / raw)
  To: M. Vefa Bicakci; +Cc: linux-pm, Linux Kernel Mailing List, KOSAKI Motohiro

On Wednesday, September 08, 2010, M. Vefa Bicakci wrote:
> On 07/09/10 05:44 PM, Rafael J. Wysocki wrote:
> > On Tuesday, September 07, 2010, KOSAKI Motohiro wrote:
> >>> [snip - M. Vefa Bicakci's last e-mail]
> >>
> >> Hm, interesting.
> >>
> >> Rafael's patch seems to work as intended: it preallocates much more memory
> >> than needed and then releases the excess. But on your system, iwl3945
> >> allocates memory concurrently. If it tries to allocate before the hibernation
> >> code releases the extra memory, it may hit an allocation failure.
> >>
> >> So, I'm not sure which behavior is desired:
> >>   1) preallocate plenty of memory
> >> 	pros) hibernates faster
> >> 	cons) risk of network card allocation failures
> >>   2) preallocate only a little memory
> >> 	pros) no network card allocation failures
> >> 	cons) hibernates slower
> >>
> >> But I wonder why this kernel thread is not frozen. AFAIK, hibernation
> >> doesn't need network capability. Is this really intentional?
> > 
> > It's a kernel thread; we don't freeze kernel threads by default, only the
> > ones that explicitly ask to be frozen.
> > 
> > BTW, please note that the card probably allocates from the normal zone, and
> > that may be the reason for the failure.
> > 
> >> Rafael, could you please explain the design of hibernation and your
> >> intention?
> > 
> > The design of the preallocator is pretty straightforward.
> > 
> > First, if there's already enough free memory to make a copy of all memory in
> > use, we simply allocate as much memory as needed for that copy and return
> > (the size >= saveable condition).
> > 
> > Next, we preallocate enough memory to accommodate the largest possible
> > image.  A little more than 50% of RAM is preallocated in this step (this causes
> > some pages that were in use before to be freed, so the resulting image size is
> > a little below 50% of RAM).
> > 
> > Next, there is the sysfs file /sys/power/image_size that represents the user's
> > desired size of the image.  If this number is much less than 50% of RAM,
> > we do our best to force the mm subsystem to free more pages so that the
> > resulting image size is possibly close to the desired one.  So, I guess, if
> > Vefa writes a greater number into /sys/power/image_size (this is in bytes),
> > the problems should go away. :-)
> > 
> > Still, I see a way to improve things in my patch.  Namely, I guess the number
> > returned by minimum_image_size() may also be regarded as the number of
> > non-highmem pages we can't free with good approximation.  Thus the
> > second argument of preallocate_image_memory() should be
> > size_normal - "the number returned by minimum_image_size()".
> > 
> > [BTW, there seems to be a bug in minimum_image_size(), because if
> > saveable < size, this means that the minimum image size is equal to saveable
> > rather than 0.  This shouldn't happen, though.]
> > 
> > Vefa, can you please test the patch below with and without the
> > patch at http://lkml.org/lkml/2010/9/5/86 (please don't try to change
> > /sys/power/image_size yet)?
> > 
> > Thanks,
> > Rafael
> 
> Dear Rafael Wysocki,
> 
> I applied the patch below to a clean 2.6.35.4 tree and tested 6 hibernate/thaw
> cycles consecutively. I am happy to report that it works properly.
> 
> Then I applied the patch at http://lkml.org/lkml/2010/9/5/86 (the "vmscan.c
> patch") on top of the tree I used above, and I also ran 6 hibernate/thaw
> cycles. Again, I am happy to report that this combination of patches also
> works properly.

Great, that's encouraging.

> I should note a few things though,
> 
> 1) I don't think I ever changed /sys/power/image_size, so we can rule out the
> possibility of that option changing the results.

Can you please check what value is there in this file?

> 2) With the patch below, for the *first* hibernation operation, the computer
> enters a "thoughtful" state without any disk activity for 6-8 (maybe 10)
> seconds after printing "Preallocating image memory". It works properly after
> the wait however.

That probably is a result of spending time in the memory allocator trying to
reduce the size of the image as much as possible.

> 3) For some reason, with the patch below by itself, or in combination with the
> above-mentioned vmscan.c patch, I haven't seen any page allocation errors
> regarding the iwl3945 driver. To be honest I am not sure why this change
> occurred, but I think you might know.

I think we just keep enough free pages in the normal zone all the time for the
driver to allocate from.

> 4) I made sure that I was not being impatient with the previous snapshot.c
> patch, so I tested that on its own once again, and I confirmed that hibernation
> hangs with the older version of the snapshot.c patch.
> 
> I am very happy that we are getting closer to a solution. Please let me know
> if there is anything I need to test further.

Below is the patch I'd like to apply.  It should work just like the previous
one (there are a few fixes that shouldn't affect the functionality in it), but
please test it if you can.

I think the slowdown you saw in 2) may be eliminated by increasing the
image_size value, so I'm going to prepare a patch that will compute the
value automatically during boot so that it's approximately 50% of RAM.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM / Hibernate: Avoid hitting OOM during preallocation of memory

There is a problem in hibernate_preallocate_memory() that it calls
preallocate_image_memory() with an argument that may be greater than
the total number of available non-highmem memory pages.  If that's
the case, the OOM condition is guaranteed to trigger, which in turn
can cause significant slowdown to occur during hibernation.

To avoid that, make preallocate_image_memory() adjust its argument
before calling preallocate_image_pages(), so that the total number of
saveable non-highmem pages left is not less than the minimum size of
a hibernation image.  Change hibernate_preallocate_memory() to try to
allocate from highmem if the number of pages allocated by
preallocate_image_memory() is too low.

Modify free_unnecessary_pages() to take all possible memory
allocation patterns into account.

Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/snapshot.c |   85 ++++++++++++++++++++++++++++++++++++------------
 1 file changed, 65 insertions(+), 20 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -1122,9 +1122,19 @@ static unsigned long preallocate_image_p
 	return nr_alloc;
 }
 
-static unsigned long preallocate_image_memory(unsigned long nr_pages)
+static unsigned long preallocate_image_memory(unsigned long nr_pages,
+					      unsigned long avail_normal)
 {
-	return preallocate_image_pages(nr_pages, GFP_IMAGE);
+	unsigned long alloc;
+
+	if (avail_normal <= alloc_normal)
+		return 0;
+
+	alloc = avail_normal - alloc_normal;
+	if (nr_pages < alloc)
+		alloc = nr_pages;
+
+	return preallocate_image_pages(alloc, GFP_IMAGE);
 }
 
 #ifdef CONFIG_HIGHMEM
@@ -1170,15 +1180,22 @@ static inline unsigned long preallocate_
  */
 static void free_unnecessary_pages(void)
 {
-	unsigned long save_highmem, to_free_normal, to_free_highmem;
+	unsigned long save, to_free_normal, to_free_highmem;
 
-	to_free_normal = alloc_normal - count_data_pages();
-	save_highmem = count_highmem_pages();
-	if (alloc_highmem > save_highmem) {
-		to_free_highmem = alloc_highmem - save_highmem;
+	save = count_data_pages();
+	if (alloc_normal >= save) {
+		to_free_normal = alloc_normal - save;
+		save = 0;
+	} else {
+		to_free_normal = 0;
+		save -= alloc_normal;
+	}
+	save += count_highmem_pages();
+	if (alloc_highmem >= save) {
+		to_free_highmem = alloc_highmem - save;
 	} else {
 		to_free_highmem = 0;
-		to_free_normal -= save_highmem - alloc_highmem;
+		to_free_normal -= save - alloc_highmem;
 	}
 
 	memory_bm_position_reset(&copy_bm);
@@ -1259,7 +1276,7 @@ int hibernate_preallocate_memory(void)
 {
 	struct zone *zone;
 	unsigned long saveable, size, max_size, count, highmem, pages = 0;
-	unsigned long alloc, save_highmem, pages_highmem;
+	unsigned long alloc, save_highmem, pages_highmem, avail_normal;
 	struct timeval start, stop;
 	int error;
 
@@ -1296,6 +1313,7 @@ int hibernate_preallocate_memory(void)
 		else
 			count += zone_page_state(zone, NR_FREE_PAGES);
 	}
+	avail_normal = count;
 	count += highmem;
 	count -= totalreserve_pages;
 
@@ -1310,12 +1328,21 @@ int hibernate_preallocate_memory(void)
 	 */
 	if (size >= saveable) {
 		pages = preallocate_image_highmem(save_highmem);
-		pages += preallocate_image_memory(saveable - pages);
+		pages += preallocate_image_memory(saveable - pages, avail_normal);
 		goto out;
 	}
 
 	/* Estimate the minimum size of the image. */
 	pages = minimum_image_size(saveable);
+	/*
+	 * To avoid excessive pressure on the normal zone, leave room in it to
+	 * accommodate an image of the minimum size (unless it's already too
+	 * small, in which case don't preallocate pages from it at all).
+	 */
+	if (avail_normal > pages)
+		avail_normal -= pages;
+	else
+		avail_normal = 0;
 	if (size < pages)
 		size = min_t(unsigned long, pages, max_size);
 
@@ -1336,16 +1363,34 @@ int hibernate_preallocate_memory(void)
 	 */
 	pages_highmem = preallocate_image_highmem(highmem / 2);
 	alloc = (count - max_size) - pages_highmem;
-	pages = preallocate_image_memory(alloc);
-	if (pages < alloc)
-		goto err_out;
-	size = max_size - size;
-	alloc = size;
-	size = preallocate_highmem_fraction(size, highmem, count);
-	pages_highmem += size;
-	alloc -= size;
-	pages += preallocate_image_memory(alloc);
-	pages += pages_highmem;
+	pages = preallocate_image_memory(alloc, avail_normal);
+	if (pages < alloc) {
+		/* We have exhausted non-highmem pages, try highmem. */
+		alloc -= pages;
+		pages += pages_highmem;
+		pages_highmem = preallocate_image_highmem(alloc);
+		if (pages_highmem < alloc)
+			goto err_out;
+		pages += pages_highmem;
+		/*
+		 * size is the desired number of saveable pages to leave in
+		 * memory, so try to preallocate (all memory - size) pages.
+		 */
+		alloc = (count - pages) - size;
+		pages += preallocate_image_highmem(alloc);
+	} else {
+		/*
+		 * There are approximately max_size saveable pages at this point
+		 * and we want to reduce this number down to size.
+		 */
+		alloc = max_size - size;
+		size = preallocate_highmem_fraction(alloc, highmem, count);
+		pages_highmem += size;
+		alloc -= size;
+		size = preallocate_image_memory(alloc, avail_normal);
+		pages_highmem += preallocate_image_highmem(alloc - size);
+		pages += pages_highmem + size;
+	}
 
 	/*
 	 * We only need as many page frames for the image as there are saveable

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PATCH: PM / Hibernate: Avoid hitting OOM during preallocation of memory
       [not found]                                   ` <201009082334.01255.rjw@sisk.pl>
@ 2010-09-11 18:12                                     ` M. Vefa Bicakci
       [not found]                                     ` <4C8BC68B.8050002@superonline.com>
  1 sibling, 0 replies; 24+ messages in thread
From: M. Vefa Bicakci @ 2010-09-11 18:12 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Linux Kernel Mailing List, KOSAKI Motohiro

Hello,

Sorry for the late reply. I have been busy the past few days.

On 08/09/10 05:34 PM, Rafael J. Wysocki wrote:
> On Wednesday, September 08, 2010, M. Vefa Bicakci wrote:
>> [snip]
>>
>> I should note a few things though,
>>
>> 1) I don't think I ever changed /sys/power/image_size, so we can rule out the
>> possibility of that option changing the results.
> 
> Can you please check what value is there in this file?

It contains 524288000, so I think it is set to 500 MB. I believe that this is
the default value, but I am not sure.

>> 2) With the patch below, for the *first* hibernation operation, the computer
>> enters a "thoughtful" state without any disk activity for 6-8 (maybe 10)
>> seconds after printing "Preallocating image memory". It works properly after
>> the wait however.
> 
> That probably is a result of spending time in the memory allocator trying to
> reduce the size of the image as much as possible.

I am not sure if this is a new thing with the new patch, but the behavior
seems to continue with the later hibernation operations too, not just the
first one. I cannot confirm whether I simply missed the problem with the
previous version of the patch, but that is quite possible, since my
earlier tests were automated. (I did not automate my tests this time.)

However, considering that the kernel needs to worry about compacting 1500 MB
of data when hibernating with my tmpfs-is-full system, I guess these wait
times are normal, even though a bit inconvenient.

>> [snip]
> 
>> 4) I made sure that I was not being impatient with the previous snapshot.c
>> patch, so I tested that on its own once again, and I confirmed that hibernation
>> hangs with the older version of the snapshot.c patch.
>>
>> I am very happy that we are getting closer to a solution. Please let me know
>> if there is anything I need to test further.
> 
> Below is the patch I'd like to apply.  It should work just like the previous
> one (there are a few fixes that shouldn't affect the functionality in it), but
> please test it if you can.

I am happy to report that it works properly on its own when applied to
a clean 2.6.35.4 tree. I haven't had any problems (aside from the "thoughtful
state" issue I mentioned above) with my 6 consecutive hibernation attempts.

> I think the slowdown you saw in 2) may be eliminated by increasing the
> image_size value, so I'm going to prepare a patch that will compute the
> value automatically during boot so that it's approximately 50% of RAM.

I would be glad to test that patch as well, to see if it brings speed-ups.
Actually, I might test hibernation with a larger value written to
/sys/power/image_size when I have time.

> 
> Thanks,
> Rafael

I really appreciate your help. Thanks a lot!

M. Vefa Bicakci

> [patch snipped]


* Re: PATCH: PM / Hibernate: Avoid hitting OOM during preallocation of memory
       [not found]                                     ` <4C8BC68B.8050002@superonline.com>
@ 2010-09-11 19:06                                       ` Rafael J. Wysocki
       [not found]                                       ` <201009112106.07687.rjw@sisk.pl>
  1 sibling, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-09-11 19:06 UTC (permalink / raw)
  To: M. Vefa Bicakci; +Cc: linux-pm, Linux Kernel Mailing List, KOSAKI Motohiro

On Saturday, September 11, 2010, M. Vefa Bicakci wrote:
> Hello,
> 
> Sorry for the late reply. I have been busy the past few days.
> 
> On 08/09/10 05:34 PM, Rafael J. Wysocki wrote:
> > On Wednesday, September 08, 2010, M. Vefa Bicakci wrote:
> >> [snip]
> >>
> >> I should note a few things though,
> >>
> >> 1) I don't think I ever changed /sys/power/image_size, so we can rule out the
> >> possibility of that option changing the results.
> > 
> > Can you please check what value is there in this file?
> 
> It contains 524288000, so I think it is set to 500 MB. I believe that this is
> the default value, but I am not sure.

Yes, this is the (hard coded) default, which very likely is too small for your system.

> >> 2) With the patch below, for the *first* hibernation operation, the computer
> >> enters a "thoughtful" state without any disk activity for 6-8 (maybe 10)
> >> seconds after printing "Preallocating image memory". It works properly after
> >> the wait however.
> > 
> > That probably is a result of spending time in the memory allocator trying to
> > reduce the size of the image as much as possible.
> 
> I am not sure if this is a new thing with the new patch, but the behavior
> seems to continue with the later hibernation operations too, not just the
> first one. I cannot confirm whether I simply missed the problem with the
> previous version of the patch, but that is quite possible, since my
> earlier tests were automated. (I did not automate my tests this time.)
> 
> However, considering that the kernel needs to worry about compacting 1500 MB
> of data when hibernating with my tmpfs-is-full system, I guess these wait
> times are normal, even though a bit inconvenient.
> 
> >> [snip]
> > 
> >> 4) I made sure that I was not being impatient with the previous snapshot.c
> >> patch, so I tested that on its own once again, and I confirmed that hibernation
> >> hangs with the older version of the snapshot.c patch.
> >>
> >> I am very happy that we are getting closer to a solution. Please let me know
> >> if there is anything I need to test further.
> > 
> > Below is the patch I'd like to apply.  It should work just like the previous
> > one (there are a few fixes that shouldn't affect the functionality in it), but
> > please test it if you can.
> 
> I am happy to report that it works properly on its own when applied to
> a clean 2.6.35.4 tree. I haven't had any problems (aside from the "thoughtful
> state" issue I mentioned above) with my 6 consecutive hibernation attempts.

Great, thanks a lot for your excellent feedback!

> > I think the slowdown you saw in 2) may be eliminated by increasing the
> > image_size value, so I'm going to prepare a patch that will compute the
> > value automatically during boot so that it's approximately 50% of RAM.
> 
> I would be glad to test that patch as well, to see if it brings speed-ups.
> Actually, I might test hibernation with a larger value written to
> /sys/power/image_size when I have time.

I think that would improve things, as it probably is impossible to reduce the
image size to 500 MB on your system.

Anyway, I'll let you know when the patch is ready.

Thanks,
Rafael


* [PATCH] PM / Hibernate: Make default image size depend on total RAM size (was: Re: PATCH: PM / Hibernate: Avoid hitting OOM ...)
       [not found]                                       ` <201009112106.07687.rjw@sisk.pl>
@ 2010-09-11 22:27                                         ` Rafael J. Wysocki
       [not found]                                         ` <201009120027.06470.rjw@sisk.pl>
  1 sibling, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-09-11 22:27 UTC (permalink / raw)
  To: M. Vefa Bicakci; +Cc: linux-pm, Linux Kernel Mailing List, KOSAKI Motohiro

On Saturday, September 11, 2010, Rafael J. Wysocki wrote:
> On Saturday, September 11, 2010, M. Vefa Bicakci wrote:
...
> > 
> > I would be glad to test that patch as well, to see if it brings speed-ups.
> > Actually, I might test hibernation with a larger value written to
> > /sys/power/image_size when I have time.
> 
> I think that would improve things, as it probably is impossible to reduce the
> image size to 500 MB on your system.
> 
> Anyway, I'll let you know when the patch is ready.

OK, please try the patch below on top of the previous one and see if it makes
hibernation run faster on your system.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM / Hibernate: Make default image size depend on total RAM size

The default hibernation image size is currently hard coded and equal
to 500 MB, which is not a reasonable default on many contemporary
systems.  Make it equal to 2/5 of the total RAM size (this is slightly
below the maximum, i.e. 1/2 of the total RAM size, and seems to be
generally suitable).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/power/interface.txt |    2 +-
 kernel/power/main.c               |    1 +
 kernel/power/power.h              |    9 ++++++++-
 kernel/power/snapshot.c           |    7 ++++++-
 4 files changed, 16 insertions(+), 3 deletions(-)

Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -46,7 +46,12 @@ static void swsusp_unset_page_forbidden(
  * size will not exceed N bytes, but if that is impossible, it will
  * try to create the smallest image possible.
  */
-unsigned long image_size = 500 * 1024 * 1024;
+unsigned long image_size;
+
+void __init hibernate_image_size_init(void)
+{
+	image_size = ((totalram_pages * 2) / 5) * PAGE_SIZE;
+}
 
 /* List of PBEs needed for restoring the pages that were allocated before
  * the suspend and included in the suspend image, but have also been
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -14,6 +14,9 @@ struct swsusp_info {
 } __attribute__((aligned(PAGE_SIZE)));
 
 #ifdef CONFIG_HIBERNATION
+/* kernel/power/snapshot.c */
+extern void __init hibernate_image_size_init(void);
+
 #ifdef CONFIG_ARCH_HIBERNATION_HEADER
 /* Maximum size of architecture specific data in a hibernation header */
 #define MAX_ARCH_HEADER_SIZE	(sizeof(struct new_utsname) + 4)
@@ -49,7 +52,11 @@ static inline char *check_image_kernel(s
 extern int hibernation_snapshot(int platform_mode);
 extern int hibernation_restore(int platform_mode);
 extern int hibernation_platform_enter(void);
-#endif
+
+#else /* !CONFIG_HIBERNATION */
+
+static inline void hibernate_image_size_init(void) {}
+#endif /* !CONFIG_HIBERNATION */
 
 extern int pfn_is_nosave(unsigned long);
 
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -321,6 +321,7 @@ static int __init pm_init(void)
 	int error = pm_start_workqueue();
 	if (error)
 		return error;
+	hibernate_image_size_init();
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
Index: linux-2.6/Documentation/power/interface.txt
===================================================================
--- linux-2.6.orig/Documentation/power/interface.txt
+++ linux-2.6/Documentation/power/interface.txt
@@ -57,7 +57,7 @@ smallest image possible.  In particular,
 suspend image will be as small as possible.
 
 Reading from this file will display the current image size limit, which
-is set to 500 MB by default.
+is set to 2/5 of available RAM by default.
 
 /sys/power/pm_trace controls the code which saves the last PM event point in
 the RTC across reboots, so that you can debug a machine that just hangs


* Re: [PATCH] PM / Hibernate: Make default image size depend on total RAM size
       [not found]                                         ` <201009120027.06470.rjw@sisk.pl>
@ 2010-09-13 15:40                                           ` M. Vefa Bicakci
       [not found]                                           ` <4C8E45EB.8030408@superonline.com>
  1 sibling, 0 replies; 24+ messages in thread
From: M. Vefa Bicakci @ 2010-09-13 15:40 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Linux Kernel Mailing List, KOSAKI Motohiro

On 11/09/10 06:27 PM, Rafael J. Wysocki wrote:
> On Saturday, September 11, 2010, Rafael J. Wysocki wrote:
>> On Saturday, September 11, 2010, M. Vefa Bicakci wrote:
> ...
>>>
>>> I would be glad to test that patch as well, to see if it brings speed-ups.
>>> Actually, I might test hibernation with a larger value written to
>>> /sys/power/image_size when I have time.
>>
>> I think that would improve things, as it probably is impossible to reduce the
>> image size to 500 MB on your system.
>>
>> Anyway, I'll let you know when the patch is ready.
> 
> OK, please try the patch below on top of the previous one and see if it makes
> hibernation run faster on your system.

Dear Rafael Wysocki,

I think I have good news. I took a clean 2.6.35.4 tree, and first applied
the latest version of your larger snapshot.c patch, and then the patch you
appended to your final e-mail in this thread.

Here is a comparison of the timings from a kernel without your patch, and
one with it.

=== 8< ===
Sep 11 10:22:24 debian kernel: [  499.968989] PM: Allocated 2531300 kbytes in 52.66 seconds (48.06 MB/s)
Sep 11 10:44:08 debian kernel: [  764.379131] PM: Allocated 2531308 kbytes in 143.41 seconds (17.65 MB/s)
Sep 11 10:48:41 debian kernel: [  920.626386] PM: Allocated 2531300 kbytes in 66.44 seconds (38.09 MB/s)
Sep 11 10:53:37 debian kernel: [ 1092.919140] PM: Allocated 2531316 kbytes in 81.28 seconds (31.14 MB/s)
...
Sep 13 01:26:09 debian kernel: [   94.948054] PM: Allocated 1804008 kbytes in 28.72 seconds (62.81 MB/s)
Sep 13 01:29:58 debian kernel: [  176.678880] PM: Allocated 1803992 kbytes in 34.44 seconds (52.38 MB/s)
Sep 13 01:33:48 debian kernel: [  253.336405] PM: Allocated 1804000 kbytes in 27.35 seconds (65.95 MB/s)
=== >8 ===

I didn't have your latest patch applied on September 11, and it was applied
last night.

It looks like there is a good improvement. I think the data rates look
faster on Sept. 13 because the kernel spent less time "thinking" while
compacting the memory image. (I don't think I have changed anything
in my configuration that could affect the data rates that much.)

Is it possible to have these patches applied to the 2.6.35 tree so that
the regression I reported is fixed? Should I e-mail Greg Kroah-Hartman
about this?

Once again, thanks a lot to you, Kosaki Motohiro and Minchan Kim!

M. Vefa Bicakci

> [snip patch]


* Re: [PATCH] PM / Hibernate: Make default image size depend on total RAM size
       [not found]                                           ` <4C8E45EB.8030408@superonline.com>
@ 2010-09-13 17:52                                             ` Rafael J. Wysocki
  0 siblings, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-09-13 17:52 UTC (permalink / raw)
  To: M. Vefa Bicakci
  Cc: Greg Kroah-Hartman, Linux Kernel Mailing List, KOSAKI Motohiro,
	linux-pm, stable

On Monday, September 13, 2010, M. Vefa Bicakci wrote:
> On 11/09/10 06:27 PM, Rafael J. Wysocki wrote:
> > On Saturday, September 11, 2010, Rafael J. Wysocki wrote:
> >> On Saturday, September 11, 2010, M. Vefa Bicakci wrote:
> > ...
> >>>
> >>> I would be glad to test that patch as well, to see if it brings speed-ups.
> >>> Actually, I might test hibernation with a larger value written to
> >>> /sys/power/image_size when I have time.
> >>
> >> I think that would improve things, as it probably is impossible to reduce the
> >> image size to 500 MB on your system.
> >>
> >> Anyway, I'll let you know when the patch is ready.
> > 
> > OK, please try the patch below on top of the previous one and see if it makes
> > hibernation run faster on your system.
> 
> Dear Rafael Wysocki,
> 
> I think I have good news. I took a clean 2.6.35.4 tree, and first applied
> the latest version of your larger snapshot.c patch, and then the patch you
> appended to your final e-mail in this thread.
> 
> Here is a comparison of the timings from a kernel without your patch, and
> one with it.
> 
> === 8< ===
> Sep 11 10:22:24 debian kernel: [  499.968989] PM: Allocated 2531300 kbytes in 52.66 seconds (48.06 MB/s)
> Sep 11 10:44:08 debian kernel: [  764.379131] PM: Allocated 2531308 kbytes in 143.41 seconds (17.65 MB/s)
> Sep 11 10:48:41 debian kernel: [  920.626386] PM: Allocated 2531300 kbytes in 66.44 seconds (38.09 MB/s)
> Sep 11 10:53:37 debian kernel: [ 1092.919140] PM: Allocated 2531316 kbytes in 81.28 seconds (31.14 MB/s)
> ...
> Sep 13 01:26:09 debian kernel: [   94.948054] PM: Allocated 1804008 kbytes in 28.72 seconds (62.81 MB/s)
> Sep 13 01:29:58 debian kernel: [  176.678880] PM: Allocated 1803992 kbytes in 34.44 seconds (52.38 MB/s)
> Sep 13 01:33:48 debian kernel: [  253.336405] PM: Allocated 1804000 kbytes in 27.35 seconds (65.95 MB/s)
> === >8 ===
> 
> I didn't have your latest patch applied on September 11, and it was applied
> last night.
> 
> It looks like there is a good improvement. I think the data rates look
> faster on Sept. 13 because the kernel spent less time "thinking" while
> compacting the memory image. (I don't think I have changed anything
> in my configuration that could affect the data rates that much.)

OK, thanks for testing and information.

> Is it possible to have these patches applied to the 2.6.35 tree so that
> the regression I reported is fixed?

The "snapshot.c" patch has just been included into the Linus' tree as

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6715045ddc7472a22be5e49d4047d2d89b391f45

> Should I e-mail Greg Kroah-Hartman about this?

and I've already told Greg that it should go into 2.6.35.y.

The second patch, however, only changes the default value of image_size, so it
is not -stable material.

As a workaround, you can change the init scripts on your system to set
/sys/power/image_size to the same value that's in it when the second patch is
applied.
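A hypothetical init-script fragment for that workaround might look like the sketch below. It approximates total RAM via MemTotal from /proc/meminfo and applies the same 2/5 factor as the second patch's new default; the computed value is only printed here, and the commented-out line shows how an init script would actually write it (as root).

```shell
# Compute 2/5 of total RAM in bytes, like the patched
# hibernate_image_size_init() does, and show the result.
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
image_size=$(( total_kb * 1024 / 5 * 2 ))
echo "image_size to use: $image_size bytes"
# To apply it (as root), an init script would then do:
#   echo "$image_size" > /sys/power/image_size
```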

Thanks,
Rafael


