[RFC][PATCH] mm: Update zone->un_reclaimable in direct reclaim path

* [RFC][PATCH] mm: Update zone->un_reclaimable in direct reclaim path
@ 2012-05-11  8:35 Aaditya Kumar
  2012-05-11 21:26 ` KOSAKI Motohiro
  0 siblings, 1 reply; 5+ messages in thread
From: Aaditya Kumar @ 2012-05-11  8:35 UTC
  To: linux-kernel, kosaki.motohiro
  Cc: frank.rowand, tim.bird, takuzo.ohara, kan.iibuchi,
	kosaki.motohiro

Dear All,

Commit 929bea7c714220fc76ce3f75bef9056477c28e74 seems to have broken OOM
invocation during memory hot-unplug in low-memory conditions. That commit
changed the unreclaimability check, which the direct reclaim path uses to
decide when to stop trying to reclaim pages and invoke the OOM killer, so
that it is based on zone->all_unreclaimable.

While offlining memory, if the range being offlined spans almost the whole
node, the memory needed to migrate pages out of that range cannot be
allocated from the same node, because all of its pages are now in the
ISOLATED state (even though they are also free).

Since most pages are free once their contents have been migrated, all zone
watermarks are OK and kswapd does not balance the zones. The
zone->all_unreclaimable flag, which is currently set only by kswapd, is
therefore NOT set.

If the page allocator is passed zones from the node being offlined, the
OOM killer is never invoked in low-memory conditions: the buddy allocator
will not allocate ISOLATED pages, and the direct reclaim path will not give
up, because zone->all_unreclaimable is not set for the zone(s) in that
node. The result is a system hang.

The issue is reproducible when offlining memory in low-memory conditions
on ARM Cortex-A9 systems, but it should be architecture independent.

This patch fixes the bug by also updating zone->all_unreclaimable in the
direct reclaim path.
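
To make the reasoning above concrete, here is a minimal userspace sketch of
the loop. It is only a toy model under stated assumptions: struct toy_zone,
the toy_* functions and the retry cap are all hypothetical names invented
for illustration, and "reclaimable == 0" is a stand-in for the kernel's
real reclaimability heuristic, not a copy of it.

```c
#include <stdbool.h>

/* Toy model of a zone; names are illustrative, not the kernel's struct zone. */
struct toy_zone {
	unsigned long free_pages;      /* pages counted as free for watermark checks */
	unsigned long isolated_pages;  /* subset of free_pages that is ISOLATED */
	unsigned long watermark;       /* kswapd leaves the zone alone at or above this */
	unsigned long reclaimable;     /* pages that reclaim could still free */
	bool all_unreclaimable;
};

enum toy_result { TOY_ALLOCATED, TOY_OOM, TOY_STUCK };

/* kswapd only scans, and so only flags, zones that are below their watermark. */
void toy_kswapd(struct toy_zone *z)
{
	if (z->free_pages >= z->watermark)
		return;                /* zone looks balanced: nothing to do */
	if (z->reclaimable == 0)
		z->all_unreclaimable = true;
}

/* The buddy allocator cannot hand out ISOLATED pages. */
bool toy_alloc(const struct toy_zone *z)
{
	return z->free_pages > z->isolated_pages;
}

/*
 * Simplified direct reclaim loop. With flag_in_direct_reclaim set, the loop
 * also marks the zone itself, which is the idea behind this patch.
 */
enum toy_result toy_direct_reclaim(struct toy_zone *z, int max_tries,
				   bool flag_in_direct_reclaim)
{
	for (int i = 0; i < max_tries; i++) {
		if (toy_alloc(z))
			return TOY_ALLOCATED;
		toy_kswapd(z);
		/* Stand-in for the check added to shrink_zones(). */
		if (flag_in_direct_reclaim && z->reclaimable == 0)
			z->all_unreclaimable = true;
		if (z->all_unreclaimable)
			return TOY_OOM;    /* give up; the OOM killer would run */
	}
	return TOY_STUCK;              /* the real loop has no cap and keeps spinning */
}
```

With a zone whose free pages are all ISOLATED and whose watermark is met,
the model reports TOY_STUCK when only toy_kswapd() may set the flag, and
TOY_OOM when the direct reclaim path is allowed to set it as well.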
---
  mm/vmscan.c |   11 	8 +	3 -	0 !
 1 file changed, 8 insertions(+), 3 deletions(-)

Index: b/mm/vmscan.c
===================================================================
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1997,7 +1997,7 @@ restart:
  * scan then give up on it.
  */
 static void shrink_zones(int priority, struct zonelist *zonelist,
-					struct scan_control *sc)
+					struct scan_control *sc, int prev_nr_slab)
 {
 	struct zoneref *z;
 	struct zone *zone;
@@ -2033,6 +2033,8 @@ static void shrink_zones(int priority, s
 		}

 		shrink_zone(priority, zone, sc);
+		if (prev_nr_slab == 0 && !zone_reclaimable(zone))
+			zone->all_unreclaimable = 1;
 	}
 }

@@ -2091,6 +2093,7 @@ static unsigned long do_try_to_free_page
 	struct zoneref *z;
 	struct zone *zone;
 	unsigned long writeback_threshold;
+	int prev_nr_slab = 1;

 	get_mems_allowed();
 	delayacct_freepages_start();
@@ -2102,7 +2105,9 @@ static unsigned long do_try_to_free_page
 		sc->nr_scanned = 0;
 		if (!priority)
 			disable_swap_token(sc->mem_cgroup);
-		shrink_zones(priority, zonelist, sc);
+		shrink_zones(priority, zonelist, sc, prev_nr_slab);
+		prev_nr_slab = 1;
+
 		/*
 		 * Don't shrink slabs when reclaiming memory from
 		 * over limit cgroups
@@ -2117,7 +2122,7 @@ static unsigned long do_try_to_free_page
 				lru_pages += zone_reclaimable_pages(zone);
 			}

-			shrink_slab(shrink, sc->nr_scanned, lru_pages);
+			prev_nr_slab = shrink_slab(shrink, sc->nr_scanned, lru_pages);
 			if (reclaim_state) {
 				sc->nr_reclaimed += reclaim_state->reclaimed_slab;
 				reclaim_state->reclaimed_slab = 0;

Regards,
Aaditya Kumar.

* Re: [RFC][PATCH] mm: Update zone->un_reclaimable in direct reclaim path
  2012-05-11  8:35 [RFC][PATCH] mm: Update zone->un_reclaimable in direct reclaim path Aaditya Kumar
@ 2012-05-11 21:26 ` KOSAKI Motohiro
  2012-05-14 17:03   ` Aaditya Kumar
  0 siblings, 1 reply; 5+ messages in thread
From: KOSAKI Motohiro @ 2012-05-11 21:26 UTC
  To: Aaditya Kumar
  Cc: linux-kernel, kosaki.motohiro, frank.rowand, tim.bird,
	takuzo.ohara, kan.iibuchi, kosaki.motohiro

(5/11/12 4:35 AM), Aaditya Kumar wrote:
> Dear All,
>
> Commit 929bea7c714220fc76ce3f75bef9056477c28e74 seems to have broken OOM
> invocation during memory hot-unplug in low-memory conditions. That commit
> changed the unreclaimability check, which the direct reclaim path uses to
> decide when to stop trying to reclaim pages and invoke the OOM killer, so
> that it is based on zone->all_unreclaimable.

I don't have time today and haven't read your patch yet, but my patch did
pass my hotplug test. Can you please tell us your test case and reproducer?

I'll look into it more closely later.

* Re: [RFC][PATCH] mm: Update zone->un_reclaimable in direct reclaim path
  2012-05-11 21:26 ` KOSAKI Motohiro
@ 2012-05-14 17:03   ` Aaditya Kumar
  2012-06-12  1:21     ` KOSAKI Motohiro
  0 siblings, 1 reply; 5+ messages in thread
From: Aaditya Kumar @ 2012-05-14 17:03 UTC
  To: KOSAKI Motohiro
  Cc: linux-kernel, frank.rowand, tim.bird, takuzo.ohara, kan.iibuchi,
	kosaki.motohiro

Dear Kosaki-san,

Konnichiwa,

> I don't have time today and haven't read your patch yet, but my patch did
> pass my hotplug test. Can you please tell us your test case and reproducer?

My test case is as follows:
 1. Boot the kernel with 4 memory sections of 16MB each, each belonging
    to a different NUMA node. We use 48MB for non-movable memory
    ('kernelcore='), and we do NOT use swap devices.

 2. Run a program that hogs memory (more precisely, it does malloc() +
    memset() on 70% of the total memory) and then sleeps.

 3. Try to offline all the memory sections one by one.
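
For reference, the memory hog in step 2 could look roughly like the sketch
below. This is not the exact program used in the test; the function names
and the 1MB chunk size are hypothetical, and sysconf(_SC_PHYS_PAGES) is
assumed to be available (it is on glibc/Linux).

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK_BYTES (1024UL * 1024UL)

/* Keeps the last chunk reachable so the compiler cannot drop the allocations. */
static void *volatile hog_keep;

/* Size of the hog: percent of total_bytes, dividing first to avoid overflow. */
unsigned long long hog_target_bytes(unsigned long long total_bytes,
				    unsigned int percent)
{
	return total_bytes / 100 * percent;
}

/* Total RAM as reported by sysconf(); returns 0 if it cannot be determined. */
unsigned long long total_ram_bytes(void)
{
	long pages = sysconf(_SC_PHYS_PAGES);
	long page_size = sysconf(_SC_PAGESIZE);

	if (pages < 0 || page_size < 0)
		return 0;
	return (unsigned long long)pages * (unsigned long long)page_size;
}

/*
 * Allocate and touch 'bytes' bytes in chunks. The memory is intentionally
 * never freed. Returns the number of bytes actually allocated and written.
 */
unsigned long long hog_memory(unsigned long long bytes)
{
	unsigned long long done = 0;

	while (done < bytes) {
		size_t n = (bytes - done < CHUNK_BYTES) ?
			   (size_t)(bytes - done) : CHUNK_BYTES;
		char *p = malloc(n);

		if (!p)
			break;
		memset(p, 0xA5, n);    /* write every page so it is really backed */
		hog_keep = p;
		done += n;
	}
	return done;
}
```

A main() for the reproducer would call
hog_memory(hog_target_bytes(total_ram_bytes(), 70)) and then pause().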

On my board the kernel hangs in step 3. From my debugging, I found that
this is due to an infinite loop in the direct reclaim path: in the above
test case zone->all_unreclaimable is never set, even though almost all
pages are in the ISOLATED state.
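
Step 3, where the hang shows up, amounts to writing "offline" to the sysfs
state file of each memory block. A minimal sketch of that step follows. The
helper names are hypothetical, and it assumes the standard memory hotplug
sysfs layout (/sys/devices/system/memory/memoryN/state) and that one
memoryN block corresponds to one section on the board in question.

```c
#include <stdio.h>

/* Build the sysfs state-file path for memory block 'block'. 0 on success. */
int memory_state_path(char *buf, size_t len, unsigned int block)
{
	int n = snprintf(buf, len,
			 "/sys/devices/system/memory/memory%u/state", block);

	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Ask the kernel to offline one memory block. 0 on success, -1 on error. */
int offline_memory_block(unsigned int block)
{
	char path[128];
	FILE *f;
	int rc;

	if (memory_state_path(path, sizeof(path), block) != 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	rc = (fputs("offline", f) >= 0) ? 0 : -1;
	/* stdio buffers the write, so a kernel error may only show up here. */
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}
```

Calling offline_memory_block() for each block in turn, while the memory hog
is sleeping, corresponds to "offline all the memory sections one by one".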

I am using a board that is not supported in the mainline kernel, but since
the code path involved is architecture independent, I thought the above
test case should reproduce on other architectures too, and that the patch
may be helpful.

Just FYI:
We are using a modified kernel in which different NUMA nodes are used for
different types of memory allocation (based on speed, readable/writable,
etc.) on ARM. These modifications are independent of the code path touched
by the patch, and we believe they are unrelated to the problem.

Regards,
Aaditya Kumar.
Sony India Software Centre.

* Re: [RFC][PATCH] mm: Update zone->un_reclaimable in direct reclaim path
  2012-05-14 17:03   ` Aaditya Kumar
@ 2012-06-12  1:21     ` KOSAKI Motohiro
  2012-06-19 13:20       ` Aaditya Kumar
  0 siblings, 1 reply; 5+ messages in thread
From: KOSAKI Motohiro @ 2012-06-12  1:21 UTC
  To: Aaditya Kumar
  Cc: linux-kernel, frank.rowand, tim.bird, takuzo.ohara, kan.iibuchi

On Mon, May 14, 2012 at 1:03 PM, Aaditya Kumar
<aaditya.kumar.30@gmail.com> wrote:
> Dear Kosaki-san,
>
> Konnichiwa,
>
>> I don't have time today and haven't read your patch yet, but my patch did
>> pass my hotplug test. Can you please tell us your test case and reproducer?

Sorry for the long delay. Unfortunately, your patch is racy, so I can't
take it, but I made an alternative fix. Can you please try the following
patch?


http://permalink.gmane.org/gmane.linux.kernel.mm/79972

* Re: [RFC][PATCH] mm: Update zone->un_reclaimable in direct reclaim path
  2012-06-12  1:21     ` KOSAKI Motohiro
@ 2012-06-19 13:20       ` Aaditya Kumar
  0 siblings, 0 replies; 5+ messages in thread
From: Aaditya Kumar @ 2012-06-19 13:20 UTC
  To: KOSAKI Motohiro
  Cc: linux-kernel, frank.rowand, tim.bird, takuzo.ohara, kan.iibuchi

On Tue, Jun 12, 2012 at 6:51 AM, KOSAKI Motohiro
<kosaki.motohiro@gmail.com> wrote:
> On Mon, May 14, 2012 at 1:03 PM, Aaditya Kumar
> <aaditya.kumar.30@gmail.com> wrote:
>> Dear Kosaki-san,
>>
>> Konnichiwa,
>>
>>> I don't have time today and haven't read your patch yet, but my patch did
>>> pass my hotplug test. Can you please tell us your test case and reproducer?
>
> Sorry for the long delay. Unfortunately, your patch is racy, so I can't
> take it, but I made an alternative fix. Can you please try the following
> patch?

Hi Kosaki-san,

Thank you for the patch. It seems to fix the issue.

>
>
> http://permalink.gmane.org/gmane.linux.kernel.mm/79972


Regards,
Aaditya Kumar
Sony India Software Centre,
Bangalore.
