linux-mm.kvack.org archive mirror
* [PATCH v3] memory-hotplug: fix zone stat mismatch
@ 2012-09-20  6:43 Minchan Kim
  2012-09-20  7:21 ` Yasuaki Ishimatsu
  2012-09-20 21:42 ` Andrew Morton
  0 siblings, 2 replies; 5+ messages in thread
From: Minchan Kim @ 2012-09-20  6:43 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Minchan Kim, Kamezawa Hiroyuki,
	Yasuaki Ishimatsu, KOSAKI Motohiro

During memory-hotplug, I found that NR_ISOLATED_[ANON|FILE]
keep increasing, which eventually hangs the kernel.

The cause is that when we do memory-hotadd after memory-remove,
__zone_pcp_update() clears a zone's ZONE_STAT_ITEMS in setup_pageset()
although the vm_stat_diff of all CPUs still have values.

In addition, when we offline all pages of the zone, we reset them
in zone_pcp_reset() without draining, so we lose some zone stat items.

This patch fixes it.

* from v2
  * Add Reviewed-by - Wen

* from v1
  * drain offline patch - KOSAKI, Wen

Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 include/linux/vmstat.h |    4 ++++
 mm/page_alloc.c        |    7 +++++++
 mm/vmstat.c            |   12 ++++++++++++
 3 files changed, 23 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index ad2cfd5..5d31876 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -198,6 +198,8 @@ extern void __dec_zone_state(struct zone *, enum zone_stat_item);
 void refresh_cpu_vm_stats(int);
 void refresh_zone_stat_thresholds(void);
 
+void drain_zonestat(struct zone *zone, struct per_cpu_pageset *);
+
 int calculate_pressure_threshold(struct zone *zone);
 int calculate_normal_threshold(struct zone *zone);
 void set_pgdat_percpu_threshold(pg_data_t *pgdat,
@@ -251,6 +253,8 @@ static inline void __dec_zone_page_state(struct page *page,
 static inline void refresh_cpu_vm_stats(int cpu) { }
 static inline void refresh_zone_stat_thresholds(void) { }
 
+static inline void drain_zonestat(struct zone *zone,
+			struct per_cpu_pageset *pset) { }
 #endif		/* CONFIG_SMP */
 
 extern const char * const vmstat_text[];
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ab58346..980f2e7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5904,6 +5904,7 @@ static int __meminit __zone_pcp_update(void *data)
 		local_irq_save(flags);
 		if (pcp->count > 0)
 			free_pcppages_bulk(zone, pcp->count, pcp);
+		drain_zonestat(zone, pset);
 		setup_pageset(pset, batch);
 		local_irq_restore(flags);
 	}
@@ -5920,10 +5921,16 @@ void __meminit zone_pcp_update(struct zone *zone)
 void zone_pcp_reset(struct zone *zone)
 {
 	unsigned long flags;
+	int cpu;
+	struct per_cpu_pageset *pset;
 
 	/* avoid races with drain_pages()  */
 	local_irq_save(flags);
 	if (zone->pageset != &boot_pageset) {
+		for_each_online_cpu(cpu) {
+			pset = per_cpu_ptr(zone->pageset, cpu);
+			drain_zonestat(zone, pset);
+		}
 		free_percpu(zone->pageset);
 		zone->pageset = &boot_pageset;
 	}
diff --git a/mm/vmstat.c b/mm/vmstat.c
index b3e3b9d..d4cc1c2 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -495,6 +495,18 @@ void refresh_cpu_vm_stats(int cpu)
 			atomic_long_add(global_diff[i], &vm_stat[i]);
 }
 
+void drain_zonestat(struct zone *zone, struct per_cpu_pageset *pset)
+{
+	int i;
+
+	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
+		if (pset->vm_stat_diff[i]) {
+			int v = pset->vm_stat_diff[i];
+			pset->vm_stat_diff[i] = 0;
+			atomic_long_add(v, &zone->vm_stat[i]);
+			atomic_long_add(v, &vm_stat[i]);
+		}
+}
 #endif
 
 #ifdef CONFIG_NUMA
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH v3] memory-hotplug: fix zone stat mismatch
  2012-09-20  6:43 [PATCH v3] memory-hotplug: fix zone stat mismatch Minchan Kim
@ 2012-09-20  7:21 ` Yasuaki Ishimatsu
  2012-09-20  7:37   ` Minchan Kim
  2012-09-20 21:42 ` Andrew Morton
  1 sibling, 1 reply; 5+ messages in thread
From: Yasuaki Ishimatsu @ 2012-09-20  7:21 UTC
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, linux-kernel, Kamezawa Hiroyuki,
	KOSAKI Motohiro

Hi Minchan,

Sorry for the late reply.

2012/09/20 15:43, Minchan Kim wrote:
> During memory-hotplug, I found that NR_ISOLATED_[ANON|FILE]
> keep increasing, which eventually hangs the kernel.

Why does your system hang when NR_ISOLATED_[ANON|FILE] increases?
I cannot understand what happened on your system.

Thanks,
Yasuaki Ishimatsu

> 
> The cause is that when we do memory-hotadd after memory-remove,
> __zone_pcp_update() clears a zone's ZONE_STAT_ITEMS in setup_pageset()
> although the vm_stat_diff of all CPUs still have values.
> 
> In addition, when we offline all pages of the zone, we reset them
> in zone_pcp_reset() without draining, so we lose some zone stat items.
> 
> This patch fixes it.
> 
> * from v2
>    * Add Reviewed-by - Wen
> 
> * from v1
>    * drain offline patch - KOSAKI, Wen
> 
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>   include/linux/vmstat.h |    4 ++++
>   mm/page_alloc.c        |    7 +++++++
>   mm/vmstat.c            |   12 ++++++++++++
>   3 files changed, 23 insertions(+)
> 
> diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
> index ad2cfd5..5d31876 100644
> --- a/include/linux/vmstat.h
> +++ b/include/linux/vmstat.h
> @@ -198,6 +198,8 @@ extern void __dec_zone_state(struct zone *, enum zone_stat_item);
>   void refresh_cpu_vm_stats(int);
>   void refresh_zone_stat_thresholds(void);
>   
> +void drain_zonestat(struct zone *zone, struct per_cpu_pageset *);
> +
>   int calculate_pressure_threshold(struct zone *zone);
>   int calculate_normal_threshold(struct zone *zone);
>   void set_pgdat_percpu_threshold(pg_data_t *pgdat,
> @@ -251,6 +253,8 @@ static inline void __dec_zone_page_state(struct page *page,
>   static inline void refresh_cpu_vm_stats(int cpu) { }
>   static inline void refresh_zone_stat_thresholds(void) { }
>   
> +static inline void drain_zonestat(struct zone *zone,
> +			struct per_cpu_pageset *pset) { }
>   #endif		/* CONFIG_SMP */
>   
>   extern const char * const vmstat_text[];
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index ab58346..980f2e7 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5904,6 +5904,7 @@ static int __meminit __zone_pcp_update(void *data)
>   		local_irq_save(flags);
>   		if (pcp->count > 0)
>   			free_pcppages_bulk(zone, pcp->count, pcp);
> +		drain_zonestat(zone, pset);
>   		setup_pageset(pset, batch);
>   		local_irq_restore(flags);
>   	}
> @@ -5920,10 +5921,16 @@ void __meminit zone_pcp_update(struct zone *zone)
>   void zone_pcp_reset(struct zone *zone)
>   {
>   	unsigned long flags;
> +	int cpu;
> +	struct per_cpu_pageset *pset;
>   
>   	/* avoid races with drain_pages()  */
>   	local_irq_save(flags);
>   	if (zone->pageset != &boot_pageset) {
> +		for_each_online_cpu(cpu) {
> +			pset = per_cpu_ptr(zone->pageset, cpu);
> +			drain_zonestat(zone, pset);
> +		}
>   		free_percpu(zone->pageset);
>   		zone->pageset = &boot_pageset;
>   	}
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index b3e3b9d..d4cc1c2 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -495,6 +495,18 @@ void refresh_cpu_vm_stats(int cpu)
>   			atomic_long_add(global_diff[i], &vm_stat[i]);
>   }
>   
> +void drain_zonestat(struct zone *zone, struct per_cpu_pageset *pset)
> +{
> +	int i;
> +
> +	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
> +		if (pset->vm_stat_diff[i]) {
> +			int v = pset->vm_stat_diff[i];
> +			pset->vm_stat_diff[i] = 0;
> +			atomic_long_add(v, &zone->vm_stat[i]);
> +			atomic_long_add(v, &vm_stat[i]);
> +		}
> +}
>   #endif
>   
>   #ifdef CONFIG_NUMA
> 



* Re: [PATCH v3] memory-hotplug: fix zone stat mismatch
  2012-09-20  7:21 ` Yasuaki Ishimatsu
@ 2012-09-20  7:37   ` Minchan Kim
  0 siblings, 0 replies; 5+ messages in thread
From: Minchan Kim @ 2012-09-20  7:37 UTC
  To: Yasuaki Ishimatsu
  Cc: Andrew Morton, linux-mm, linux-kernel, Kamezawa Hiroyuki,
	KOSAKI Motohiro

Hi Yasuaki,

On Thu, Sep 20, 2012 at 04:21:13PM +0900, Yasuaki Ishimatsu wrote:
> Hi Minchan,
> 
> Sorry for the late reply.
> 
> 2012/09/20 15:43, Minchan Kim wrote:
> > During memory-hotplug, I found that NR_ISOLATED_[ANON|FILE]
> > keep increasing, which eventually hangs the kernel.
> 
> Why does your system hang when NR_ISOLATED_[ANON|FILE] increases?
> I cannot understand what happened on your system.

If the system doesn't have enough free pages, it enters the reclaim path,
never reclaims any pages because too_many_isolated() stays true, and
loops forever.
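
The loop described above can be sketched in userspace C. This is a
simplified model, not the kernel's code: struct lru_model and both
function names are hypothetical, and the model assumes the check boils
down to comparing the isolated count against the inactive count, leaving
out every other condition the real too_many_isolated() may apply.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two zone counters the check compares. */
struct lru_model {
	long nr_inactive;	/* pages on the inactive LRU list */
	long nr_isolated;	/* pages currently counted as isolated */
};

/* Simplified check: throttle reclaim while isolated exceeds inactive. */
static bool too_many_isolated_model(const struct lru_model *z)
{
	return z->nr_isolated > z->nr_inactive;
}

/*
 * Model of the reclaim-side wait. Returns how many waits were needed
 * before reclaim could proceed, or -1 if it was still throttled after
 * max_waits. The kernel has no such limit, which is why a stale
 * counter turns into a hang.
 */
static int wait_for_isolated_model(const struct lru_model *z, int max_waits)
{
	assert(max_waits >= 0);
	for (int waits = 0; waits <= max_waits; waits++) {
		if (!too_many_isolated_model(z))
			return waits;
		/* The kernel would sleep here; nothing updates the counters. */
	}
	return -1;
}
```

With a sane counter the wait returns immediately. With an isolated count
that is stuck high and that nobody will ever decrement, the model gives
up only because of its artificial max_waits limit.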

--
Kind regards,
Minchan Kim


* Re: [PATCH v3] memory-hotplug: fix zone stat mismatch
  2012-09-20  6:43 [PATCH v3] memory-hotplug: fix zone stat mismatch Minchan Kim
  2012-09-20  7:21 ` Yasuaki Ishimatsu
@ 2012-09-20 21:42 ` Andrew Morton
  2012-09-20 23:36   ` Minchan Kim
  1 sibling, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2012-09-20 21:42 UTC
  To: Minchan Kim
  Cc: linux-mm, linux-kernel, Kamezawa Hiroyuki, Yasuaki Ishimatsu,
	KOSAKI Motohiro

On Thu, 20 Sep 2012 15:43:25 +0900
Minchan Kim <minchan@kernel.org> wrote:

> During memory-hotplug, I found that NR_ISOLATED_[ANON|FILE]
> keep increasing, which eventually hangs the kernel.
> 
> The cause is that when we do memory-hotadd after memory-remove,
> __zone_pcp_update() clears a zone's ZONE_STAT_ITEMS in setup_pageset()
> although the vm_stat_diff of all CPUs still have values.
> 
> In addition, when we offline all pages of the zone, we reset them
> in zone_pcp_reset() without draining, so we lose some zone stat items.
> 

Here's what I ended up with for a changelog:

: During memory-hotplug, I found NR_ISOLATED_[ANON|FILE] are increasing,
: causing the kernel to hang.  When the system doesn't have enough free
: pages, it enters reclaim but never reclaims any pages due to
: too_many_isolated()==true and loops forever.
: 
: The cause is that when we do memory-hotadd after memory-remove,
: __zone_pcp_update() clears a zone's ZONE_STAT_ITEMS in setup_pageset()
: although the vm_stat_diff of all CPUs still have values.
: 
: In addition, when we offline all pages of the zone, we reset them in
: zone_pcp_reset() without draining, so we lose some zone stat items.


As memory hotplug seems fairly immature and broken, I'm thinking
there's no point in backporting this into -stable.  And I don't *think*
we really need it in 3.6 either?  (It doesn't apply cleanly to current
mainline anyway - I didn't check why).



* Re: [PATCH v3] memory-hotplug: fix zone stat mismatch
  2012-09-20 21:42 ` Andrew Morton
@ 2012-09-20 23:36   ` Minchan Kim
  0 siblings, 0 replies; 5+ messages in thread
From: Minchan Kim @ 2012-09-20 23:36 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Kamezawa Hiroyuki, Yasuaki Ishimatsu,
	KOSAKI Motohiro

On Thu, Sep 20, 2012 at 02:42:32PM -0700, Andrew Morton wrote:
> On Thu, 20 Sep 2012 15:43:25 +0900
> Minchan Kim <minchan@kernel.org> wrote:
> 
> > During memory-hotplug, I found that NR_ISOLATED_[ANON|FILE]
> > keep increasing, which eventually hangs the kernel.
> > 
> > The cause is that when we do memory-hotadd after memory-remove,
> > __zone_pcp_update() clears a zone's ZONE_STAT_ITEMS in setup_pageset()
> > although the vm_stat_diff of all CPUs still have values.
> > 
> > In addition, when we offline all pages of the zone, we reset them
> > in zone_pcp_reset() without draining, so we lose some zone stat items.
> > 
> 
> Here's what I ended up with for a changelog:
> 
> : During memory-hotplug, I found NR_ISOLATED_[ANON|FILE] are increasing,
> : causing the kernel to hang.  When the system doesn't have enough free
> : pages, it enters reclaim but never reclaims any pages due to
> : too_many_isolated()==true and loops forever.
> : 
> : The cause is that when we do memory-hotadd after memory-remove,
> : __zone_pcp_update() clears a zone's ZONE_STAT_ITEMS in setup_pageset()
> : although the vm_stat_diff of all CPUs still have values.
> : 
> : In addition, when we offline all pages of the zone, we reset them in
> : zone_pcp_reset() without draining, so we lose some zone stat items.
> 

Thanks for clarifying the description, Andrew!

> 
> As memory hotplug seems fairly immature and broken, I'm thinking
> there's no point in backporting this into -stable.  And I don't *think*

I don't know how memory-hotplug is used in real practice.
If users do a lot of memory hot-add/remove without rebooting,
the zone stats could become wrong, and it could get worse
when many CPUs are in use.

Zone stat items other than NR_ISOLATED are not critical;
NR_ISOLATED can hang the system once the VM starts to reclaim heavily.
Anyway, if the Fujitsu guys don't yell, I'm okay. :)
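
The point about many CPUs can be made concrete with a rough upper bound.
The sketch below assumes each CPU's pending delta for one stat item never
exceeds a per-CPU threshold; the function name and its parameters are
hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>

/*
 * Upper bound on how far one stat item can drift when per-CPU deltas
 * are discarded: each CPU holds at most `threshold` pending counts,
 * and every undrained reset can lose all of them again.
 */
static long worst_case_drift(int ncpus, int threshold, int resets)
{
	assert(ncpus >= 0 && threshold >= 0 && resets >= 0);
	return (long)ncpus * threshold * resets;
}
```

The bound grows linearly with both the CPU count and the number of
hot-add/remove cycles, so a large machine that is hotplugged repeatedly
has the most room for a stale count.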

> we really need it in 3.6 either?  (It doesn't apply cleanly to current
> mainline anyway - I didn't check why).

At least it applies on my side:

barrios@bbox:~/linux-2.6$ git log -n 1
commit c46de2263f42fb4bbde411b9126f471e9343cb22
Merge: 077fee0 2453f5f
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Sep 19 11:04:34 2012 -0700

    Merge branch 'for-linus' of git://git.kernel.dk/linux-block
    
    Pull block fixes from Jens Axboe:
     "A small collection of driver fixes/updates and a core fix for 3.6.  It
      contains:
    
       - Bug fixes for mtip32xx, and support for new hardware (just addition
         of IDs).  They have been queued up for 3.7 for a few weeks as well.
    
       - rate-limit a failing command error message in block core.
    
       - A fix for an old cciss bug from Stephen.
    
       - Prevent overflow of partition count from Alan."
    
    * 'for-linus' of git://git.kernel.dk/linux-block:
      cciss: fix handling of protocol error
      blk: add an upper sanity check on partition adding
      mtip32xx: fix user_buffer check in exec_drive_command
      mtip32xx: Remove dead code
      mtip32xx: Change printk to pr_xxxx
      mtip32xx: Proper reporting of write protect status on big-endian
      mtip32xx: Increase timeout for standby command
      mtip32xx: Handle NCQ commands during the security locked state
      mtip32xx: Add support for new devices
      block: rate-limit the error message from failing commands

barrios@bbox:~/linux-2.6$ patch -p1 < ../linux-mmotm/0001-memory-hotplug-fix-zone-stat-mismatch.patch --dry-run
patching file include/linux/vmstat.h
patching file mm/page_alloc.c
Hunk #1 succeeded at 5874 (offset -30 lines).
Hunk #2 succeeded at 5891 (offset -30 lines).
patching file mm/vmstat.c


-- 
Kind regards,
Minchan Kim

