From: Fengguang Wu <fengguang.wu@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Thelen <gthelen@google.com>, Jan Kara <jack@suse.cz>,
	Ying Han <yinghan@google.com>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>,
	Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH] mm: don't treat anonymous pages as dirtyable pages
Date: Fri, 2 Mar 2012 15:18:47 +0800
Message-ID: <20120302071847.GA15654@localhost>
In-Reply-To: <20120302065947.GA9583@localhost>

The test results:

With the heavy memory usage shown below and one file copy from a sparse
file to a USB key under way,

root@snb /home/wfg/memcg-dirty/snb# free
             total       used       free     shared    buffers     cached
Mem:          6801       6750         50          0          0        893
-/+ buffers/cache:       5857        944
Swap:        51106         34      51072

There is not a single reclaim wait:

/debug/vm/nr_reclaim_throttle_clean:0
/debug/vm/nr_reclaim_throttle_kswapd:0
/debug/vm/nr_reclaim_throttle_recent_write:0
/debug/vm/nr_reclaim_throttle_write:0
/debug/vm/nr_reclaim_wait_congested:0
/debug/vm/nr_reclaim_wait_writeback:0
/debug/vm/nr_migrate_wait_writeback:0

and only occasional increases of

        /debug/vm/nr_congestion_wait (from kswapd)
        nr_vmscan_write
        allocstall

And the most visible result is that window switching remains fast:

 time         window title
-----------------------------------------------------------------------------
 3024.91    A LibreOffice 3.4
 3024.97    A Restore Session - Iceweasel
 3024.98    A System Settings
 3025.13    A urxvt
 3025.14    A xeyes
 3025.15    A snb:/home/wfg - ZSH
 3025.16    A snb:/home/wfg - ZSH
 3025.17    A Xpdf: /usr/share/doc/shared-mime-info/shared-mime-info-spec.pdf
 3025.18    A OpenOffice.org
 3025.23    A OpenOffice.org
 3025.25    A OpenOffice.org
 3025.26    A OpenOffice.org
 3025.27    A OpenOffice.org
 3025.28    A Chess
 3025.29    A Dictionary
 3025.31    A System Monitor
 3025.35    A snb:/home/wfg - ZSH
 3025.41    A Desktop Help
 3025.43    A Mines
 3025.49    A Tetravex
 3025.54    A Iagno
 3025.55    A Four-in-a-row
 3025.60    A Mahjongg - Easy
 3025.64    A Klotski
 3025.66    A Five or More
 3025.68    A Tali
 3025.69    A Robots
 3025.71    A Klondike
 3025.79    A Home
 3025.82    A Home
 3025.86    A *Unsaved Document 1 - gedit
 3025.87    A Sudoku
 3025.93    A LibreOffice 3.4
 3025.98    A Restore Session - Iceweasel
 3025.99    A System Settings
 3026.13    A urxvt

Thanks,
Fengguang

> Assume a mem=1GB desktop (swap enabled) with 800MB anonymous pages and
> 200MB file pages.  When the user starts a heavy dirtier task, the file
> LRU lists may be mostly filled with dirty pages since the global dirty
> limit is calculated as
> 
> 	(anon+file) * 20% = 1GB * 20% = 200MB
> 
> This makes the file LRU lists hard to reclaim, which in turn increases
> the scan rate of the anon LRU lists and leads to a lot of swapping. This
> is probably one big reason why some desktop users see bad responsiveness
> during heavy file copies once swap is enabled.
> 
> The heavy swapping could mostly be avoided by calculating the global
> dirty limit as
> 
> 	file * 20% = 200MB * 20% = 40MB
> 
> The side effect would be that users see longer file copy times because
> the copy task is throttled earlier than before. However, typical users
> should be much more sensitive to interactive performance than to the
> copy task, which may well be left in the background.
> 
> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
> ---
>  include/linux/vmstat.h |    1 -
>  mm/page-writeback.c    |   10 ++++++----
>  mm/vmscan.c            |   14 --------------
>  3 files changed, 6 insertions(+), 19 deletions(-)
> 
> --- linux.orig/include/linux/vmstat.h	2012-03-02 13:55:28.569749568 +0800
> +++ linux/include/linux/vmstat.h	2012-03-02 13:56:06.585750471 +0800
> @@ -139,7 +139,6 @@ static inline unsigned long zone_page_st
>  	return x;
>  }
>  
> -extern unsigned long global_reclaimable_pages(void);
>  extern unsigned long zone_reclaimable_pages(struct zone *zone);
>  
>  #ifdef CONFIG_NUMA
> --- linux.orig/mm/page-writeback.c	2012-03-02 13:55:28.549749567 +0800
> +++ linux/mm/page-writeback.c	2012-03-02 13:56:26.257750938 +0800
> @@ -181,8 +181,7 @@ static unsigned long highmem_dirtyable_m
>  		struct zone *z =
>  			&NODE_DATA(node)->node_zones[ZONE_HIGHMEM];
>  
> -		x += zone_page_state(z, NR_FREE_PAGES) +
> -		     zone_reclaimable_pages(z) - z->dirty_balance_reserve;
> +		x += zone_dirtyable_memory(z);
>  	}
>  	/*
>  	 * Make sure that the number of highmem pages is never larger
> @@ -206,7 +205,9 @@ unsigned long global_dirtyable_memory(vo
>  {
>  	unsigned long x;
>  
> -	x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
> +	x = global_page_state(NR_FREE_PAGES) +
> +	    global_page_state(NR_ACTIVE_FILE) +
> +	    global_page_state(NR_INACTIVE_FILE) -
>  	    dirty_balance_reserve;
>  
>  	if (!vm_highmem_is_dirtyable)
> @@ -275,7 +276,8 @@ unsigned long zone_dirtyable_memory(stru
>  	 * care about vm_highmem_is_dirtyable here.
>  	 */
>  	return zone_page_state(zone, NR_FREE_PAGES) +
> -	       zone_reclaimable_pages(zone) -
> +	       zone_page_state(zone, NR_ACTIVE_FILE) +
> +	       zone_page_state(zone, NR_INACTIVE_FILE) -
>  	       zone->dirty_balance_reserve;
>  }
>  
> --- linux.orig/mm/vmscan.c	2012-03-02 13:55:28.561749567 +0800
> +++ linux/mm/vmscan.c	2012-03-02 13:56:06.585750471 +0800
> @@ -3315,20 +3315,6 @@ void wakeup_kswapd(struct zone *zone, in
>   * - mapped pages, which may require several travels to be reclaimed
>   * - dirty pages, which is not "instantly" reclaimable
>   */
> -unsigned long global_reclaimable_pages(void)
> -{
> -	int nr;
> -
> -	nr = global_page_state(NR_ACTIVE_FILE) +
> -	     global_page_state(NR_INACTIVE_FILE);
> -
> -	if (nr_swap_pages > 0)
> -		nr += global_page_state(NR_ACTIVE_ANON) +
> -		      global_page_state(NR_INACTIVE_ANON);
> -
> -	return nr;
> -}
> -
>  unsigned long zone_reclaimable_pages(struct zone *zone)
>  {
>  	int nr;
