Re: [RFC][PATCH] the proposal of improve page reclaim by throttle

From: "minchan Kim" <barrioskmc@gmail.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Rik van Riel <riel@redhat.com>,
	Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH] the proposal of improve page reclaim by throttle
Date: Wed, 20 Feb 2008 17:56:02 +0900
Message-ID: <44c63dc40802200056va847417v1cfc847341bb8cc0@mail.gmail.com>
In-Reply-To: <20080219134715.7E90.KOSAKI.MOTOHIRO@jp.fujitsu.com>

Hi, KOSAKI.

I am very interested in your patch, so I want to test it with exactly
the same method as you did.
I will test it in an embedded environment (ARM 920T, 32MB RAM) and on my
desktop machine (Core2Duo 2.2GHz, 2GB RAM).

I guess this patch won't be effective in an embedded environment,
since many embedded boards have just one processor and no
swap device.

What I want to know is whether this patch causes a regression on UP
systems with no swap device, as is typical for embedded.
I don't think I can get some of these fields with only top or freemem,
because top or freemem won't work well if the system is under
heavy overhead from page reclaim and swapping.
So, how do I evaluate the following fields as you did?

 * elapse (what do you mean by it?)
 * major fault
 * max parallel reclaim tasks
 * max consumption time of try_to_free_pages()

If you have a patch for testing, please send it to me.
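For the two /usr/bin/time fields, a minimal sketch of collecting them by
hand, assuming "elapse" means wall-clock elapsed time and "major fault"
maps to ru_majflt from getrusage(2). The names run_and_measure() and
struct run_stats are hypothetical and not part of the patch. The two
kernel-side fields (max parallel reclaim tasks, max time in
try_to_free_pages()) would need instrumentation inside mm/vmscan.c and
are not covered here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical result type for illustration only. */
struct run_stats {
	double elapsed_sec;  /* wall-clock time between fork and exit */
	double user_sec;     /* user CPU time of the child */
	double sys_sec;      /* system CPU time of the child */
	long major_faults;   /* page faults that required I/O (ru_majflt) */
	int exit_status;     /* exit code, or -1 if killed by a signal */
};

static double tv_to_sec(struct timeval tv)
{
	return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Run argv[0] with arguments, wait for it, and fill *out.
 * Returns 0 on success, -1 if fork or waitpid failed. */
int run_and_measure(char *const argv[], struct run_stats *out)
{
	struct timespec t0, t1;
	struct rusage before, after;
	int status;
	pid_t pid;

	assert(argv && argv[0] && out);

	/* RUSAGE_CHILDREN is cumulative, so take a baseline first. */
	getrusage(RUSAGE_CHILDREN, &before);
	clock_gettime(CLOCK_MONOTONIC, &t0);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);  /* exec failed */
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;

	clock_gettime(CLOCK_MONOTONIC, &t1);
	getrusage(RUSAGE_CHILDREN, &after);

	out->elapsed_sec = (double)(t1.tv_sec - t0.tv_sec) +
			   (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	out->user_sec = tv_to_sec(after.ru_utime) - tv_to_sec(before.ru_utime);
	out->sys_sec = tv_to_sec(after.ru_stime) - tv_to_sec(before.ru_stime);
	out->major_faults = after.ru_majflt - before.ru_majflt;
	out->exit_status = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
	return 0;
}
```

Calling it with argv = {"hackbench", "120", "process", "1000", NULL}
would measure the same command line as the quoted test, assuming
hackbench is on PATH.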

On Feb 19, 2008 2:44 PM, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> background
> ========================================
> current VM implementation doesn't have a limit on the # of parallel reclaimers.
> under heavy workload, this leads to 2 bad things
>  - heavy lock contention
>  - unnecessary swap out
>
> about 2 months ago, KAMEZAWA Hiroyuki proposed a page reclaim
> throttle patch and explained that it improves reclaim time.
>        http://marc.info/?l=linux-mm&m=119667465917215&w=2
>
> but unfortunately it works only for memcgroup reclaim.
> Today, I implemented it again to support global reclaim and measured it.
>
>
> test machine, method and result
> ==================================================
> <test machine>
>        CPU:  IA64 x8
>        MEM:  8GB
>        SWAP: 2GB
>
> <test method>
>        got hackbench from
>                http://people.redhat.com/mingo/cfs-scheduler/tools/hackbench.c
>
>        $ /usr/bin/time hackbench 120 process 1000
>
>        these parameters consume all physical memory and
>        1GB of swap space in my test environment.
>
> <test result (average of 3 measurements)>
>
> before:
>        hackbench result:               282.30
>        /usr/bin/time result
>                user:                   14.16
>                sys:                    1248.47
>                elapse:                 432.93
>                major fault:            29026
>        max parallel reclaim tasks:     1298
>        max consumption time of
>         try_to_free_pages():           70394
>
> after:
>        hackbench result:               30.36
>        /usr/bin/time result
>                user:                   14.26
>                sys:                    294.44
>                elapse:                 118.01
>                major fault:            3064
>        max parallel reclaim tasks:     4
>        max consumption time of
>         try_to_free_pages():           12234
>
>
> conclusion
> =========================================
> this patch improves 4 things.
> 1. reduce unnecessary swap
>   (see major fault above. reduced by about 90%)
> 2. improve throughput performance
>   (see hackbench result above. reduced by about 90%)
> 3. improve interactive performance.
>   (see max consumption time of try_to_free_pages() above.
>    reduced by about 80%)
> 4. reduce lock contention.
>   (see sys time above. reduced by about 80%)
>
>
> Now, we got about a 9x performance improvement of hackbench
> (282.30 -> 30.36) :)
>
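The percentages in the quoted conclusion can be rechecked from the raw
before/after numbers. A minimal sketch of that arithmetic follows; the
helper names percent_reduced(), speedup() and print_reductions() are
made up for illustration, and the inputs are copied from the quoted
result tables.

```c
#include <assert.h>
#include <stdio.h>

/* Relative reduction from `before` to `after`, in percent. */
double percent_reduced(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}

/* How many times faster the "after" run is than the "before" run. */
double speedup(double before, double after)
{
	assert(after > 0.0);
	return before / after;
}

/* Print the reductions for the four metrics named in the conclusion. */
void print_reductions(void)
{
	printf("major fault:         %.1f%%\n", percent_reduced(29026, 3064));
	printf("hackbench result:    %.1f%%\n", percent_reduced(282.30, 30.36));
	printf("try_to_free_pages(): %.1f%%\n", percent_reduced(70394, 12234));
	printf("sys time:            %.1f%%\n", percent_reduced(1248.47, 294.44));
	printf("hackbench speedup:   %.1fx\n", speedup(282.30, 30.36));
}
```

With these inputs the reductions come out at roughly 89%, 89%, 83% and
76%, and the hackbench speedup at roughly 9.3x.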
>
>
> future work
> ==========================================================
>  - more discussion with memory controller guys.
>
>
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> CC: Balbir Singh <balbir@linux.vnet.ibm.com>
> CC: Rik van Riel <riel@redhat.com>
> CC: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
>
> ---
>  include/linux/nodemask.h |    1
>  mm/vmscan.c              |   49 +++++++++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 48 insertions(+), 2 deletions(-)
>
> Index: b/include/linux/nodemask.h
> ===================================================================
> --- a/include/linux/nodemask.h  2008-02-19 13:58:05.000000000 +0900
> +++ b/include/linux/nodemask.h  2008-02-19 13:58:23.000000000 +0900
> @@ -431,6 +431,7 @@ static inline int num_node_state(enum no
>
>  #define num_online_nodes()     num_node_state(N_ONLINE)
>  #define num_possible_nodes()   num_node_state(N_POSSIBLE)
> +#define num_highmem_nodes()    num_node_state(N_HIGH_MEMORY)
>  #define node_online(node)      node_state((node), N_ONLINE)
>  #define node_possible(node)    node_state((node), N_POSSIBLE)
>
> Index: b/mm/vmscan.c
> ===================================================================
> --- a/mm/vmscan.c       2008-02-19 13:58:05.000000000 +0900
> +++ b/mm/vmscan.c       2008-02-19 14:04:06.000000000 +0900
> @@ -127,6 +127,11 @@ long vm_total_pages;       /* The total number
>  static LIST_HEAD(shrinker_list);
>  static DECLARE_RWSEM(shrinker_rwsem);
>
> +static atomic_t nr_reclaimers = ATOMIC_INIT(0);
> +static DECLARE_WAIT_QUEUE_HEAD(reclaim_throttle_waitq);
> +#define RECLAIM_LIMIT (2 * num_highmem_nodes())
> +
> +
>  #ifdef CONFIG_CGROUP_MEM_CONT
>  #define scan_global_lru(sc)    (!(sc)->mem_cgroup)
>  #else
> @@ -1421,6 +1426,46 @@ out:
>        return ret;
>  }
>
> +static unsigned long try_to_free_pages_throttled(struct zone **zones,
> +                                                int order,
> +                                                gfp_t gfp_mask,
> +                                                struct scan_control *sc)
> +{
> +       unsigned long nr_reclaimed = 0;
> +       unsigned long start_time;
> +       int i;
> +
> +       start_time = jiffies;
> +
> +       wait_event(reclaim_throttle_waitq,
> +                  atomic_add_unless(&nr_reclaimers, 1, RECLAIM_LIMIT));
> +
> +       /* waited over 1 sec: is more reclaim still needed? */
> +       if (unlikely(time_after(jiffies, start_time + HZ))) {
> +               for (i = 0; zones[i] != NULL; i++) {
> +                       struct zone *zone = zones[i];
> +                       int classzone_idx = zone_idx(zones[0]);
> +
> +                       if (!populated_zone(zone))
> +                               continue;
> +
> +                       if (zone_watermark_ok(zone, order, 4*zone->pages_high,
> +                                             classzone_idx, 0)) {
> +                               nr_reclaimed = 1;
> +                               goto out;
> +                       }
> +               }
> +       }
> +
> +       nr_reclaimed = do_try_to_free_pages(zones, gfp_mask, sc);
> +
> +out:
> +       atomic_dec(&nr_reclaimers);
> +       wake_up_all(&reclaim_throttle_waitq);
> +
> +       return nr_reclaimed;
> +}
> +
>  unsigned long try_to_free_pages(struct zone **zones, int order, gfp_t gfp_mask)
>  {
>        struct scan_control sc = {
> @@ -1434,7 +1479,7 @@ unsigned long try_to_free_pages(struct z
>                .isolate_pages = isolate_pages_global,
>        };
>
> -       return do_try_to_free_pages(zones, gfp_mask, &sc);
> +       return try_to_free_pages_throttled(zones, order, gfp_mask, &sc);
>  }
>
>  #ifdef CONFIG_CGROUP_MEM_CONT
> @@ -1456,7 +1501,7 @@ unsigned long try_to_free_mem_cgroup_pag
>        int target_zone = gfp_zone(GFP_HIGHUSER_MOVABLE);
>
>        zones = NODE_DATA(numa_node_id())->node_zonelists[target_zone].zones;
> -       if (do_try_to_free_pages(zones, sc.gfp_mask, &sc))
> +       if (try_to_free_pages_throttled(zones, 0, sc.gfp_mask, &sc))
>                return 1;
>        return 0;
>  }
>
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>
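The admission step of the quoted patch (atomic_add_unless() on
nr_reclaimers against RECLAIM_LIMIT) can be tried out in userspace.
Below is a minimal sketch using C11 atomics. It is an analogy, not
kernel code: struct throttle, throttle_try_enter() and throttle_exit()
are hypothetical names, the limit is a plain parameter instead of
2 * num_highmem_nodes(), and a rejected caller simply gets false back,
whereas the patch sleeps in wait_event() until wake_up_all() is called.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for nr_reclaimers + RECLAIM_LIMIT. */
struct throttle {
	atomic_int active;  /* callers currently inside the throttled section */
	int limit;          /* maximum number of concurrent callers */
};

void throttle_init(struct throttle *t, int limit)
{
	assert(t && limit > 0);
	atomic_init(&t->active, 0);
	t->limit = limit;
}

/* Same idea as atomic_add_unless(&v, 1, limit): increment the counter
 * unless it already equals the limit. Returns true if the caller was
 * admitted. */
bool throttle_try_enter(struct throttle *t)
{
	int cur = atomic_load(&t->active);

	while (cur != t->limit) {
		/* On failure, cur is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&t->active, &cur, cur + 1))
			return true;
	}
	return false;
}

/* Counterpart of atomic_dec(&nr_reclaimers) on the way out. */
void throttle_exit(struct throttle *t)
{
	int prev = atomic_fetch_sub(&t->active, 1);

	assert(prev > 0);
	(void)prev;
}
```

With a limit of 2, the first two throttle_try_enter() calls succeed and
a third is rejected until one of the holders calls throttle_exit().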



-- 
Thanks,
barrios

