linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Heinrich Schuchardt <xypron.glpk@gmx.de>
To: Pintu Kumar <pintu.k@samsung.com>,
	corbet@lwn.net, akpm@linux-foundation.org, vbabka@suse.cz,
	gorcunov@openvz.org, mhocko@suse.cz, emunson@akamai.com,
	kirill.shutemov@linux.intel.com, standby24x7@gmail.com,
	hannes@cmpxchg.org, vdavydov@parallels.com, hughd@google.com,
	minchan@kernel.org, tj@kernel.org, rientjes@google.com,
	dzickus@redhat.com, prarit@redhat.com, ebiederm@xmission.com,
	rostedt@goodmis.org, uobergfe@redhat.com,
	paulmck@linux.vnet.ibm.com, iamjoonsoo.kim@lge.com,
	ddstreet@ieee.org, sasha.levin@oracle.com, koct9i@gmail.com,
	mgorman@suse.de, cj@linux.com, opensource.ganesh@gmail.com,
	vinmenon@codeaurora.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-pm@vger.kernel.org
Cc: cpgs@samsung.com, pintu_agarwal@yahoo.com, vishnu.ps@samsung.com,
	rohit.kr@samsung.com, iqbal.ams@samsung.com
Subject: Re: [PATCH 1/1] kernel/sysctl.c: Add /proc/sys/vm/shrink_memory feature
Date: Fri, 03 Jul 2015 19:21:19 +0200	[thread overview]
Message-ID: <5596C48F.8050800@gmx.de> (raw)
In-Reply-To: <1435929607-3435-1-git-send-email-pintu.k@samsung.com>

On 03.07.2015 15:20, Pintu Kumar wrote:
> This patch provides 2 things:
> 1. Add new control called shrink_memory in /proc/sys/vm/.
> This control can be used to aggressively reclaim memory system-wide
> in one shot from the user space. A value of 1 will instruct the
> kernel to reclaim as much as totalram_pages in the system.
> Example: echo 1 > /proc/sys/vm/shrink_memory
> 
> 2. Enable shrink_all_memory API in kernel with new CONFIG_SHRINK_MEMORY.
> Currently, shrink_all_memory function is used only during hibernation.
> With the new config we can make use of this API for non-hibernation case
> also without disturbing the hibernation case.
> 
> The detailed paper was presented in Embedded Linux Conference, Mar-2015
> http://events.linuxfoundation.org/sites/events/files/slides/
> %5BELC-2015%5D-System-wide-Memory-Defragmenter.pdf
> 
> Scenarios were this can be used and helpful are:
> 1) Can be invoked just after system boot-up is finished.
> 2) Can be invoked just before entering entire system suspend.
> 3) Can be invoked from kernel when order-4 pages starts failing.
> 4) Can be helpful to completely avoid or delay the kerenl OOM condition.
> 5) Can be developed as a system-tool to quickly defragment entire system
>    from user space, without the need to kill any application.
> 
> Signed-off-by: Pintu Kumar <pintu.k@samsung.com>
> ---
>  Documentation/sysctl/vm.txt |   16 ++++++++++++++++
>  include/linux/swap.h        |    7 +++++++
>  kernel/sysctl.c             |    9 +++++++++
>  mm/Kconfig                  |    8 ++++++++
>  mm/vmscan.c                 |   23 +++++++++++++++++++++--
>  5 files changed, 61 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> index 9832ec5..a959ad1 100644
> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -54,6 +54,7 @@ Currently, these files are in /proc/sys/vm:
>  - page-cluster
>  - panic_on_oom
>  - percpu_pagelist_fraction
> +- shrink_memory
>  - stat_interval
>  - swappiness
>  - user_reserve_kbytes
> @@ -718,6 +719,21 @@ sysctl, it will revert to this default behavior.
>  
>  ==============================================================
>  
> +shrink_memory
> +
> +This control is available only when CONFIG_SHRINK_MEMORY is set. This control
> +can be used to aggressively reclaim memory system-wide in one shot. A value of
> +1 will instruct the kernel to reclaim as much as totalram_pages in the system.
> +For example, to reclaim all memory system-wide we can do:
> +# echo 1 > /proc/sys/vm/shrink_memory

The API should be as restrictive as possible to allow for extensibility.

You describe "1" as the only used value. So, please add here:

"If any other value than 1 is written to shrink_memory an error
EINVAL occurs."

> +
> +For more information about this control, please visit the following
> +presentation in embedded linux conference, 2015.
> +http://events.linuxfoundation.org/sites/events/files/slides/
> +%5BELC-2015%5D-System-wide-Memory-Defragmenter.pdf
> +
> +==============================================================
> +
>  stat_interval
>  
>  The time interval between which vm statistics are updated.  The default
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 9a7adfb..6505b0b 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -333,6 +333,13 @@ extern int vm_swappiness;
>  extern int remove_mapping(struct address_space *mapping, struct page *page);
>  extern unsigned long vm_total_pages;
>  
> +#ifdef CONFIG_SHRINK_MEMORY
> +extern int sysctl_shrink_memory;
> +extern int sysctl_shrinkmem_handler(struct ctl_table *table, int write,
> +		void __user *buffer, size_t *length, loff_t *ppos);
> +#endif
> +
> +
>  #ifdef CONFIG_NUMA
>  extern int zone_reclaim_mode;
>  extern int sysctl_min_unmapped_ratio;
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index c566b56..2895099 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -1351,6 +1351,15 @@ static struct ctl_table vm_table[] = {
>  	},
>  
>  #endif /* CONFIG_COMPACTION */
> +#ifdef CONFIG_SHRINK_MEMORY
> +	{
> +		.procname	= "shrink_memory",
> +		.data		= &sysctl_shrink_memory,
> +		.maxlen		= sizeof(int),
> +		.mode		= 0200,
> +		.proc_handler	= sysctl_shrinkmem_handler,

Supply the value limits.

int min_shrink_memory = 1;
int max_shrink_memory = 1;

.extra1 = &min_shrink_memory,
.extra2 = &max_shrink_memory,

> +	},
> +#endif
>  	{
>  		.procname	= "min_free_kbytes",
>  		.data		= &min_free_kbytes,
> diff --git a/mm/Kconfig b/mm/Kconfig
> index b3a60ee..8e04bd9 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -657,3 +657,11 @@ config DEFERRED_STRUCT_PAGE_INIT
>  	  when kswapd starts. This has a potential performance impact on
>  	  processes running early in the lifetime of the systemm until kswapd
>  	  finishes the initialisation.
> +
> +config SHRINK_MEMORY
> +	bool "Allow for system-wide shrinking of memory"
> +	default n
> +	depends on MMU
> +	help
> +	  It enables support for system-wide memory reclaim in one shot using
> +	  echo 1 > /proc/sys/vm/shrink_memory.
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c8d8282..837b88d 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3557,7 +3557,7 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
>  	wake_up_interruptible(&pgdat->kswapd_wait);
>  }
>  
> -#ifdef CONFIG_HIBERNATION
> +#if defined CONFIG_HIBERNATION || CONFIG_SHRINK_MEMORY
>  /*
>   * Try to free `nr_to_reclaim' of memory, system-wide, and return the number of
>   * freed pages.
> @@ -3571,12 +3571,17 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
>  	struct reclaim_state reclaim_state;
>  	struct scan_control sc = {
>  		.nr_to_reclaim = nr_to_reclaim,
> +#ifdef CONFIG_SHRINK_MEMORY
> +		.gfp_mask = (GFP_HIGHUSER_MOVABLE | GFP_RECLAIM_MASK),
> +		.hibernation_mode = 0,
> +#else
>  		.gfp_mask = GFP_HIGHUSER_MOVABLE,
> +		.hibernation_mode = 1,
> +#endif
>  		.priority = DEF_PRIORITY,
>  		.may_writepage = 1,
>  		.may_unmap = 1,
>  		.may_swap = 1,
> -		.hibernation_mode = 1,
>  	};
>  	struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
>  	struct task_struct *p = current;
> @@ -3597,6 +3602,20 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
>  }
>  #endif /* CONFIG_HIBERNATION */
>  
> +#ifdef CONFIG_SHRINK_MEMORY
> +int sysctl_shrink_memory;
> +/* This is the entry point for system-wide shrink memory
> ++via /proc/sys/vm/shrink_memory */
> +int sysctl_shrinkmem_handler(struct ctl_table *table, int write,
> +		void __user *buffer, size_t *length, loff_t *ppos)
> +{

Check if *buffer contains "1". If the value is not "1" return -EINVAL.

The check can be done using function proc_dointvec_minmax().

Best regards

Heinrich Schuchardt

> +	if (write)
> +		shrink_all_memory(totalram_pages);
> +
> +	return 0;
> +}
> +#endif
> +
>  /* It's optimal to keep kswapds on the same CPUs as their memory, but
>     not required for correctness.  So if the last cpu in a node goes
>     away, we get changed to run anywhere: as the first one comes back,
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2015-07-03 17:22 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-03 13:20 [PATCH 1/1] kernel/sysctl.c: Add /proc/sys/vm/shrink_memory feature Pintu Kumar
2015-07-03 17:21 ` Heinrich Schuchardt [this message]
2015-07-06 13:52   ` PINTU KUMAR
2015-07-03 18:38 ` Johannes Weiner
2015-07-04  6:04   ` PINTU KUMAR
2015-07-04 12:55     ` Johannes Weiner
2015-07-04  8:05   ` PINTU KUMAR
2015-07-04 20:08 ` Valdis.Kletnieks
2015-07-05  2:15   ` PINTU KUMAR
2015-07-05  8:20   ` PINTU KUMAR
2015-07-06 16:51   ` PINTU KUMAR
2015-07-06 10:22 ` Xishi Qiu
2015-07-06 14:03   ` PINTU KUMAR
2015-07-07  1:38     ` Xishi Qiu
2015-07-07  7:00       ` PINTU KUMAR
2015-07-17  6:29 ` [PATCHv2 " Pintu Kumar
2015-07-17  6:58   ` PINTU KUMAR
2015-07-20  4:29   ` [PATCH v3 " Pintu Kumar
2015-07-20  8:28     ` Mel Gorman
2015-07-20  8:49       ` Michal Hocko
2015-07-20 16:13       ` PINTU KUMAR
2015-07-20 17:55         ` Mel Gorman
2015-07-22 13:03           ` PINTU KUMAR
2015-07-22 14:05             ` Mel Gorman
2015-07-29  5:11               ` PINTU KUMAR
2015-07-29 12:07                 ` Johannes Weiner
  -- strict thread matches above, loose matches on Subject: below --
2015-07-06 17:23 [PATCH " Pintu Kumar
2015-07-06 18:38 Pintu Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5596C48F.8050800@gmx.de \
    --to=xypron.glpk@gmx.de \
    --cc=akpm@linux-foundation.org \
    --cc=cj@linux.com \
    --cc=corbet@lwn.net \
    --cc=cpgs@samsung.com \
    --cc=ddstreet@ieee.org \
    --cc=dzickus@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=emunson@akamai.com \
    --cc=gorcunov@openvz.org \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=iqbal.ams@samsung.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=koct9i@gmail.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.cz \
    --cc=minchan@kernel.org \
    --cc=opensource.ganesh@gmail.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=pintu.k@samsung.com \
    --cc=pintu_agarwal@yahoo.com \
    --cc=prarit@redhat.com \
    --cc=rientjes@google.com \
    --cc=rohit.kr@samsung.com \
    --cc=rostedt@goodmis.org \
    --cc=sasha.levin@oracle.com \
    --cc=standby24x7@gmail.com \
    --cc=tj@kernel.org \
    --cc=uobergfe@redhat.com \
    --cc=vbabka@suse.cz \
    --cc=vdavydov@parallels.com \
    --cc=vinmenon@codeaurora.org \
    --cc=vishnu.ps@samsung.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).