From: Seth Jennings <sjenning@linux.vnet.ibm.com>
To: Bob Liu <lliubbo@gmail.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
konrad.wilk@oracle.com, Bob Liu <bob.liu@oracle.com>
Subject: Re: [PATCH 1/2] zswap: limit pool fragment
Date: Thu, 20 Jun 2013 10:18:59 -0500
Message-ID: <20130620151859.GC9461@cerebellum>
In-Reply-To: <1371739102-11436-1-git-send-email-bob.liu@oracle.com>
On Thu, Jun 20, 2013 at 10:38:22PM +0800, Bob Liu wrote:
> If zswap pool fragmentation is heavy, it's meaningless to store more pages in zswap.
> So refuse to allocate new pages for the zswap pool to limit the fragmentation.
This is a little light on specifics.
From looking at the code it would seem that you are preventing further
growth of the zswap pool if the overall compression ratio achieved in the
pool doesn't reach a certain threshold; in this case an arbitrary 70%.
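(To put numbers on it: zbud packs at most two compressed pages per pool page,
so the best possible zswap_pool_pages to zswap_stored_pages ratio is about 50%.
With this patch, once zswap_pool_pages * 100 / (zswap_stored_pages + 1) exceeds
70 (for example, 80 pool pages backing only 100 stored pages gives ~78), any
store that would need a new pool page fails with -ENOSPC.)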
So there are some issues:
1. If the first few pages that come into zswap don't compress well, this logic
makes the horrible assumption that all future pages will also not compress well
and effectively disables zswap until those poorly-compressed pages are removed
from the pool. So that's a huge problem.
2. It mucks up the allocator API with a change to zbud_alloc's signature.
3. It introduces yet another (should be) tunable in the form of the compression
threshold and really just reimplements the per-page compression threshold that
was removed from the code during development (the per-page check was better
than this because it doesn't suffer from issue #1; a sketch of that kind of
check is below). It was decided that the decision about whether a page can be
(efficiently) stored should be the job of the allocator.
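For reference, a rough sketch of what a per-page style check looks like,
placed right after compression in zswap_frontswap_store(). This is
illustrative only, with a made-up threshold value; it is not the actual
code that was removed:

	/* Illustrative per-page check: reject only this page if it
	 * compressed poorly, rather than gating the whole pool on its
	 * aggregate ratio.  One incompressible page then only rejects
	 * itself, avoiding issue #1 above.  The 70%-of-PAGE_SIZE
	 * threshold here is a hypothetical example value. */
	if (dlen > PAGE_SIZE * 70 / 100) {
		zswap_reject_compress_poor++;
		ret = -E2BIG;
		goto freepage;
	}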
Seth
>
> Signed-off-by: Bob Liu <bob.liu@oracle.com>
> ---
> include/linux/zbud.h | 2 +-
> mm/zbud.c | 4 +++-
> mm/zswap.c | 15 +++++++++++++--
> 3 files changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/zbud.h b/include/linux/zbud.h
> index 2571a5c..71a61be 100644
> --- a/include/linux/zbud.h
> +++ b/include/linux/zbud.h
> @@ -12,7 +12,7 @@ struct zbud_ops {
> struct zbud_pool *zbud_create_pool(gfp_t gfp, struct zbud_ops *ops);
> void zbud_destroy_pool(struct zbud_pool *pool);
> int zbud_alloc(struct zbud_pool *pool, int size, gfp_t gfp,
> - unsigned long *handle);
> + unsigned long *handle, bool dis_pagealloc);
> void zbud_free(struct zbud_pool *pool, unsigned long handle);
> int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries);
> void *zbud_map(struct zbud_pool *pool, unsigned long handle);
> diff --git a/mm/zbud.c b/mm/zbud.c
> index 9bb4710..5ace447 100644
> --- a/mm/zbud.c
> +++ b/mm/zbud.c
> @@ -248,7 +248,7 @@ void zbud_destroy_pool(struct zbud_pool *pool)
> * a new page.
> */
> int zbud_alloc(struct zbud_pool *pool, int size, gfp_t gfp,
> - unsigned long *handle)
> + unsigned long *handle, bool dis_pagealloc)
> {
> int chunks, i, freechunks;
> struct zbud_header *zhdr = NULL;
> @@ -279,6 +279,8 @@ int zbud_alloc(struct zbud_pool *pool, int size, gfp_t gfp,
>
> /* Couldn't find unbuddied zbud page, create new one */
> spin_unlock(&pool->lock);
> + if (dis_pagealloc)
> + return -ENOSPC;
> page = alloc_page(gfp);
> if (!page)
> return -ENOMEM;
> diff --git a/mm/zswap.c b/mm/zswap.c
> index deda2b6..7fe2b1b 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -607,10 +607,12 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
> struct zswap_entry *entry, *dupentry;
> int ret;
> unsigned int dlen = PAGE_SIZE, len;
> - unsigned long handle;
> + unsigned long handle, stored_pages;
> char *buf;
> u8 *src, *dst;
> struct zswap_header *zhdr;
> + u64 tmp;
> + bool dis_pagealloc = false;
>
> if (!tree) {
> ret = -ENODEV;
> @@ -645,10 +647,19 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
> goto freepage;
> }
>
> + /* If the fragment of zswap pool is heavy, don't alloc new page to
> + * zswap pool anymore. The limitation of fragment is 70% percent currently
> + */
> + stored_pages = atomic_read(&zswap_stored_pages);
> + tmp = zswap_pool_pages * 100;
> + do_div(tmp, stored_pages + 1);
> + if (tmp > 70)
> + dis_pagealloc = true;
> +
> /* store */
> len = dlen + sizeof(struct zswap_header);
> ret = zbud_alloc(tree->pool, len, __GFP_NORETRY | __GFP_NOWARN,
> - &handle);
> + &handle, dis_pagealloc);
> if (ret == -ENOSPC) {
> zswap_reject_compress_poor++;
> goto freepage;
> --
> 1.7.10.4
>