Date: Thu, 20 Jun 2013 10:18:59 -0500
From: Seth Jennings
Subject: Re: [PATCH 1/2] zswap: limit pool fragment
Message-ID: <20130620151859.GC9461@cerebellum>
In-Reply-To: <1371739102-11436-1-git-send-email-bob.liu@oracle.com>
References: <1371739102-11436-1-git-send-email-bob.liu@oracle.com>
To: Bob Liu
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, konrad.wilk@oracle.com, Bob Liu

On Thu, Jun 20, 2013 at 10:38:22PM +0800, Bob Liu wrote:
> If zswap pool fragment is heavy, it's meanless to store more pages to zswap.
> So refuse allocate page to zswap pool to limit the fragmentation.

This is a little light on specifics.

From looking at the code it would seem that you are preventing further
growth of the zswap pool if the overall compression ratio achieved in
the pool doesn't reach a certain threshold; in this case an arbitrary
70%.

So there are some issues:

1. If the first few pages that come into zswap don't compress well,
this logic makes a horrible assumption that all future pages will also
not compress well, and effectively disables zswap until those
poorly-compressed pages are removed from the pool (a toy model of this
is sketched below).  So that's a huge problem.

2. It mucks up the allocator API with a change to zbud_alloc()'s
signature.

3. It introduces yet another (should-be) tunable in the form of the
compression threshold, and really just reimplements the per-page
compression threshold that was removed from the code during development
(the per-page check was better than this because it doesn't suffer from
issue #1).  It was decided that decisions about whether a page can be
(efficiently) stored should be the job of the allocator.
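To make issue #1 concrete, here is a small user-space model of the two
policies.  It is an illustrative sketch only, not kernel code: the
workload, the 4096-byte page size, the treatment of a refused zbud page
allocation as a rejected store, and all names in it are assumptions for
the example.

/*
 * Toy model: a pool-wide ratio gate (as in the patch) vs. a per-page
 * compressed-size threshold.
 */
#include <stdio.h>

#define MODEL_PAGE_SIZE	4096UL
#define THRESHOLD	70	/* percent, mirroring the patch's hard-coded 70 */
#define NPAGES		100

/*
 * Pool-wide gate modeled on the patch: refuse further stores once
 * pool_pages * 100 / (stored + 1) exceeds the threshold.  Ignores
 * zbud's reuse of unbuddied pages for simplicity.
 */
static unsigned long reject_count_pool_gate(const unsigned long *dlen, int n)
{
	unsigned long pool_bytes = 0, stored = 0, rejected = 0;
	int i;

	for (i = 0; i < n; i++) {
		unsigned long pool_pages =
			(pool_bytes + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;

		if (pool_pages * 100 / (stored + 1) > THRESHOLD) {
			rejected++;	/* gate closed: store refused */
			continue;
		}
		pool_bytes += dlen[i];
		stored++;
	}
	return rejected;
}

/* Per-page threshold: judge each page purely by its own compressed size. */
static unsigned long reject_count_per_page(const unsigned long *dlen, int n)
{
	unsigned long rejected = 0;
	int i;

	for (i = 0; i < n; i++)
		if (dlen[i] * 100 / MODEL_PAGE_SIZE > THRESHOLD)
			rejected++;
	return rejected;
}

int main(void)
{
	unsigned long dlen[NPAGES];
	int i;

	/* first 10 pages compress poorly, the remaining 90 compress well */
	for (i = 0; i < NPAGES; i++)
		dlen[i] = (i < 10) ? 4000 : 1000;

	printf("pool-wide gate rejects %lu of %d pages\n",
	       reject_count_pool_gate(dlen, NPAGES), NPAGES);
	printf("per-page threshold rejects %lu of %d pages\n",
	       reject_count_per_page(dlen, NPAGES), NPAGES);
	return 0;
}

Under this toy workload the pool-wide gate closes after the first few
poorly-compressed pages and then rejects the remaining 97 stores,
including all of the well-compressing ones, while the per-page check
rejects only the 10 poorly-compressed pages themselves.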
Seth

> 
> Signed-off-by: Bob Liu
> ---
>  include/linux/zbud.h |    2 +-
>  mm/zbud.c            |    4 +++-
>  mm/zswap.c           |   15 +++++++++++++--
>  3 files changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/zbud.h b/include/linux/zbud.h
> index 2571a5c..71a61be 100644
> --- a/include/linux/zbud.h
> +++ b/include/linux/zbud.h
> @@ -12,7 +12,7 @@ struct zbud_ops {
>  struct zbud_pool *zbud_create_pool(gfp_t gfp, struct zbud_ops *ops);
>  void zbud_destroy_pool(struct zbud_pool *pool);
>  int zbud_alloc(struct zbud_pool *pool, int size, gfp_t gfp,
> -	unsigned long *handle);
> +	unsigned long *handle, bool dis_pagealloc);
>  void zbud_free(struct zbud_pool *pool, unsigned long handle);
>  int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries);
>  void *zbud_map(struct zbud_pool *pool, unsigned long handle);
> diff --git a/mm/zbud.c b/mm/zbud.c
> index 9bb4710..5ace447 100644
> --- a/mm/zbud.c
> +++ b/mm/zbud.c
> @@ -248,7 +248,7 @@ void zbud_destroy_pool(struct zbud_pool *pool)
>   * a new page.
>   */
>  int zbud_alloc(struct zbud_pool *pool, int size, gfp_t gfp,
> -	unsigned long *handle)
> +	unsigned long *handle, bool dis_pagealloc)
>  {
>  	int chunks, i, freechunks;
>  	struct zbud_header *zhdr = NULL;
> @@ -279,6 +279,8 @@ int zbud_alloc(struct zbud_pool *pool, int size, gfp_t gfp,
> 
>  	/* Couldn't find unbuddied zbud page, create new one */
>  	spin_unlock(&pool->lock);
> +	if (dis_pagealloc)
> +		return -ENOSPC;
>  	page = alloc_page(gfp);
>  	if (!page)
>  		return -ENOMEM;
> diff --git a/mm/zswap.c b/mm/zswap.c
> index deda2b6..7fe2b1b 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -607,10 +607,12 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
>  	struct zswap_entry *entry, *dupentry;
>  	int ret;
>  	unsigned int dlen = PAGE_SIZE, len;
> -	unsigned long handle;
> +	unsigned long handle, stored_pages;
>  	char *buf;
>  	u8 *src, *dst;
>  	struct zswap_header *zhdr;
> +	u64 tmp;
> +	bool dis_pagealloc = false;
> 
>  	if (!tree) {
>  		ret = -ENODEV;
> @@ -645,10 +647,19 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
>  		goto freepage;
>  	}
> 
> +	/* If the fragment of zswap pool is heavy, don't alloc new page to
> +	 * zswap pool anymore. The limitation of fragment is 70% percent currently
> +	 */
> +	stored_pages = atomic_read(&zswap_stored_pages);
> +	tmp = zswap_pool_pages * 100;
> +	do_div(tmp, stored_pages + 1);
> +	if (tmp > 70)
> +		dis_pagealloc = true;
> +
>  	/* store */
>  	len = dlen + sizeof(struct zswap_header);
>  	ret = zbud_alloc(tree->pool, len, __GFP_NORETRY | __GFP_NOWARN,
> -			&handle);
> +			&handle, dis_pagealloc);
>  	if (ret == -ENOSPC) {
>  		zswap_reject_compress_poor++;
>  		goto freepage;
> -- 
> 1.7.10.4
> 