linux-mm.kvack.org archive mirror
From: Dan Magenheimer <dan.magenheimer@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>,
	Seth Jennings <sjenning@linux.vnet.ibm.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Nitin Gupta <ngupta@vflare.org>, Minchan Kim <minchan@kernel.org>,
	Konrad Wilk <konrad.wilk@oracle.com>,
	Robert Jennings <rcj@linux.vnet.ibm.com>,
	Jenifer Hopper <jhopper@us.ibm.com>, Mel Gorman <mgorman@suse.de>,
	Johannes Weiner <jweiner@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	Larry Woodman <lwoodman@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	devel@driverdev.osuosl.org
Subject: RE: [PATCH 7/8] zswap: add to mm/
Date: Thu, 3 Jan 2013 14:37:01 -0800 (PST)
Message-ID: <ac37f7ce-b15a-40f8-9da7-858dea3651b9@default>
In-Reply-To: <20130103073339.GF3120@dastard>

> From: Dave Chinner [mailto:david@fromorbit.com]
> Subject: Re: [PATCH 7/8] zswap: add to mm/
> 
> <much useful info from Dave deleted>

OK, I have suitably proven how little I know about slab
and have received some needed education from your
response... Thanks for that, Dave.

So let me ask some questions instead of making
stupid assumptions.

> Thinking that there is a fixed amount of memory that you should
> reserve for some subsystem is simply the wrong approach to take.
> Caches are dynamic and the correct system balance should result from
> the natural behaviour of the reclaim algorithms.
>
> The shrinker infrastructure doesn't set any set size goals - it
> simply tries to balance the reclaim across all the shrinkers and
> relative to the page cache... 

First, it's important to note that zcache/zswap is not
really a subsystem.  It's simply a way of increasing
the number of anonymous pages (zswap and zcache) and
pagecache pages (zcache only) in RAM by using compression.
Because compressed pages can't be byte-addressed directly,
pages enter zcache/zswap through a "transformation"
process I've likened to a Fourier transform: in
their compressed state, they must be managed differently
from normal whole pages.  Compressed anonymous pages must
transition back to uncompressed before they can be used.
Compressed pagecache pages (zcache only) can be either
uncompressed when needed or gratuitously discarded (eventually)
when not needed.
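
To make the "transformation" concrete, here is a minimal userspace
sketch, not kernel code.  It assumes Python's zlib as a stand-in for
whatever compressor the kernel uses; PAGE_SIZE, store_page() and
load_page() are hypothetical names chosen for illustration only.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size in bytes


def store_page(page: bytes) -> bytes:
    """Compress one whole page into an opaque blob (a "zpage")."""
    if len(page) != PAGE_SIZE:
        raise ValueError("expected exactly one page")
    return zlib.compress(page)


def load_page(zpage: bytes) -> bytes:
    """Decompress a zpage back into a whole, byte-addressable page."""
    page = zlib.decompress(zpage)
    if len(page) != PAGE_SIZE:
        raise ValueError("zpage did not decompress to one page")
    return page


# A highly repetitive page, so it compresses well.
page = (b"anonymous data " * 300)[:PAGE_SIZE]
zpage = store_page(page)    # offsets into zpage no longer map to page bytes
restored = load_page(zpage)  # must happen before the data can be used
```

The point of the sketch is only that a byte offset into the
compressed blob means nothing until the whole page is decompressed.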

So I've been proceeding on the assumption that it is the
sum of whole pages used by both compressed and uncompressed
anonymous pages that must be managed/balanced, and that this
sum should be managed the same way the total number of
anonymous pages is managed in a system without zcache/zswap
(and similarly for compressed+uncompressed pagecache pages).
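
As a rough illustration of the sum described above, here is a small
sketch of the accounting.  It is an assumption-laden model, not how
zswap or zcache actually counts: pageframes_in_use() is a
hypothetical helper, and it optimistically assumes zpages pack into
pageframes with no fragmentation, so it gives a lower bound.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def pageframes_in_use(uncompressed_pages, zpage_sizes):
    """Whole pageframes consumed by one class of pages (e.g. anonymous).

    uncompressed_pages: count of ordinary, uncompressed pages
    zpage_sizes: iterable of compressed sizes in bytes, one per zpage
    """
    compressed_bytes = sum(zpage_sizes)
    # Integer ceiling division; assumes perfect packing (lower bound).
    compressed_frames = (compressed_bytes + PAGE_SIZE - 1) // PAGE_SIZE
    return uncompressed_pages + compressed_frames


# 1000 uncompressed pages plus 600 zpages of 1434 bytes each.
frames = pageframes_in_use(1000, [1434] * 600)
```

In this example the 600 compressed pages occupy 211 pageframes
instead of 600, so the quantity being balanced is 1211 pageframes.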

Are you suggesting that slab can/should be used instead?

> And so the two subsystems need different reclaim implementations.
> And, well, that's exactly what we have shrinkers for - implementing
> subsystem specific reclaim policy. The shrinker infrastructure is
> responsible for keeping balance between all the caches that
> have shrinkers and the size of the page cache...

Given the above, do you think either compressed-anonymous-pages or
compressed-pagecache-pages are suitable candidates for the shrinker
infrastructure?

Note that compressed anonymous pages are always dirty so
cannot be "reclaimed" as such.  But the mechanism that Seth
and I are working on causes compressed anonymous pages to
be decompressed and then sent to backing store, which does
(eventually, after I/O latency) free up pageframes.

Currently zcache does use the shrinker API for reclaiming
pageframes-used-for-compressed-pagecache-pages.  Since
these _are_ a form of pagecache pages, is the shrinker suitable?
 
> There are also cases where we've moved metadata caches out of the
> page cache into shrinker controlled caches because the page cache
> reclaim is too simplistic to handle the complex relationships
> between filesystem metadata. We've done this in XFS, and IIRC btrfs
> did this recently as well...

So although the objects in zswap/zcache are less than one page,
they are still "data" not "metadata", true?  In your opinion,
then, should they be managed by core MM, or by shrinker-controlled
caches, by some combination, or independently of either?

> > In any case, I would posit that both the nature of zpages and their
> > average size relative to a whole page are quite unusual compared to slab.
> 
> Doesn't sound at all unusual.

I think I've addressed the different "nature" above, so let
me ask about the size...

Can slab today suitably manage "larger" objects that exceed
half-PAGESIZE?  Or awkwardly sized objects, such as 35%-PAGESIZE,
where packing whole objects into a page would leave a great
deal of fragmentation?

If so, we should definitely consider slab as an alternative
for zpage allocation.
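
To put numbers on the fragmentation concern, here is a
back-of-the-envelope sketch.  It models a naive allocator that packs
fixed-size objects into a slab of whole pages and never lets an
object span slabs; wasted_fraction() and pages_per_slab are
hypothetical names, and real slab allocators have per-slab metadata
and sizing policies that this ignores.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def wasted_fraction(obj_size, pages_per_slab=1):
    """Fraction of a slab left unused when packing equal-size objects."""
    slab_bytes = PAGE_SIZE * pages_per_slab
    objs_per_slab = slab_bytes // obj_size
    if objs_per_slab == 0:
        raise ValueError("object does not fit in the slab")
    return (slab_bytes - objs_per_slab * obj_size) / slab_bytes


size_35pct = 1434            # about 35% of PAGE_SIZE, rounded up
size_over_half = 2049        # just over half of PAGE_SIZE

waste_35_one_page = wasted_fraction(size_35pct)        # 2 objects fit
waste_half_one_page = wasted_fraction(size_over_half)  # 1 object fits
waste_half_eight_pages = wasted_fraction(size_over_half, pages_per_slab=8)
```

With one-page slabs, the 35% object wastes about 30% of each page
and the just-over-half object wastes about 50%.  Under this model,
an eight-page slab cuts the latter to about 6%, at the cost of
needing physically contiguous pages.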

> > So while there may be some useful comparisons between zswap
> > and slab, the differences may warrant dramatically different policy.
> 
> There may be differences, but it doesn't sound like there's anything
> you can't implement with an appropriate shrinker implementation....

Depending on your answers above, we may indeed need to
consider that as well!

Thanks,
Dan

