public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <marcelo@kvack.org>
To: Christoph Lameter <clameter@sgi.com>
Cc: Andrew Morton <akpm@osdl.org>,
	linux-kernel@vger.kernel.org, penberg@cs.helsinki.fi,
	paulmck@us.ibm.com, nickpiggin@yahoo.com.au, tytso@mit.edu,
	dgc@sgi.com, ak@suse.de
Subject: Re: [RFC 0/4] Object reclaim via the slab allocator V1
Date: Mon, 3 Jul 2006 20:11:03 -0300	[thread overview]
Message-ID: <20060703231103.GA5160@dmt> (raw)
In-Reply-To: <Pine.LNX.4.64.0607031058400.26397@schroedinger.engr.sgi.com>

On Mon, Jul 03, 2006 at 11:10:15AM -0700, Christoph Lameter wrote:
> On Sun, 2 Jul 2006, Marcelo Tosatti wrote:
> 
> > > > I was, of course, referring to the unpleasant requirement that the object
> > > > layout start with an atomic_t.
> > > 
> > > Is that such a problem? It reduces the amount of indirect function calls 
> > > needed.
> > 
> > Need to benchmark I guess.
> 
> I think is pretty obvious. With atomic refcounters you can simply scan
> a slab for unused objects without any callbacks. Needing a callback for 
> every single object is a waste of resources and will limit reclaim 
> efficiency. You would have to do 120 callbacks on some slabs just to 
> figure out that it is worth trying to free objects in that 
> particular slab block.

Inline the callbacks into a per-cache kmem_cache_reclaim ?

> > I agree with Andrew, an atomic counter to indicate usage of the objects
> > is too simplistic (other than being unpleasant).
> 
> Cannot see a valid reason so far to draw that conclusion. With the right 
> convention the atomic refcounter can be used as an indicator that the 
> object is being freed (refcnt = 0), not in use (refcnt = 1) or in active 
> use (refcnt=2). The easy and efficient access to this kind of knowledge 
> about an object is essential for reclaim.

But the assumption that "refcnt = 1 implies unused object" is too weak.

For example, 

struct dentry {
        atomic_t d_count;
        unsigned int d_flags;           /* protected by d_lock */

d_count can be higher than one _and_ the object freeable. Think of
workloads operating on a large number of directories.

Andrew mentioned:

"That seems like quite a drawback. A single refcount=2 object on the
page means that nothing gets freed from that page at all. It'd be easy
(especially with dcache) to do tons of work without achieving anything."

> > > > How do you propose handling the locking?  dcache is passed a bare pointer
> > > > and no locks are held, but it is guaranteed that the object won't be freed
> > > > while it's playing with it?
> > > 
> > > The reference counter can be used to insure that the object is not freed.
> > 
> > Locking is pretty tricky...
> 
> Not at all. You do an atomic_inc_if_not_zero from the destructor and then 
> either will hold the object to good or you were unable to increment the 
> refcount and therefore you can give up and return because the object 
> is already being freed.

And might need locking too.

> The tricky locking part comes later when the destructor has to establish 
> the locks to get all links to the object released.

Right.

> 
> > What it does is to create a small set callbacks to invoke higher-level
> > code:
> > 
> > +struct slab_reclaim_ops {
> > +	int (*objp_is_freeable)(void *objp);
> > +	int (*objp_release)(void *objp);
> > +	int (*objp_lock)(void *objp);
> > +	int (*objp_unlock)(void *objp);
> > +};
> > 
> > Which are used by generic SLAB code to check for, and release, fully
> > freeable pages (you've seen it before, from last year). It contains a
> > dcache implementation.
> 
> Ok. I will have a look at that. But these callbacks are too heavy for my 
> taste. A refcounter could avoid all of that.

Inline them.

> > You really need cache specific information, such as the tree nature of
> > dentry references.
> 
> Only the cache knows that. The cache provides a destructo. In that 
> destructor the tree nature of the dentries can be considered. The 
> destructor knows about the tree nature of the slab and that can navigate 
> through the dependency structures that the object may be a part of.
> 
> > Christoph, I'm afraid that not using the LRU information can be harmful
> > to certain workloads... 
> 
> It can be very beneficial too since removing the LRU will avoid locking in 
> certain situations and it will shrink the objects. We already have lists 
> of all the objects in the slab. The LRU is redundant in some way.

Its likely that the logic should consider hints about recent object
usage, for example DCACHE_REFERENCED for dentries. The "callbacks" are
interesting for that purpose too.

> Of course there is the challenge of preserving the LRU like behavior using 
> the slab lists. But I think it may be sufficiently approximated by the 
> LRU ness of the used slab list and the push back to the partial lists 
> whenever we touch a slab during reclaim (we free some objects so the slab 
> has to move).

Well, individual object usage is not reflected at all in the slab lists,
is it?

  reply	other threads:[~2006-07-04  1:11 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-06-19 18:46 [RFC 0/4] Object reclaim via the slab allocator V1 Christoph Lameter
2006-06-19 18:46 ` [RFC 1/4] slab freeing consolidation Christoph Lameter
2006-06-22 19:18   ` Pekka J Enberg
2006-06-22 19:38     ` Christoph Lameter
2006-06-19 18:47 ` [RFC 2/4] Remove empty destructor from drivers/usb/mon/mon_text.c Christoph Lameter
2006-06-22 19:22   ` Pekka J Enberg
2006-06-19 18:47 ` [RFC 3/4] Add checks to current destructor uses Christoph Lameter
2006-06-22 19:24   ` Pekka J Enberg
2006-06-22 19:40     ` Christoph Lameter
2006-06-19 18:47 ` [PATCH 4/4] Slab Reclaim logic Christoph Lameter
2006-06-19 18:53   ` Christoph Lameter
2006-06-22 19:34   ` Pekka J Enberg
2006-06-22 19:42     ` Christoph Lameter
2006-06-22 19:46       ` Pekka J Enberg
2006-06-22 19:49         ` Christoph Lameter
2006-06-22 19:41   ` Pekka J Enberg
2006-06-22 19:44     ` Christoph Lameter
2006-06-22 19:46       ` Pekka J Enberg
2006-06-22 19:49       ` Pekka J Enberg
2006-06-22 19:52         ` Christoph Lameter
2006-06-19 20:50 ` [RFC 0/4] Object reclaim via the slab allocator V1 Andi Kleen
2006-06-29  0:43 ` Andrew Morton
2006-06-29  0:47   ` Christoph Lameter
2006-06-29  3:09     ` Andrew Morton
2006-06-29 17:20       ` Christoph Lameter
2006-07-03  0:44         ` Marcelo Tosatti
2006-07-03 18:10           ` Christoph Lameter
2006-07-03 23:11             ` Marcelo Tosatti [this message]
2006-07-04  1:46               ` Christoph Lameter
2006-07-04  2:07                 ` Marcelo Tosatti
2006-07-04  2:37                   ` Christoph Lameter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060703231103.GA5160@dmt \
    --to=marcelo@kvack.org \
    --cc=ak@suse.de \
    --cc=akpm@osdl.org \
    --cc=clameter@sgi.com \
    --cc=dgc@sgi.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nickpiggin@yahoo.com.au \
    --cc=paulmck@us.ibm.com \
    --cc=penberg@cs.helsinki.fi \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox