linux-fsdevel.vger.kernel.org archive mirror
From: Christoph Lameter <clameter@sgi.com>
To: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>, Andi Kleen <andi@firstfloor.org>,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Mel Gorman <mel@skynet.ie>,
	mpm@selenic.com, Matthew Wilcox <matthew@wil.cx>,
	"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Subject: Re: [patch 21/21] slab defrag: Obsolete SLAB
Date: Wed, 14 May 2008 10:29:41 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0805141009210.15194@schroedinger.engr.sgi.com>
In-Reply-To: <84144f020805120054t1370236ei5ff52279457e026e@mail.gmail.com>

On Mon, 12 May 2008, Pekka Enberg wrote:

> Christoph fixed a tbench regression that was in the same ballpark as
> the TPC regression reported by Matthew which is why we've asked the
> Intel folks to re-test. But yeah, we're working on it.

I suspect that the TPC regression was due to the page allocator order 0 
inefficiencies, like the tbench regression, but we have no data yet to 
establish that.

Fundamentally there is no way to avoid complex queueing on free() unless 
one directly frees the object. This is serialized in SLUB by taking a page 
lock. If we can establish that the object is from the current cpu slab 
then no lock is taken because the slab is reserved for the current 
processor. So the bad case is a free of an object with a long life span or 
an object freed on a remote processor.
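
The two SLUB free cases described above can be sketched as a small 
userspace model. This is only an illustration, not the kernel code: the 
struct and function names (model_slab, model_cpu, model_slub_free) are 
hypothetical, and a counter stands in for actually taking the page lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are the kernel's. */
struct model_slab {
	int lock_count;    /* how often the per-slab (page) lock was taken */
	int free_objects;  /* objects currently free in this slab */
};

struct model_cpu {
	struct model_slab *cpu_slab;  /* slab reserved for this processor */
};

/* Returns true if the free completed without taking a lock. */
static bool model_slub_free(struct model_cpu *cpu, struct model_slab *slab)
{
	assert(cpu != NULL && slab != NULL);

	if (slab == cpu->cpu_slab) {
		/* Fast case: the slab is reserved for this cpu, no lock. */
		slab->free_objects++;
		return true;
	}

	/*
	 * Bad case: long-lived object or remote free. The counter stands
	 * in for a lock/unlock pair on the slab's page lock.
	 */
	slab->lock_count++;
	slab->free_objects++;
	return false;
}
```

In this model a free into cpu->cpu_slab returns true and leaves 
lock_count untouched; a free into any other slab returns false and bumps 
that slab's lock_count by one.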

However, the "slow" case in SLUB is still much less complex 
than comparable processing in SLAB. It is quite fast.

SLAB freeing can avoid taking a lock if

1. We can establish that the object is node local (trivial if !NUMA; 
otherwise we need to get the node information from the page struct and 
compare it to the current node).

2. There is space in the per cpu queue.
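
The two conditions above can be written as a single predicate. This is a 
rough sketch under assumptions: model_cpu_queue, its avail/limit fields 
and model_slab_free_lockless() are hypothetical names for illustration, 
and the node ids are passed in directly instead of being read from the 
page struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a per cpu queue: avail entries used out of limit. */
struct model_cpu_queue {
	int avail;
	int limit;
};

/*
 * Returns true if a SLAB-style free could go to the per cpu queue
 * without taking a lock.
 */
static bool model_slab_free_lockless(bool numa, int object_node,
				     int current_node,
				     const struct model_cpu_queue *q)
{
	bool node_local;

	assert(q != NULL);

	/* Condition 1: trivially true if !NUMA, else compare node ids. */
	node_local = !numa || object_node == current_node;

	/* Condition 2: there must be space left in the per cpu queue. */
	return node_local && q->avail < q->limit;
}
```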

If the object is *not* node local then we have to take an alien lock for 
the remote node in order to put the object in an alien queue. That is much 
less efficient than the SLUB case. SLAB then needs to run the cache reaper 
to expire these objects into the remote node's queues (later the cache 
reaper may then actually free these objects). This management overhead 
does not exist in SLUB. The cache reaper causes processors to be 
unavailable for short time frames (the reaper scans through all slab 
caches!), which in turn causes regressions in applications that need to 
respond in a short time frame (HPC applications, timing-critical network 
applications).
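
To make the extra management work concrete, here is a simplified model 
of the alien queue and a reaper pass. All names (model_alien_queue, 
model_node, model_alien_free, model_reap) and the queue capacity are 
assumptions made for this sketch. What happens when the queue is full is 
not modeled beyond returning false, and counters stand in for real locks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_ALIEN_CAP 4  /* arbitrary capacity chosen for the sketch */

/* Hypothetical alien queue holding objects that belong to a remote node. */
struct model_alien_queue {
	int lock_count;  /* how often the alien lock was taken */
	int count;       /* objects currently parked in the queue */
	int objects[MODEL_ALIEN_CAP];
};

/* Hypothetical remote node: counts objects that reached its own queues. */
struct model_node {
	int queued;
};

/* Free of a non node local object: take the alien lock, park the object. */
static bool model_alien_free(struct model_alien_queue *aq, int object_id)
{
	assert(aq != NULL);

	aq->lock_count++;  /* stands in for lock + unlock of the alien lock */
	if (aq->count == MODEL_ALIEN_CAP)
		return false;  /* full: handling of this case is not modeled */
	aq->objects[aq->count++] = object_id;
	return true;
}

/* Reaper pass: expire parked objects into the remote node's queues. */
static int model_reap(struct model_alien_queue *aq, struct model_node *remote)
{
	int moved;

	assert(aq != NULL && remote != NULL);

	aq->lock_count++;  /* the reaper also has to take the alien lock */
	moved = aq->count;
	remote->queued += moved;
	aq->count = 0;
	return moved;
}
```

Every remote free costs one lock acquisition in this model, and the 
objects only reach the remote node after a later reaper pass that takes 
the lock again.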

Note that the lock granularity in SLUB is finer than the locks in SLAB. 
SLUB can, for example, concurrently free multiple objects to the same 
remote node. If the objects belong to different slabs then there is no dirtying of 
any shared cachelines.
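
The difference in granularity can be stated as two predicates over a 
pair of concurrent remote frees. This is a sketch under assumptions: 
model_free_op and both function names are hypothetical, and "contend" 
here only means that both frees would serialize on the same lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one remote free: target node and slab. */
struct model_free_op {
	int node;     /* node the object belongs to */
	int slab_id;  /* slab (page) the object belongs to */
};

/* SLUB model: the lock is per slab, so only frees to the same slab meet. */
static bool model_slub_frees_contend(const struct model_free_op *a,
				     const struct model_free_op *b)
{
	assert(a != NULL && b != NULL);
	return a->slab_id == b->slab_id;
}

/* SLAB model: the alien lock is per remote node, whatever the slab. */
static bool model_slab_frees_contend(const struct model_free_op *a,
				     const struct model_free_op *b)
{
	assert(a != NULL && b != NULL);
	return a->node == b->node;
}
```

Two frees to different slabs on the same remote node contend in the SLAB 
model but not in the SLUB model.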

The main issue for SLAB vs. SLUB on free is likely the !NUMA case in which 
SLAB can avoid the overhead of the node check (which does not exist in 
SLUB) and in which case we can always immediately batch the object (if 
there is space). The additional overhead in SLUB is mainly one 
atomic instruction over the SLAB fastpath.

So I think that the free path needs to stay as is. The disadvantages in 
terms of the complexity of handling the objects and expiring them, and the 
issue of having to take per node locks in SLAB, make it hard to justify 
adding a queue for free in SLUB. Maybe someone has an inspiration on how to 
do this effectively, in a way that is better than my attempts, which always 
ultimately ended up implementing code that had the same issues that we have 
in SLAB.


Thread overview: 93+ messages
2008-05-10  3:08 [patch 00/21] Slab Fragmentation Reduction V12 Christoph Lameter
2008-05-10  3:08 ` [patch 01/21] slub: Add defrag_ratio field and sysfs support Christoph Lameter
2008-05-10  3:08 ` [patch 02/21] slub: Replace ctor field with ops field in /sys/slab/* Christoph Lameter
2008-05-10  3:08 ` [patch 03/21] slub: Add get() and kick() methods Christoph Lameter
2008-05-10  3:08 ` [patch 04/21] slub: Sort slab cache list and establish maximum objects for defrag slabs Christoph Lameter
2008-05-10  3:08 ` [patch 05/21] slub: Slab defrag core Christoph Lameter
2008-05-10  3:08 ` [patch 06/21] slub: Add KICKABLE to avoid repeated kick() attempts Christoph Lameter
2008-05-10  3:08 ` [patch 07/21] slub: Extend slabinfo to support -D and -F options Christoph Lameter
2008-05-10  3:08 ` [patch 08/21] slub: add defrag statistics Christoph Lameter
2008-05-10  3:08 ` [patch 09/21] slub: Trigger defragmentation from memory reclaim Christoph Lameter
2008-05-10  3:08 ` [patch 10/21] buffer heads: Support slab defrag Christoph Lameter
2008-05-12  0:24   ` David Chinner
2008-05-15 17:42     ` Christoph Lameter
2008-05-15 23:10       ` David Chinner
2008-05-16 17:01         ` Christoph Lameter
2008-05-19  5:45           ` David Chinner
2008-05-19 16:44             ` Christoph Lameter
2008-05-20  0:25               ` David Chinner
2008-05-20  6:56                 ` Evgeniy Polyakov
2008-05-20 21:46                   ` David Chinner
2008-05-20 22:25                     ` Evgeniy Polyakov
2008-05-20 23:19                       ` David Chinner
2008-05-20 23:28                         ` Andrew Morton
2008-05-21  6:15                           ` Evgeniy Polyakov
2008-05-21  6:24                             ` Andrew Morton
2008-05-21 17:52                               ` iput() in reclaim context Hugh Dickins
2008-05-21 17:58                                 ` Evgeniy Polyakov
2008-05-21 18:12                                 ` Andrew Morton
2008-05-20 23:22                       ` [patch 10/21] buffer heads: Support slab defrag Evgeniy Polyakov
2008-05-20 23:30                         ` David Chinner
2008-05-21  6:20                           ` Evgeniy Polyakov
2008-05-21  1:56                         ` Christoph Lameter
2008-05-20 22:53             ` Jamie Lokier
2008-05-10  3:08 ` [patch 11/21] inodes: Support generic defragmentation Christoph Lameter
2008-05-10  3:08 ` [patch 12/21] Filesystem: Ext2 filesystem defrag Christoph Lameter
2008-05-10  3:08 ` [patch 13/21] Filesystem: Ext3 " Christoph Lameter
2008-05-10  3:08 ` [patch 14/21] Filesystem: Ext4 " Christoph Lameter
2008-05-10  3:08 ` [patch 15/21] Filesystem: XFS slab defragmentation Christoph Lameter
2008-05-10  6:55   ` Christoph Hellwig
2008-05-10  3:08 ` [patch 16/21] Filesystem: /proc filesystem support for slab defrag Christoph Lameter
2008-05-10  3:08 ` [patch 17/21] Filesystem: Slab defrag: Reiserfs support Christoph Lameter
2008-05-10  3:08 ` [patch 18/21] Filesystem: Socket inode defragmentation Christoph Lameter
2008-05-13 13:28   ` Evgeniy Polyakov
2008-05-15 17:40     ` Christoph Lameter
2008-05-15 18:23       ` Evgeniy Polyakov
2008-05-10  3:08 ` [patch 19/21] dentries: Add constructor Christoph Lameter
2008-05-10  3:08 ` [patch 20/21] dentries: dentry defragmentation Christoph Lameter
2008-05-10  3:08 ` [patch 21/21] slab defrag: Obsolete SLAB Christoph Lameter
2008-05-10  9:53   ` Andi Kleen
2008-05-11  2:15     ` Rik van Riel
2008-05-12  7:38       ` KOSAKI Motohiro
2008-05-12  7:54         ` Pekka Enberg
2008-05-12 10:08           ` Andi Kleen
2008-05-12 10:23             ` Pekka Enberg
2008-05-14 17:30               ` Christoph Lameter
2008-05-14 17:29           ` Christoph Lameter [this message]
2008-05-14 17:49             ` Andi Kleen
2008-05-14 18:03               ` Christoph Lameter
2008-05-14 18:18                 ` Matt Mackall
2008-05-14 19:21                   ` Christoph Lameter
2008-05-14 19:49                     ` Matt Mackall
2008-05-14 20:33                       ` Christoph Lameter
2008-05-14 21:02                         ` Matt Mackall
2008-05-14 21:26                           ` Christoph Lameter
2008-05-14 21:54                             ` Matt Mackall
2008-05-15 17:15                               ` Christoph Lameter
2008-05-15  3:26                 ` Zhang, Yanmin
2008-05-15 17:05                   ` Christoph Lameter
2008-05-15 17:49                     ` Matthew Wilcox
2008-05-15 17:58                       ` Christoph Lameter
2008-05-15 18:13                         ` Matthew Wilcox
2008-05-15 18:43                           ` Christoph Lameter
2008-05-15 18:51                             ` Matthew Wilcox
2008-05-15 19:09                               ` Christoph Lameter
2008-05-15 19:29                                 ` Matthew Wilcox
2008-05-15 20:14                                   ` Matthew Wilcox
2008-05-15 20:30                                     ` Pekka Enberg
2008-05-16 19:17                                     ` Christoph Lameter
2008-05-16 19:06                                   ` Christoph Lameter
2008-05-15 18:19                       ` Eric Dumazet
2008-05-15 18:29                       ` Vegard Nossum
2008-05-16  5:16                     ` Zhang, Yanmin
2008-05-14 18:05               ` Christoph Lameter
2008-05-14 20:46                 ` Christoph Lameter
2008-05-14 20:58                   ` Matthew Wilcox
2008-05-14 21:00                     ` Christoph Lameter
2008-05-14 21:21                       ` Matthew Wilcox
2008-05-14 21:33                         ` Christoph Lameter
2008-05-14 21:43                           ` Matthew Wilcox
2008-05-14 21:53                             ` Christoph Lameter
2008-05-14 22:00                               ` Matthew Wilcox
2008-05-14 22:32                                 ` Christoph Lameter
2008-05-14 22:34                                 ` Christoph Lameter
