Re: [RFC 0/3] Slab Defrag / Slab Targeted Reclaim and general Slab API changes

From: Eric Dumazet <dada1@cosmosbay.com>
To: Christoph Lameter <clameter@sgi.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, dgc@sgi.com,
	Mel Gorman <mel@csn.ul.ie>
Subject: Re: [RFC 0/3] Slab Defrag / Slab Targeted Reclaim and general Slab API changes
Date: Sat, 05 May 2007 07:41:20 +0200
Message-ID: <463C1900.7060409@cosmosbay.com>
In-Reply-To: <Pine.LNX.4.64.0705042209050.14211@schroedinger.engr.sgi.com>

Christoph Lameter wrote:
> On Sat, 5 May 2007, Eric Dumazet wrote:
> 
>>> C. Introduces a slab_ops structure that allows a slab user to provide
>>>    operations on slabs.
>> Could you please make it const ?
> 
> Sure. Done.

thanks :)

> 
>>> All of this is really not necessary since the compiler knows how to align
>>> structures and we should use this information instead of having the user
>>> specify an alignment. I would like to get rid of SLAB_HWCACHE_ALIGN
>>> and kmem_cache_create. Instead one would use the following macros (that
>>> then result in a call to __kmem_cache_create).
>> Hum, the problem is that the compiler sometimes doesn't know the target
>> processor's alignment.
>>
>> Adding ____cacheline_aligned to 'struct ...' definitions might be overkill if
>> you compile a generic kernel and happen to boot a Pentium III with it.
> 
> Then add ____cacheline_aligned_in_smp or specify the alignment in the
> various other ways that exist. Practice is that most slabs specify 
> SLAB_HWCACHE_ALIGN. So most slabs are cache aligned today.

Yes, but this alignment is determined dynamically at boot, not at compile time:

include/asm-i386/processor.h:739:#define cache_line_size() (boot_cpu_data.x86_cache_alignment)

So adding ____cacheline_aligned to 'struct file', for example, would be a
regression for people with a PII or PIII, since the compile-time value in a
generic kernel is larger than their actual cache line size.
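
To illustrate the difference, here is a minimal userspace sketch (not kernel
code). COMPILE_TIME_LINE and the helper names are hypothetical; 128 is an
assumed worst-case line size for a generic kernel, and the boot-time value is
simply passed in as a parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed worst-case line size fixed at compile time (hypothetical value). */
#define COMPILE_TIME_LINE 128

/* Round size up to a multiple of align; align must be a power of two. */
static size_t round_up_pow2(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Per-object footprint when the alignment is baked into the struct definition. */
static size_t footprint_static(size_t obj_size)
{
    return round_up_pow2(obj_size, COMPILE_TIME_LINE);
}

/* Per-object footprint when the allocator aligns to the line size found at boot. */
static size_t footprint_dynamic(size_t obj_size, size_t boot_line_size)
{
    return round_up_pow2(obj_size, boot_line_size);
}
```

With a made-up 160-byte object, footprint_static() yields 256 bytes on every
machine, while footprint_dynamic(160, 32) stays at 160 bytes on a CPU with
32-byte cache lines.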

> 
>> G. Being able to track the number of pages in a kmem_cache
>>
>>
>> If you look at fs/buffer.c, you'll notice the bh_accounting, recalc_bh_state()
>> that might be overkill for large SMP configurations, when the real concern is
>> to be able to limit the bh's not to exceed 10% of LOWMEM.
>>
>> Adding a callback in slab_ops to track total number of pages in use by a given
>> kmem_cache would be good.
> 
> Such functionality exists internal to SLUB and in the reporting tool. 
> I can export that function if you need it.
> 
>> Same thing for fs/file_table.c : nr_file logic
>> (percpu_counter_dec()/percpu_counter_inc() for each file open/close) could be
>> simplified if we could just count the pages in use by filp_cachep kmem_cache.
>> The get_nr_files() thing is not worth the pain.
> 
> Sure. What exactly do you want? The absolute number of pages of memory 
> that the slab is using?
> 
> 	kmem_cache_pages_in_use(struct kmem_cache *) ?
> 
> The call will not be too lightweight since we will have to loop over all
> nodes and add the counters in each per node struct for allocated slabs.
> 
> 

On a typical system, the number of pages used by the 'filp' kmem_cache tends
to be stable:

-bash-2.05b# grep filp /proc/slabinfo
filp              234727 374100    256   15    1 : tunables  120   60    8 : slabdata  24940  24940    135
-bash-2.05b# grep filp /proc/slabinfo
filp              234776 374100    256   15    1 : tunables  120   60    8 : slabdata  24940  24940    168
-bash-2.05b# grep filp /proc/slabinfo
filp              234728 374100    256   15    1 : tunables  120   60    8 : slabdata  24940  24940    180
-bash-2.05b# grep filp /proc/slabinfo
filp              234724 374100    256   15    1 : tunables  120   60    8 : slabdata  24940  24940    174
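
The page count can be read off those lines with a little arithmetic. A minimal
sketch, assuming the usual slabinfo column order (active_objs, num_objs,
objsize, objperslab, pagesperslab, then num_slabs in the slabdata part); the
struct and function names are hypothetical:

```c
#include <assert.h>

/* Subset of one /proc/slabinfo line; field names follow the slabinfo header. */
struct slabinfo_line {
    unsigned long active_objs;
    unsigned long num_objs;
    unsigned long objsize;
    unsigned long objperslab;
    unsigned long pagesperslab;
    unsigned long num_slabs;
};

/* Pages held by the cache: every slab occupies pagesperslab pages. */
static unsigned long slab_pages(const struct slabinfo_line *s)
{
    /* Sanity check: total object slots equal slabs times objects per slab. */
    assert(s->num_objs == s->num_slabs * s->objperslab);
    return s->num_slabs * s->pagesperslab;
}
```

For all four samples this gives 24940 slabs * 1 page = 24940 pages, while
active_objs moves between 234724 and 234776.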

So reverting the nr_files logic to a single integer would be enough, even on
NUMA:

int nr_pages_used_by_filp;
int nr_pages_filp_limit;
int filp_in_danger __read_mostly;

/* Would be called through slab_ops when filp_cachep gains or loses pages. */
static void callback_pages_in_use_by_filp(int inc)
{
	int in_danger;

	nr_pages_used_by_filp += inc;

	in_danger = nr_pages_used_by_filp >= nr_pages_filp_limit;
	/* Only write the read-mostly flag when its value changes. */
	if (in_danger != filp_in_danger)
		filp_in_danger = in_danger;
}

struct file *get_empty_filp(void)
{
	...
	if (filp_in_danger && !capable(CAP_SYS_ADMIN))
		goto over;
	...
}

void __init files_init(unsigned long mempages)
{
	...
	nr_pages_filp_limit = (mempages * 10) / 100; /* 10% for filp use */
	...
}
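
For anyone who wants to play with the idea outside the kernel, here is a
minimal single-threaded userspace sketch of the same accounting. The struct
and function names are hypothetical, the 'privileged' flag merely stands in
for capable(CAP_SYS_ADMIN), and locking is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three globals proposed above. */
struct filp_page_account {
    long pages_used;   /* pages currently backing the cache */
    long pages_limit;  /* threshold: 10% of total pages */
    bool in_danger;    /* read-mostly flag checked on the open path */
};

static void account_init(struct filp_page_account *a, unsigned long mempages)
{
    a->pages_used = 0;
    a->pages_limit = (long)(mempages * 10 / 100);
    a->in_danger = false;
}

/* Called with +n or -n when the cache gains or loses pages. */
static void account_pages(struct filp_page_account *a, long inc)
{
    bool in_danger;

    a->pages_used += inc;
    in_danger = a->pages_used >= a->pages_limit;
    /* Write the flag only on a transition, so it stays read-mostly. */
    if (in_danger != a->in_danger)
        a->in_danger = in_danger;
}

/* Fast-path check: privileged callers may still allocate. */
static bool may_allocate(const struct filp_page_account *a, bool privileged)
{
    return !a->in_danger || privileged;
}
```

The point of the design is that the open/close path only reads one flag, and
the counter is touched only when a whole slab page is added or freed.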