linux-mm.kvack.org archive mirror
* Re: [patch 7/9] slub: Adjust order boundaries and minimum objects per slab.
       [not found] ` <20080317230529.474353536@sgi.com>
@ 2008-03-18 18:54   ` Pekka Enberg
  2008-03-18 19:00     ` Christoph Lameter
  2008-03-20  6:44   ` Zhang, Yanmin
  1 sibling, 1 reply; 23+ messages in thread
From: Pekka Enberg @ 2008-03-18 18:54 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Mel Gorman, Matt Mackall, yanmin_zhang, linux-mm

Christoph Lameter wrote:
> Since there is now (hopefully) no worry anymore about higher-order allocs,
> set the max order to default to PAGE_ALLOC_COSTLY_ORDER (32k) and require
> slub to use a higher order if a certain object density cannot be reached.
> 
> The minimum objects per slab is calculated based on the number of processors
> that may come online.

Interesting. Why do we want to make min objects depend on CPU count and 
not amount of memory available on the system?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 7/9] slub: Adjust order boundaries and minimum objects per slab.
  2008-03-18 18:54   ` [patch 7/9] slub: Adjust order boundaries and minimum objects per slab Pekka Enberg
@ 2008-03-18 19:00     ` Christoph Lameter
  2008-03-19  1:04       ` Zhang, Yanmin
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Lameter @ 2008-03-18 19:00 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Mel Gorman, Matt Mackall, yanmin_zhang, linux-mm

On Tue, 18 Mar 2008, Pekka Enberg wrote:

> Christoph Lameter wrote:
> > Since there is now (hopefully) no worry anymore about higher-order allocs,
> > set the max order to default to PAGE_ALLOC_COSTLY_ORDER (32k) and require
> > slub to use a higher order if a certain object density cannot be reached.
> > 
> > The minimum objects per slab is calculated based on the number of processors
> > that may come online.
> 
> Interesting. Why do we want to make min objects depend on CPU count and not
> amount of memory available on the system?

Yanmin found a performance correlation with processors. He may be able to 
expand on that.
 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 7/9] slub: Adjust order boundaries and minimum objects per slab.
  2008-03-18 19:00     ` Christoph Lameter
@ 2008-03-19  1:04       ` Zhang, Yanmin
  2008-03-19 15:20         ` Pekka Enberg
  0 siblings, 1 reply; 23+ messages in thread
From: Zhang, Yanmin @ 2008-03-19  1:04 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Tue, 2008-03-18 at 12:00 -0700, Christoph Lameter wrote:
> On Tue, 18 Mar 2008, Pekka Enberg wrote:
> 
> > Christoph Lameter wrote:
> > > Since there is now (hopefully) no worry anymore about higher-order allocs,
> > > set the max order to default to PAGE_ALLOC_COSTLY_ORDER (32k) and require
> > > slub to use a higher order if a certain object density cannot be reached.
> > > 
> > > The minimum objects per slab is calculated based on the number of processors
> > > that may come online.
> > 
> > Interesting. Why do we want to make min objects depend on CPU count and not
> > amount of memory available on the system?
> 
> Yanmin found a performance correlation with processors. He may be able to 
> expand on that.
From a performance point of view, slab alloc/free contention among processes and processors
is one of the key bottlenecks. If a server has more processors, it usually means more
processes run on it, and the contention is more severe.

I did lots of testing with hackbench on my 8-core stoakley and 16-core tigerton. Please find
the data in the discussion thread http://marc.info/?l=linux-kernel&m=120581108329033&w=2.

As you know, the amount of memory obviously is a direct factor impacting min objects,
but I think the impact is mostly from memory fragmentation or something else. I have test
data to verify the correlation with processors, but no data about the amount of memory.

On the other hand, memory is very cheap now. Users can usually install lots of memory
in a server. So the contention among processors/processes is more severe.

If both the processor number and the amount of memory are input factors for min objects, I have
no objections, but I would ask that the processor number be highlighted. If not, I would like to
choose the processor number.

-yanmin



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 2/9] Store max number of objects in the page struct.
       [not found] ` <20080317230528.279983034@sgi.com>
@ 2008-03-19  9:09   ` Zhang, Yanmin
  2008-03-19 17:49     ` Christoph Lameter
  0 siblings, 1 reply; 23+ messages in thread
From: Zhang, Yanmin @ 2008-03-19  9:09 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Mon, 2008-03-17 at 16:05 -0700, Christoph Lameter wrote:
> plain text document attachment
> (0002-Store-number-of-objects-in-page-struct.patch)
> Split the inuse field up to be able to store the number of objects in this
> page in the page struct as well. Necessary if we want to have pages of
> various orders for a slab. Also avoids touching struct kmem_cache cachelines in
> __slab_alloc().
> 
> Update diagnostic code to check the number of objects and make sure that
> the number of objects always stays within the bounds of a 16 bit unsigned
> integer.
> 
> Signed-off-by: Christoph Lameter <clameter@sgi.com>
> ---
>  include/linux/mm_types.h |    5 +++-
>  mm/slub.c                |   54 +++++++++++++++++++++++++++++------------------
>  2 files changed, 38 insertions(+), 21 deletions(-)
> 
> @@ -1487,7 +1498,7 @@ load_freelist:
>  		goto debug;
>  
>  	c->freelist = object[c->offset];
> -	c->page->inuse = s->objects;
> +	c->page->inuse = c->page->objects;
>  	c->page->freelist = NULL;
>  	c->node = page_to_nid(c->page);
>  unlock_out:
> @@ -1786,6 +1797,9 @@ static inline int slab_order(int size, i
>  	int rem;
>  	int min_order = slub_min_order;
>  
> +	if ((PAGE_SIZE << min_order) / size > 65535)
> +		return get_order(size * 65535) - 1;
Is it better to define something like USHORT_MAX to replace 65535?
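
For illustration, the check might then read like this (just a sketch; no such
constant exists in the kernel yet, so the name is only a suggestion):

	/* Sketch: the same bound with a named constant instead of 65535. */
	if ((PAGE_SIZE << min_order) / size > USHORT_MAX)
		return get_order(size * USHORT_MAX) - 1;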



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 7/9] slub: Adjust order boundaries and minimum objects per slab.
  2008-03-19  1:04       ` Zhang, Yanmin
@ 2008-03-19 15:20         ` Pekka Enberg
  0 siblings, 0 replies; 23+ messages in thread
From: Pekka Enberg @ 2008-03-19 15:20 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: Christoph Lameter, Mel Gorman, Matt Mackall, linux-mm

Hi Yanmin,

On Wed, Mar 19, 2008 at 3:04 AM, Zhang, Yanmin
<yanmin_zhang@linux.intel.com> wrote:
>  On the other hand, memory is very cheap now. Users can usually install lots of memory
>  in a server. So the contention among processors/processes is more severe.

Sure, but don't forget we have embedded users as well.

On Wed, Mar 19, 2008 at 3:04 AM, Zhang, Yanmin
<yanmin_zhang@linux.intel.com> wrote:
>  If both the processor number and the amount of memory are input factors for min objects, I have
>  no objections, but I would ask that the processor number be highlighted. If not, I would like to
>  choose the processor number.

I'm ok with your current scheme as it works nicely with low-end
machines as well. I was just curious to hear how you came up with
that.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 2/9] Store max number of objects in the page struct.
  2008-03-19  9:09   ` [patch 2/9] Store max number of objects in the page struct Zhang, Yanmin
@ 2008-03-19 17:49     ` Christoph Lameter
  2008-03-20  3:32       ` Zhang, Yanmin
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Lameter @ 2008-03-19 17:49 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Wed, 19 Mar 2008, Zhang, Yanmin wrote:

> > +	if ((PAGE_SIZE << min_order) / size > 65535)
> > +		return get_order(size * 65535) - 1;
> Is it better to define something like USHORT_MAX to replace 65535?

Yes. Do we have something like that?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 2/9] Store max number of objects in the page struct.
  2008-03-19 17:49     ` Christoph Lameter
@ 2008-03-20  3:32       ` Zhang, Yanmin
  2008-03-20 21:05         ` Christoph Lameter
  2008-03-21 22:24         ` Andrew Morton
  0 siblings, 2 replies; 23+ messages in thread
From: Zhang, Yanmin @ 2008-03-20  3:32 UTC (permalink / raw)
  To: Christoph Lameter, LKML; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Wed, 2008-03-19 at 10:49 -0700, Christoph Lameter wrote:
> On Wed, 19 Mar 2008, Zhang, Yanmin wrote:
> 
> > > +	if ((PAGE_SIZE << min_order) / size > 65535)
> > > +		return get_order(size * 65535) - 1;
> > Is it better to define something like USHORT_MAX to replace 65535?
> 
> Yes. Do we have something like that?

I couldn't find such a definition in include/linux/kernel.h.


But glibc defines USHRT_MAX in include/limits.h:

/* Minimum and maximum values a `signed short int' can hold.  */
#  define SHRT_MIN      (-32768)
#  define SHRT_MAX      32767

/* Maximum value an `unsigned short int' can hold.  (Minimum is 0.)  */
#  define USHRT_MAX     65535


How about the patch below, against 2.6.25-rc6?

---

Add definitions of USHRT_MAX and others to the kernel. ipc uses them, and the
slub implementation might also use them.

The patch is against 2.6.25-rc6.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

--- linux-2.6.25-rc6/include/linux/kernel.h	2008-03-20 04:25:46.000000000 +0800
+++ linux-2.6.25-rc6_work/include/linux/kernel.h	2008-03-20 04:17:45.000000000 +0800
@@ -20,6 +20,9 @@
 extern const char linux_banner[];
 extern const char linux_proc_banner[];
 
+#define USHRT_MAX	((u16)(~0U))
+#define SHRT_MAX	((s16)(USHRT_MAX>>1))
+#define SHRT_MIN	(-SHRT_MAX - 1)
 #define INT_MAX		((int)(~0U>>1))
 #define INT_MIN		(-INT_MAX - 1)
 #define UINT_MAX	(~0U)
--- linux-2.6.25-rc6/ipc/util.h	2008-03-20 04:25:46.000000000 +0800
+++ linux-2.6.25-rc6_work/ipc/util.h	2008-03-20 04:22:07.000000000 +0800
@@ -12,7 +12,6 @@
 
 #include <linux/err.h>
 
-#define USHRT_MAX 0xffff
 #define SEQ_MULTIPLIER	(IPCMNI)
 
 void sem_init (void);

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 5/9] slub: Fallback to minimal order during slab page allocation
       [not found] ` <20080317230528.939792410@sgi.com>
@ 2008-03-20  5:10   ` Zhang, Yanmin
  2008-03-20 18:29     ` Christoph Lameter
  0 siblings, 1 reply; 23+ messages in thread
From: Zhang, Yanmin @ 2008-03-20  5:10 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Mon, 2008-03-17 at 16:05 -0700, Christoph Lameter wrote:
> plain text document attachment
> (0005-slub-Fallback-to-minimal-order-during-slab-page-all.patch)
> If any higher order allocation fails then fall back to the smallest order
> necessary to contain at least one object.
> 
> Add a new field min_objects that will contain the number of objects for the
> smallest possible order of allocation.
> 
> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
> Signed-off-by: Christoph Lameter <clameter@sgi.com>
> ---
>  include/linux/slub_def.h |    2 +
>  mm/slub.c                |   49 +++++++++++++++++++++++++++++++++--------------
>  2 files changed, 37 insertions(+), 14 deletions(-)
> 
> Index: linux-2.6/include/linux/slub_def.h
> ===================================================================
> --- linux-2.6.orig/include/linux/slub_def.h	2008-03-17 15:32:07.605564060 -0700
> +++ linux-2.6/include/linux/slub_def.h	2008-03-17 15:33:05.718268322 -0700
> @@ -29,6 +29,7 @@ enum stat_item {
>  	DEACTIVATE_TO_HEAD,	/* Cpu slab was moved to the head of partials */
>  	DEACTIVATE_TO_TAIL,	/* Cpu slab was moved to the tail of partials */
>  	DEACTIVATE_REMOTE_FREES,/* Slab contained remotely freed objects */
> +	ORDER_FALLBACK,		/* Number of times fallback was necessary */
>  	NR_SLUB_STAT_ITEMS };
>  
>  struct kmem_cache_cpu {
> @@ -73,6 +74,7 @@ struct kmem_cache {
>  	/* Allocation and freeing of slabs */
>  	int max_objects;	/* Number of objects in a slab of maximum size */
>  	int objects;		/* Number of objects in a slab of current size */
> +	int min_objects;	/* Number of objects in a slab of mininum size */
Will min_objects be exported via sysfs? If not, I would rather not add the member.
Instead, just change the function allocate_slab:

page->objects = (PAGE_SIZE << get_order(s->size)) / s->size;


It would look more readable and be simpler.
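
A rough sketch of what the fallback branch could then look like (simplified;
not the actual mm/slub.c code):

	/* Fallback: derive the object count from the minimal order directly
	 * instead of caching it in a new s->min_objects member. */
	page = alloc_slab_page(flags, node, get_order(s->size));
	if (page)
		page->objects = (PAGE_SIZE << get_order(s->size)) / s->size;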



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 8/9] slub: Make the order configurable for each slab cache
       [not found] ` <20080317230529.701336582@sgi.com>
@ 2008-03-20  5:53   ` Zhang, Yanmin
  2008-03-20 18:31     ` Christoph Lameter
  0 siblings, 1 reply; 23+ messages in thread
From: Zhang, Yanmin @ 2008-03-20  5:53 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Mon, 2008-03-17 at 16:05 -0700, Christoph Lameter wrote:
> plain text document attachment
> (0008-slub-Make-the-order-configurable-for-each-slab-cach.patch)
> Makes /sys/kernel/slab/<slabname>/order writable. The allocation
> order of a slab cache can then be changed dynamically at runtime.
> This can be used to override the objects-per-slab value established
> with the slub_min_objects setting that was manually specified or
> calculated on bootup.
> 
> Signed-off-by: Christoph Lameter <clameter@sgi.com>
> ---
>  mm/slub.c |   30 +++++++++++++++++++++++-------
>  1 file changed, 23 insertions(+), 7 deletions(-)
> 
> Index: linux-2.6/mm/slub.c
> ===================================================================
> --- linux-2.6.orig/mm/slub.c	2008-03-17 15:38:16.337702541 -0700
> +++ linux-2.6/mm/slub.c	2008-03-17 15:49:47.791302447 -0700
> @@ -2146,7 +2146,7 @@ static int init_kmem_cache_nodes(struct 
>   * calculate_sizes() determines the order and the distribution of data within
>   * a slab object.
>   */
> -static int calculate_sizes(struct kmem_cache *s)
> +static int calculate_sizes(struct kmem_cache *s, int forced_order)
Is there any race between calculate_sizes and allocate_slab?
calculate_sizes sets s->order and s->objects, while allocate_slab uses them.

For example, change order from 5 to 2.

Step\thread	|	Thread 1		|	Thread 2
--------------------------------------------------------------------------------------------
1		|	allocate_slab           |
		|	fetch the old s->order  |
--------------------------------------------------------------------------------------------
2		|				|	calculate_sizes
		|				|	changes s->order=2
--------------------------------------------------------------------------------------------
3		|				|	calculate_sizes
		|				|	changes s->objects=8
--------------------------------------------------------------------------------------------
4		|	allocate_slab           |
		|	fetchs s->objects to	|
		|	page->objects;          |

Just before calculate_sizes changes s->order to a smaller value,
allocate_slab might fetch the old s->order and call alloc_pages successfully. Then, before
allocate_slab fetches s->objects, calculate_sizes changes it to a smaller value.

It could be resolved by fetching s->order in allocate_slab first and calculating
page->objects from it later, instead of fetching s->objects.
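
Something along these lines would do it (a simplified sketch of allocate_slab,
not a tested patch):

	int order = s->order;	/* fetch s->order only once */

	page = alloc_slab_page(flags, node, order);
	if (page)
		/*
		 * Derive page->objects from the same snapshot, so a
		 * concurrent calculate_sizes() cannot hand us a mismatched
		 * order/objects pair.
		 */
		page->objects = (PAGE_SIZE << order) / s->size;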

>  {
>  	unsigned long flags = s->flags;
>  	unsigned long size = s->objsize;
> @@ -2235,7 +2235,11 @@ static int calculate_sizes(struct kmem_c
>  	size = ALIGN(size, align);
>  	s->size = size;
>  
> -	s->order = calculate_order(size);
> +	if (forced_order >= 0)
> +		s->order = forced_order;
> +	else
> +		s->order = calculate_order(size);
> +
>  	if (s->order < 0)
>  		return 0;
>  
> @@ -2271,7 +2275,7 @@ static int kmem_cache_open(struct kmem_c
>  	s->align = align;
>  	s->flags = kmem_cache_flags(size, flags, name, ctor);
>  
> -	if (!calculate_sizes(s))
> +	if (!calculate_sizes(s, -1))
>  		goto error;
>  
>  	s->refcount = 1;
> @@ -3727,11 +3731,23 @@ static ssize_t objs_per_slab_show(struct
>  }
>  SLAB_ATTR_RO(objs_per_slab);
>  
> +static ssize_t order_store(struct kmem_cache *s,
> +				const char *buf, size_t length)
> +{
> +	int order = simple_strtoul(buf, NULL, 10);
> +
> +	if (order > slub_max_order || order < slub_min_order)
> +		return -EINVAL;
> +
> +	calculate_sizes(s, order);
> +	return length;
> +}
> +
>  static ssize_t order_show(struct kmem_cache *s, char *buf)
>  {
>  	return sprintf(buf, "%d\n", s->order);
>  }
> -SLAB_ATTR_RO(order);
> +SLAB_ATTR(order);
>  
>  static ssize_t ctor_show(struct kmem_cache *s, char *buf)
>  {
> @@ -3865,7 +3881,7 @@ static ssize_t red_zone_store(struct kme
>  	s->flags &= ~SLAB_RED_ZONE;
>  	if (buf[0] == '1')
>  		s->flags |= SLAB_RED_ZONE;
> -	calculate_sizes(s);
> +	calculate_sizes(s, -1);
>  	return length;
>  }
>  SLAB_ATTR(red_zone);
> @@ -3884,7 +3900,7 @@ static ssize_t poison_store(struct kmem_
>  	s->flags &= ~SLAB_POISON;
>  	if (buf[0] == '1')
>  		s->flags |= SLAB_POISON;
> -	calculate_sizes(s);
> +	calculate_sizes(s, -1);
>  	return length;
>  }
>  SLAB_ATTR(poison);
> @@ -3903,7 +3919,7 @@ static ssize_t store_user_store(struct k
>  	s->flags &= ~SLAB_STORE_USER;
>  	if (buf[0] == '1')
>  		s->flags |= SLAB_STORE_USER;
> -	calculate_sizes(s);
> +	calculate_sizes(s, -1);
>  	return length;
>  }
>  SLAB_ATTR(store_user);
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 7/9] slub: Adjust order boundaries and minimum objects per slab.
       [not found] ` <20080317230529.474353536@sgi.com>
  2008-03-18 18:54   ` [patch 7/9] slub: Adjust order boundaries and minimum objects per slab Pekka Enberg
@ 2008-03-20  6:44   ` Zhang, Yanmin
  2008-03-20 18:32     ` Christoph Lameter
  1 sibling, 1 reply; 23+ messages in thread
From: Zhang, Yanmin @ 2008-03-20  6:44 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Mon, 2008-03-17 at 16:05 -0700, Christoph Lameter wrote:
> plain text document attachment
> (0007-slub-Adjust-order-boundaries-and-minimum-objects-pe.patch)
> Since there is now (hopefully) no worry anymore about higher-order allocs,
> set the max order to default to PAGE_ALLOC_COSTLY_ORDER (32k) and require
> slub to use a higher order if a certain object density cannot be reached.
> 
> The minimum objects per slab is calculated based on the number of processors
> that may come online.
> 
> Processors	min_objects
> ---------------------------
> 1		4
> 2		8
> 4		12
> 8		16
> 16		20
> 32		24
> 64		28
> 1024		44
> 4096		52
The real min_objects values are all 4 more than the values above, since fls(16)
is equal to 5. So on the 16-core tigerton, min_objects=24, which is between 16 and 32.
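
A tiny user-space sketch that reproduces the formula from the patch (the fls()
stand-in here is only an approximation of the kernel helper):

	#include <stdio.h>

	/* Stand-in for the kernel's fls(): position of the last set bit. */
	static int fls(unsigned int x)
	{
		int r = 0;

		while (x) {
			r++;
			x >>= 1;
		}
		return r;
	}

	int main(void)
	{
		unsigned int cpus;

		/* 16 CPUs: fls(16) = 5, so min_objects = 4 * (5 + 1) = 24. */
		for (cpus = 1; cpus <= 4096; cpus *= 2)
			printf("%4u CPUs -> min_objects = %d\n",
					cpus, 4 * (fls(cpus) + 1));
		return 0;
	}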

I applied the patches to 2.6.25-rc6 and tested them on the 16-core tigerton and
8-core stoakley. On tigerton, the result is between those of min_objects=16 and
min_objects=32. On stoakley, the gap is even smaller. I'm ok with the result.

Thanks,
yanmin

> 
> V1->V2
> - Add logic to compute min_objects based on processor count.
> 
> Signed-off-by: Christoph Lameter <clameter@sgi.com>
> ---
>  mm/slub.c |   29 +++++++----------------------
>  1 file changed, 7 insertions(+), 22 deletions(-)
> 
> Index: linux-2.6/mm/slub.c
> ===================================================================
> --- linux-2.6.orig/mm/slub.c	2008-03-17 15:33:42.118699647 -0700
> +++ linux-2.6/mm/slub.c	2008-03-17 15:49:57.459504642 -0700
> @@ -5,7 +5,7 @@
>   * The allocator synchronizes using per slab locks and only
>   * uses a centralized lock to manage a pool of partial slabs.
>   *
> - * (C) 2007 SGI, Christoph Lameter <clameter@sgi.com>
> + * (C) 2007, 2008 SGI, Christoph Lameter <clameter@sgi.com>
>   */
>  
>  #include <linux/mm.h>
> @@ -149,24 +149,7 @@ static inline void ClearSlabDebug(struct
>  /* Enable to test recovery from slab corruption on boot */
>  #undef SLUB_RESILIENCY_TEST
>  
> -#if PAGE_SHIFT <= 12
> -
> -/*
> - * Small page size. Make sure that we do not fragment memory
> - */
> -#define DEFAULT_MAX_ORDER 1
> -#define DEFAULT_MIN_OBJECTS 4
> -
> -#else
> -
> -/*
> - * Large page machines are customarily able to handle larger
> - * page orders.
> - */
> -#define DEFAULT_MAX_ORDER 2
> -#define DEFAULT_MIN_OBJECTS 8
> -
> -#endif
> +#define DEFAULT_MAX_ORDER PAGE_ALLOC_COSTLY_ORDER
>  
>  /*
>   * Mininum number of partial slabs. These will be left on the partial
> @@ -1768,7 +1751,7 @@ static struct page *get_object_page(cons
>   */
>  static int slub_min_order;
>  static int slub_max_order = DEFAULT_MAX_ORDER;
> -static int slub_min_objects = DEFAULT_MIN_OBJECTS;
> +static int slub_min_objects = -1;
>  
>  /*
>   * Merge control. If this is set then no merging of slab caches will occur.
> @@ -1845,9 +1828,11 @@ static inline int calculate_order(int si
>  	 * we reduce the minimum objects required in a slab.
>  	 */
>  	min_objects = slub_min_objects;
> +	if (min_objects <= 0)
> +		min_objects = 4 * (fls(nr_cpu_ids) + 1);
>  	while (min_objects > 1) {
> -		fraction = 8;
> -		while (fraction >= 4) {
> +		fraction = 16;
> +		while (fraction >= 8) {
>  			order = slab_order(size, min_objects,
>  						slub_max_order, fraction);
>  			if (order <= slub_max_order)
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 5/9] slub: Fallback to minimal order during slab page allocation
  2008-03-20  5:10   ` [patch 5/9] slub: Fallback to minimal order during slab page allocation Zhang, Yanmin
@ 2008-03-20 18:29     ` Christoph Lameter
  2008-03-21  0:52       ` Zhang, Yanmin
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Lameter @ 2008-03-20 18:29 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Thu, 20 Mar 2008, Zhang, Yanmin wrote:

> page->objects = (PAGE_SIZE << get_order(s->size)) / s->size;
> 
> 
> It would look more readable and be simpler.

The earlier version of the fallback had that.

However, it's a division in a potentially hot codepath.
And some architectures have slow division logic.




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 8/9] slub: Make the order configurable for each slab cache
  2008-03-20  5:53   ` [patch 8/9] slub: Make the order configurable for each slab cache Zhang, Yanmin
@ 2008-03-20 18:31     ` Christoph Lameter
  2008-03-20 23:57       ` Christoph Lameter
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Lameter @ 2008-03-20 18:31 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Thu, 20 Mar 2008, Zhang, Yanmin wrote:

> It could be resolved by fetching s->order in allocate_slab first and calculating
> page->objects from it later, instead of fetching s->objects.

Hmmm..... Indeed, there is a race there. But I want to avoid divisions in that
code path.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 7/9] slub: Adjust order boundaries and minimum objects per slab.
  2008-03-20  6:44   ` Zhang, Yanmin
@ 2008-03-20 18:32     ` Christoph Lameter
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Lameter @ 2008-03-20 18:32 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Thu, 20 Mar 2008, Zhang, Yanmin wrote:

> The real min_objects values are all 4 more than the values above, since fls(16)
> is equal to 5. So on the 16-core tigerton, min_objects=24, which is between 16 and 32.

Hmmm... Okay, that may be fine. I just need to update the docs.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 2/9] Store max number of objects in the page struct.
  2008-03-20  3:32       ` Zhang, Yanmin
@ 2008-03-20 21:05         ` Christoph Lameter
  2008-03-21 22:24         ` Andrew Morton
  1 sibling, 0 replies; 23+ messages in thread
From: Christoph Lameter @ 2008-03-20 21:05 UTC (permalink / raw)
  To: akpm, yanmin_zhang; +Cc: LKML, Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Thu, 20 Mar 2008, Zhang, Yanmin wrote:

> On Wed, 2008-03-19 at 10:49 -0700, Christoph Lameter wrote:
> > On Wed, 19 Mar 2008, Zhang, Yanmin wrote:
> > 
> > > > +	if ((PAGE_SIZE << min_order) / size > 65535)
> > > > +		return get_order(size * 65535) - 1;
> > > Is it better to define something like USHORT_MAX to replace 65535?
> > 
> > Yes. Do we have something like that?
> 
> I couldn't find such a definition in include/linux/kernel.h.
> 
> 
> But glibc defines USHRT_MAX in include/limits.h:
> 
> /* Minimum and maximum values a `signed short int' can hold.  */
> #  define SHRT_MIN      (-32768)
> #  define SHRT_MAX      32767
> 
> /* Maximum value an `unsigned short int' can hold.  (Minimum is 0.)  */
> #  define USHRT_MAX     65535
> 
> 
> How about the patch below, against 2.6.25-rc6?
> 
> ---
> 
> Add definitions of USHRT_MAX and others to the kernel. ipc uses them, and the
> slub implementation might also use them.
> 
> The patch is against 2.6.25-rc6.
> 
> Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
> 
> ---
> 
> --- linux-2.6.25-rc6/include/linux/kernel.h	2008-03-20 04:25:46.000000000 +0800
> +++ linux-2.6.25-rc6_work/include/linux/kernel.h	2008-03-20 04:17:45.000000000 +0800
> @@ -20,6 +20,9 @@
>  extern const char linux_banner[];
>  extern const char linux_proc_banner[];
>  
> +#define USHRT_MAX	((u16)(~0U))
> +#define SHRT_MAX	((s16)(USHRT_MAX>>1))
> +#define SHRT_MIN	(-SHRT_MAX - 1)
>  #define INT_MAX		((int)(~0U>>1))
>  #define INT_MIN		(-INT_MAX - 1)
>  #define UINT_MAX	(~0U)
> --- linux-2.6.25-rc6/ipc/util.h	2008-03-20 04:25:46.000000000 +0800
> +++ linux-2.6.25-rc6_work/ipc/util.h	2008-03-20 04:22:07.000000000 +0800
> @@ -12,7 +12,6 @@
>  
>  #include <linux/err.h>
>  
> -#define USHRT_MAX 0xffff
>  #define SEQ_MULTIPLIER	(IPCMNI)
>  
>  void sem_init (void);
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 8/9] slub: Make the order configurable for each slab cache
  2008-03-20 18:31     ` Christoph Lameter
@ 2008-03-20 23:57       ` Christoph Lameter
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Lameter @ 2008-03-20 23:57 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

Here is a potential fix:

Subject: Fix race between allocate_slab and calculate_sizes through word accesses

Fix the race by fetching the order and the number of objects in one word.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 include/linux/slub_def.h |   16 ++++++--
 mm/slub.c                |   87 +++++++++++++++++++++++++++++------------------
 2 files changed, 67 insertions(+), 36 deletions(-)

Index: linux-2.6/include/linux/slub_def.h
===================================================================
--- linux-2.6.orig/include/linux/slub_def.h	2008-03-20 16:13:51.843215895 -0700
+++ linux-2.6/include/linux/slub_def.h	2008-03-20 16:37:36.277897003 -0700
@@ -55,6 +55,15 @@ struct kmem_cache_node {
 };
 
 /*
+ * Word size structure that can be atomically updated or read and that
+ * contains both the order and the number of objects that a slab of the
+ * given order would contain.
+ */
+struct kmem_cache_order_objects {
+	unsigned long x;
+};
+
+/*
  * Slab cache management.
  */
 struct kmem_cache {
@@ -63,7 +72,7 @@ struct kmem_cache {
 	int size;		/* The size of an object including meta data */
 	int objsize;		/* The size of an object without meta data */
 	int offset;		/* Free pointer offset. */
-	int order;		/* Current preferred allocation order */
+	struct kmem_cache_order_objects active;
 
 	/*
 	 * Avoid an extra cache line for UP, SMP and for the node local to
@@ -72,9 +81,8 @@ struct kmem_cache {
 	struct kmem_cache_node local_node;
 
 	/* Allocation and freeing of slabs */
-	int max_objects;	/* Number of objects in a slab of maximum size */
-	int objects;		/* Number of objects in a slab of current size */
-	int min_objects;	/* Number of objects in a slab of mininum size */
+	struct kmem_cache_order_objects max;
+	struct kmem_cache_order_objects min;
 	gfp_t allocflags;	/* gfp flags to use on each alloc */
 	int refcount;		/* Refcount for slab cache destroy */
 	void (*ctor)(struct kmem_cache *, void *);
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2008-03-20 16:13:51.855215725 -0700
+++ linux-2.6/mm/slub.c	2008-03-20 16:39:08.283420656 -0700
@@ -322,6 +322,25 @@ static inline int slab_index(void *p, st
 	return (p - addr) / s->size;
 }
 
+static inline struct kmem_cache_order_objects set_korder(int order,
+						unsigned long size)
+{
+	struct kmem_cache_order_objects x =
+			{ (order << 16) + (PAGE_SIZE << order) / size };
+
+	return x;
+}
+
+static inline int get_korder(struct kmem_cache_order_objects x)
+{
+	return x.x >> 16;
+}
+
+static inline int get_kobjects(struct kmem_cache_order_objects x)
+{
+	return x.x & ((1 << 16) - 1);
+}
+
 #ifdef CONFIG_SLUB_DEBUG
 /*
  * Debug settings:
@@ -1032,8 +1051,11 @@ static inline unsigned long kmem_cache_f
 #define slub_debug 0
 #endif
 
-static inline struct page *alloc_slab_page(gfp_t flags, int node, int order)
+static inline struct page *alloc_slab_page(gfp_t flags, int node,
+			struct kmem_cache_order_objects oo)
 {
+	int order = get_korder(oo);
+
 	if (node == -1)
 		return alloc_pages(flags, order);
 	else
@@ -1046,32 +1068,30 @@ static inline struct page *alloc_slab_pa
 static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
 	struct page *page;
-	int pages = 1 << s->order;
+	struct kmem_cache_order_objects oo = s->active;
 
 	flags |= s->allocflags;
 
-	page = alloc_slab_page(flags | __GFP_NOWARN | __GFP_NORETRY,
-								node, s->order);
+	page = alloc_slab_page(flags | __GFP_NOWARN | __GFP_NORETRY, node, oo);
 	if (unlikely(!page)) {
 		/*
 		 * Allocation may have failed due to fragmentation.
 		 * Try a lower order alloc if possible
 		 */
-		page = alloc_slab_page(flags, node, get_order(s->size));
-		if (page) {
-			pages = 1 << compound_order(page);
-			stat(get_cpu_slab(s, raw_smp_processor_id()),
-							ORDER_FALLBACK);
-			page->objects = s->min_objects;
-		} else
+		oo = s->min;
+		page = alloc_slab_page(flags, node, oo);
+		if (!page)
 			return NULL;
-	} else
-		page->objects = s->objects;
+
+		stat(get_cpu_slab(s, raw_smp_processor_id()),
+							ORDER_FALLBACK);
+	}
+	page->objects = get_kobjects(oo);
 
 	mod_zone_page_state(page_zone(page),
 		(s->flags & SLAB_RECLAIM_ACCOUNT) ?
 		NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE,
-		pages);
+		1 << get_korder(oo));
 
 	return page;
 }
@@ -2151,6 +2171,7 @@ static int calculate_sizes(struct kmem_c
 	unsigned long flags = s->flags;
 	unsigned long size = s->objsize;
 	unsigned long align = s->align;
+	int order;
 
 	/*
 	 * Round up object size to the next word boundary. We can only
@@ -2236,15 +2257,15 @@ static int calculate_sizes(struct kmem_c
 	s->size = size;
 
 	if (forced_order >= 0)
-		s->order = forced_order;
+		order = forced_order;
 	else
-		s->order = calculate_order(size);
+		order = calculate_order(size);
 
-	if (s->order < 0)
+	if (order < 0)
 		return 0;
 
 	s->allocflags = 0;
-	if (s->order)
+	if (order)
 		s->allocflags |= __GFP_COMP;
 
 	if (s->flags & SLAB_CACHE_DMA)
@@ -2256,11 +2277,12 @@ static int calculate_sizes(struct kmem_c
 	/*
 	 * Determine the number of objects per slab
 	 */
-	s->objects = (PAGE_SIZE << s->order) / size;
-	s->min_objects = (PAGE_SIZE << get_order(size)) / size;
-	if (s->objects > s->max_objects)
-		s->max_objects = s->objects;
-	return !!s->objects;
+	s->active = set_korder(order, size);
+	s->min = set_korder(get_order(size), size);
+
+	if (get_kobjects(s->active) > get_kobjects(s->max))
+		s->max = s->active;
+	return !!get_kobjects(s->active);
 }
 
 static int kmem_cache_open(struct kmem_cache *s, gfp_t gfpflags,
@@ -2292,7 +2314,7 @@ error:
 	if (flags & SLAB_PANIC)
 		panic("Cannot create slab %s size=%lu realsize=%u "
 			"order=%u offset=%u flags=%lx\n",
-			s->name, (unsigned long)size, s->size, s->order,
+			s->name, (unsigned long)size, s->size, get_korder(s->active),
 			s->offset, flags);
 	return 0;
 }
@@ -2734,7 +2756,7 @@ int kmem_cache_shrink(struct kmem_cache 
 	struct page *page;
 	struct page *t;
 	struct list_head *slabs_by_inuse =
-		kmalloc(sizeof(struct list_head) * s->max_objects, GFP_KERNEL);
+		kmalloc(sizeof(struct list_head) * get_kobjects(s->max), GFP_KERNEL);
 	unsigned long flags;
 
 	if (!slabs_by_inuse)
@@ -2747,7 +2769,7 @@ int kmem_cache_shrink(struct kmem_cache 
 		if (!n->nr_partial)
 			continue;
 
-		for (i = 0; i < s->max_objects; i++)
+		for (i = 0; i < get_kobjects(s->max); i++)
 			INIT_LIST_HEAD(slabs_by_inuse + i);
 
 		spin_lock_irqsave(&n->list_lock, flags);
@@ -2779,7 +2801,7 @@ int kmem_cache_shrink(struct kmem_cache 
 		 * Rebuild the partial list with the slabs filled up most
 		 * first and the least used slabs at the end.
 		 */
-		for (i = s->max_objects - 1; i >= 0; i--)
+		for (i = get_kobjects(s->max) - 1; i >= 0; i--)
 			list_splice(slabs_by_inuse + i, n->partial.prev);
 
 		spin_unlock_irqrestore(&n->list_lock, flags);
@@ -3277,8 +3299,8 @@ static long validate_slab_cache(struct k
 {
 	int node;
 	unsigned long count = 0;
-	unsigned long *map = kmalloc(BITS_TO_LONGS(s->max_objects) *
-				sizeof(unsigned long), GFP_KERNEL);
+	unsigned long *map = kmalloc(BITS_TO_LONGS(get_kobjects(s->max)) *
+			sizeof(unsigned long), GFP_KERNEL);
 
 	if (!map)
 		return -ENOMEM;
@@ -3727,7 +3749,7 @@ SLAB_ATTR_RO(object_size);
 
 static ssize_t objs_per_slab_show(struct kmem_cache *s, char *buf)
 {
-	return sprintf(buf, "%d\n", s->objects);
+	return sprintf(buf, "%d\n", get_kobjects(s->active));
 }
 SLAB_ATTR_RO(objs_per_slab);
 
@@ -3745,7 +3767,7 @@ static ssize_t order_store(struct kmem_c
 
 static ssize_t order_show(struct kmem_cache *s, char *buf)
 {
-	return sprintf(buf, "%d\n", s->order);
+	return sprintf(buf, "%d\n", get_korder(s->active));
 }
 SLAB_ATTR(order);
 
@@ -4410,7 +4432,8 @@ static int s_show(struct seq_file *m, vo
 	nr_inuse = nr_objs - nr_free;
 
 	seq_printf(m, "%-17s %6lu %6lu %6u %4u %4d", s->name, nr_inuse,
-		   nr_objs, s->size, s->objects, (1 << s->order));
+		   nr_objs, s->size, get_kobjects(s->active),
+		   (1 << get_korder(s->active)));
 	seq_printf(m, " : tunables %4u %4u %4u", 0, 0, 0);
 	seq_printf(m, " : slabdata %6lu %6lu %6lu", nr_slabs, nr_slabs,
 		   0UL);
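
As a worked example of the packed encoding, here is a user-space sketch of the
helpers above (PAGE_SIZE = 4096 is an assumption for the example):

	#include <stdio.h>

	#define PAGE_SIZE 4096UL	/* assumption for the example */

	struct kmem_cache_order_objects {
		unsigned long x;
	};

	/* Same encoding as the patch: order in the high 16 bits, object
	 * count for a slab of that order in the low 16 bits. */
	static struct kmem_cache_order_objects set_korder(int order,
						unsigned long size)
	{
		struct kmem_cache_order_objects x =
				{ (order << 16) + (PAGE_SIZE << order) / size };

		return x;
	}

	static int get_korder(struct kmem_cache_order_objects x)
	{
		return x.x >> 16;
	}

	static int get_kobjects(struct kmem_cache_order_objects x)
	{
		return x.x & ((1 << 16) - 1);
	}

	int main(void)
	{
		/* Order 3, object size 256: (4096 << 3) / 256 = 128 objects. */
		struct kmem_cache_order_objects oo = set_korder(3, 256);

		/* Prints: order=3 objects=128 */
		printf("order=%d objects=%d\n",
				get_korder(oo), get_kobjects(oo));
		return 0;
	}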


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 5/9] slub: Fallback to minimal order during slab page allocation
  2008-03-20 18:29     ` Christoph Lameter
@ 2008-03-21  0:52       ` Zhang, Yanmin
  2008-03-21  3:35         ` Christoph Lameter
  0 siblings, 1 reply; 23+ messages in thread
From: Zhang, Yanmin @ 2008-03-21  0:52 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Thu, 2008-03-20 at 11:29 -0700, Christoph Lameter wrote:
> On Thu, 20 Mar 2008, Zhang, Yanmin wrote:
> 
> > page->objects = (PAGE_SIZE << get_order(s->size)) / s->size;
> > 
> > 
> > It would look more readable and be simpler.
> 
> The earlier version of the fallback had that.
> 
> However, it's a division in a potentially hot codepath.
No, not as long as there is no allocation failure due to fragmentation.

> And some architectures have slow division logic.
That's ok. I just want it to be more readable for others.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 5/9] slub: Fallback to minimal order during slab page allocation
  2008-03-21  0:52       ` Zhang, Yanmin
@ 2008-03-21  3:35         ` Christoph Lameter
  2008-03-21  5:14           ` Zhang, Yanmin
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Lameter @ 2008-03-21  3:35 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Fri, 21 Mar 2008, Zhang, Yanmin wrote:

> > However, it's a division in a potentially hot codepath.
> No, not as long as there is no allocation failure due to fragmentation.

If it's only used for the fallback path, then the race condition is still
there?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 5/9] slub: Fallback to minimal order during slab page allocation
  2008-03-21  3:35         ` Christoph Lameter
@ 2008-03-21  5:14           ` Zhang, Yanmin
  2008-03-21  6:07             ` Christoph Lameter
  0 siblings, 1 reply; 23+ messages in thread
From: Zhang, Yanmin @ 2008-03-21  5:14 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Thu, 2008-03-20 at 20:35 -0700, Christoph Lameter wrote:
> On Fri, 21 Mar 2008, Zhang, Yanmin wrote:
> 
> > > However, it's a division in a potentially hot codepath.
> > No, not as long as there is no allocation failure due to fragmentation.
> 
> If it's only used for the fallback path, then the race condition is still
> there?
I don't understand your question. Does it mean min_objects? It's not related
to the race. The fallback path also isn't related to the race.

The race is that while the kernel runs in allocate_slab, just between fetching s->order and
s->objects, a user might change s->order via sysfs.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 5/9] slub: Fallback to minimal order during slab page allocation
  2008-03-21  5:14           ` Zhang, Yanmin
@ 2008-03-21  6:07             ` Christoph Lameter
  2008-03-21  8:23               ` Zhang, Yanmin
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Lameter @ 2008-03-21  6:07 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Fri, 21 Mar 2008, Zhang, Yanmin wrote:

> On Thu, 2008-03-20 at 20:35 -0700, Christoph Lameter wrote:
> > On Fri, 21 Mar 2008, Zhang, Yanmin wrote:
> > 
> > > > However, it's a division in a potentially hot codepath.
> > > No, not as long as there is no allocation failure due to fragmentation.
> > 
> > If it's only used for the fallback path, then the race condition is still
> > there?
> I don't understand your question. Does it mean min_objects? It's not related
> to the race. The fallback path also isn't related to the race.
> 
> The race is that while the kernel runs in allocate_slab, just between fetching s->order and
> s->objects, a user might change s->order via sysfs.

Right. That patch matters most, and with the patch that I posted a few
hours ago there is a common scheme that addresses both the race and the
issue with min_objects (hopefully...).


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 5/9] slub: Fallback to minimal order during slab page allocation
  2008-03-21  6:07             ` Christoph Lameter
@ 2008-03-21  8:23               ` Zhang, Yanmin
  0 siblings, 0 replies; 23+ messages in thread
From: Zhang, Yanmin @ 2008-03-21  8:23 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, Mel Gorman, Matt Mackall, linux-mm

On Thu, 2008-03-20 at 23:07 -0700, Christoph Lameter wrote:
> On Fri, 21 Mar 2008, Zhang, Yanmin wrote:
> 
> > On Thu, 2008-03-20 at 20:35 -0700, Christoph Lameter wrote:
> > > On Fri, 21 Mar 2008, Zhang, Yanmin wrote:
> > > 
> > > > > However, it's a division in a potentially hot codepath.
> > > > No, not as long as there is no allocation failure due to fragmentation.
> > > 
> > > If it's only used for the fallback path, then the race condition is still
> > > there?
> > I don't understand your question. Does it mean min_objects? It's not related
> > to the race. The fallback path also isn't related to the race.
> > 
> > The race is that while the kernel runs in allocate_slab, just between fetching s->order and
> > s->objects, a user might change s->order via sysfs.
> 
> Right. That patch matters most, and with the patch that I posted a few
> hours ago there is a common scheme that addresses both the race and the
> issue with min_objects (hopefully...)
Yes, the patch does address them.

Thanks,
Yanmin




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 2/9] Store max number of objects in the page struct.
  2008-03-20  3:32       ` Zhang, Yanmin
  2008-03-20 21:05         ` Christoph Lameter
@ 2008-03-21 22:24         ` Andrew Morton
  2008-03-22  3:27           ` Ben Pfaff
  2008-03-24  1:22           ` [PATCH] Add definitions of USHORT_MAX and others Zhang, Yanmin
  1 sibling, 2 replies; 23+ messages in thread
From: Andrew Morton @ 2008-03-21 22:24 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: clameter, linux-kernel, penberg, mel, mpm, linux-mm

On Thu, 20 Mar 2008 11:32:17 +0800
"Zhang, Yanmin" <yanmin_zhang@linux.intel.com> wrote:

> Add definitions of USHRT_MAX and others to the kernel. ipc uses them, and the
> slub implementation might also use them.
> 
> The patch is against 2.6.25-rc6.
> 
> Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
> 
> ---
> 
> --- linux-2.6.25-rc6/include/linux/kernel.h	2008-03-20 04:25:46.000000000 +0800
> +++ linux-2.6.25-rc6_work/include/linux/kernel.h	2008-03-20 04:17:45.000000000 +0800
> @@ -20,6 +20,9 @@
>  extern const char linux_banner[];
>  extern const char linux_proc_banner[];
>  
> +#define USHRT_MAX	((u16)(~0U))
> +#define SHRT_MAX	((s16)(USHRT_MAX>>1))
> +#define SHRT_MIN	(-SHRT_MAX - 1)

We have UINT_MAX and ULONG_MAX and ULLONG_MAX.  If these were actually
UNT_MAX, ULNG_MAX and ULLNG_MAX then USHRT_MAX would make sense.

But they aren't, so it doesn't ;)

Please, let's call them USHORT_MAX, SHORT_MAX and SHORT_MIN.

> --- linux-2.6.25-rc6/ipc/util.h	2008-03-20 04:25:46.000000000 +0800
> +++ linux-2.6.25-rc6_work/ipc/util.h	2008-03-20 04:22:07.000000000 +0800
> @@ -12,7 +12,6 @@
>  
>  #include <linux/err.h>
>  
> -#define USHRT_MAX 0xffff
>  #define SEQ_MULTIPLIER	(IPCMNI)
>  
>  void sem_init (void);

And then convert IPC to use them?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch 2/9] Store max number of objects in the page struct.
  2008-03-21 22:24         ` Andrew Morton
@ 2008-03-22  3:27           ` Ben Pfaff
  2008-03-24  1:22           ` [PATCH] Add definitions of USHORT_MAX and others Zhang, Yanmin
  1 sibling, 0 replies; 23+ messages in thread
From: Ben Pfaff @ 2008-03-22  3:27 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel

Andrew Morton <akpm@linux-foundation.org> writes:

>> +#define USHRT_MAX	((u16)(~0U))
>> +#define SHRT_MAX	((s16)(USHRT_MAX>>1))
>> +#define SHRT_MIN	(-SHRT_MAX - 1)
>
> We have UINT_MAX and ULONG_MAX and ULLONG_MAX.  If these were actually
> UNT_MAX, ULNG_MAX and ULLNG_MAX then USHRT_MAX would make sense.
>
> But they aren't, so it doesn't ;)
>
> Please, let's call them USHORT_MAX, SHORT_MAX and SHORT_MIN.

SHRT_MIN, SHRT_MAX, and USHRT_MAX are the spellings used in the
<limits.h> required by ISO-conforming C implementations.  That
doesn't mean that the kernel has to use those spellings, but it
does mean that those names are widely understood by C
programmers.
-- 
Ben Pfaff 
http://benpfaff.org


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] Add definitions of USHORT_MAX and others
  2008-03-21 22:24         ` Andrew Morton
  2008-03-22  3:27           ` Ben Pfaff
@ 2008-03-24  1:22           ` Zhang, Yanmin
  1 sibling, 0 replies; 23+ messages in thread
From: Zhang, Yanmin @ 2008-03-24  1:22 UTC (permalink / raw)
  To: Andrew Morton; +Cc: clameter, linux-kernel, penberg, mel, mpm, linux-mm

Below is the new patch. Thanks for the comments.

---

Add definitions of USHORT_MAX and others to the kernel. ipc uses them, and the
slub implementation might also use them.

The patch is against 2.6.25-rc6.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>

---

diff -Nraup linux-2.6.25-rc6/include/linux/kernel.h linux-2.6.25-rc6_maxshort/include/linux/kernel.h
--- linux-2.6.25-rc6/include/linux/kernel.h	2008-03-24 02:05:25.000000000 +0800
+++ linux-2.6.25-rc6_maxshort/include/linux/kernel.h	2008-03-24 02:07:27.000000000 +0800
@@ -20,6 +20,9 @@
 extern const char linux_banner[];
 extern const char linux_proc_banner[];
 
+#define USHORT_MAX	((u16)(~0U))
+#define SHORT_MAX	((s16)(USHORT_MAX>>1))
+#define SHORT_MIN	(-SHORT_MAX - 1)
 #define INT_MAX		((int)(~0U>>1))
 #define INT_MIN		(-INT_MAX - 1)
 #define UINT_MAX	(~0U)
diff -Nraup linux-2.6.25-rc6/ipc/msg.c linux-2.6.25-rc6_maxshort/ipc/msg.c
--- linux-2.6.25-rc6/ipc/msg.c	2008-03-24 02:05:25.000000000 +0800
+++ linux-2.6.25-rc6_maxshort/ipc/msg.c	2008-03-24 02:07:27.000000000 +0800
@@ -324,19 +324,19 @@ copy_msqid_to_user(void __user *buf, str
 		out.msg_rtime		= in->msg_rtime;
 		out.msg_ctime		= in->msg_ctime;
 
-		if (in->msg_cbytes > USHRT_MAX)
-			out.msg_cbytes	= USHRT_MAX;
+		if (in->msg_cbytes > USHORT_MAX)
+			out.msg_cbytes	= USHORT_MAX;
 		else
 			out.msg_cbytes	= in->msg_cbytes;
 		out.msg_lcbytes		= in->msg_cbytes;
 
-		if (in->msg_qnum > USHRT_MAX)
-			out.msg_qnum	= USHRT_MAX;
+		if (in->msg_qnum > USHORT_MAX)
+			out.msg_qnum	= USHORT_MAX;
 		else
 			out.msg_qnum	= in->msg_qnum;
 
-		if (in->msg_qbytes > USHRT_MAX)
-			out.msg_qbytes	= USHRT_MAX;
+		if (in->msg_qbytes > USHORT_MAX)
+			out.msg_qbytes	= USHORT_MAX;
 		else
 			out.msg_qbytes	= in->msg_qbytes;
 		out.msg_lqbytes		= in->msg_qbytes;
diff -Nraup linux-2.6.25-rc6/ipc/util.c linux-2.6.25-rc6_maxshort/ipc/util.c
--- linux-2.6.25-rc6/ipc/util.c	2008-03-24 02:05:25.000000000 +0800
+++ linux-2.6.25-rc6_maxshort/ipc/util.c	2008-03-24 02:07:27.000000000 +0800
@@ -84,8 +84,8 @@ void ipc_init_ids(struct ipc_ids *ids)
 	ids->seq = 0;
 	{
 		int seq_limit = INT_MAX/SEQ_MULTIPLIER;
-		if(seq_limit > USHRT_MAX)
-			ids->seq_max = USHRT_MAX;
+		if(seq_limit > USHORT_MAX)
+			ids->seq_max = USHORT_MAX;
 		 else
 		 	ids->seq_max = seq_limit;
 	}
diff -Nraup linux-2.6.25-rc6/ipc/util.h linux-2.6.25-rc6_maxshort/ipc/util.h
--- linux-2.6.25-rc6/ipc/util.h	2008-03-24 02:05:25.000000000 +0800
+++ linux-2.6.25-rc6_maxshort/ipc/util.h	2008-03-24 02:07:27.000000000 +0800
@@ -12,7 +12,6 @@
 
 #include <linux/err.h>
 
-#define USHRT_MAX 0xffff
 #define SEQ_MULTIPLIER	(IPCMNI)
 
 void sem_init (void);



^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread

Thread overview: 23+ messages
-- links below jump to the message on this page --
     [not found] <20080317230516.078358225@sgi.com>
     [not found] ` <20080317230529.474353536@sgi.com>
2008-03-18 18:54   ` [patch 7/9] slub: Adjust order boundaries and minimum objects per slab Pekka Enberg
2008-03-18 19:00     ` Christoph Lameter
2008-03-19  1:04       ` Zhang, Yanmin
2008-03-19 15:20         ` Pekka Enberg
2008-03-20  6:44   ` Zhang, Yanmin
2008-03-20 18:32     ` Christoph Lameter
     [not found] ` <20080317230528.279983034@sgi.com>
2008-03-19  9:09   ` [patch 2/9] Store max number of objects in the page struct Zhang, Yanmin
2008-03-19 17:49     ` Christoph Lameter
2008-03-20  3:32       ` Zhang, Yanmin
2008-03-20 21:05         ` Christoph Lameter
2008-03-21 22:24         ` Andrew Morton
2008-03-22  3:27           ` Ben Pfaff
2008-03-24  1:22           ` [PATCH] Add definitions of USHORT_MAX and others Zhang, Yanmin
     [not found] ` <20080317230528.939792410@sgi.com>
2008-03-20  5:10   ` [patch 5/9] slub: Fallback to minimal order during slab page allocation Zhang, Yanmin
2008-03-20 18:29     ` Christoph Lameter
2008-03-21  0:52       ` Zhang, Yanmin
2008-03-21  3:35         ` Christoph Lameter
2008-03-21  5:14           ` Zhang, Yanmin
2008-03-21  6:07             ` Christoph Lameter
2008-03-21  8:23               ` Zhang, Yanmin
     [not found] ` <20080317230529.701336582@sgi.com>
2008-03-20  5:53   ` [patch 8/9] slub: Make the order configurable for each slab cache Zhang, Yanmin
2008-03-20 18:31     ` Christoph Lameter
2008-03-20 23:57       ` Christoph Lameter
