* [PATCH 00/10] Scalability requirements for sysv ipc - v2
@ 2008-04-29 14:33 Nadia.Derbey
2008-04-29 14:33 ` [PATCH 01/10] Fix idr_remove() Nadia.Derbey
` (10 more replies)
0 siblings, 11 replies; 26+ messages in thread
From: Nadia.Derbey @ 2008-04-29 14:33 UTC (permalink / raw)
To: manfred, paulmck; +Cc: linux-kernel, efault, akpm
After scalability problems had been detected when using the sysV ipcs, I
have proposed to use an RCU based implementation of the IDR api instead
(see thread http://lkml.org/lkml/2008/4/11/212).
Resending the patch series after
. integrating Paul's remarks
. fixing a bug I had introduced
. porting it to 2.6.25-mm1.
Reviewers are still welcome!
Patch 1 can be taken alone: it fixes a problem in the existing IDR API.
Patches should be applied on linux-2.6.25-mm1, in the following order:
[ PATCH 01/10 ] : idr_minor_fix.patch
[ PATCH 02/10 ] : ridr_structure.patch
[ PATCH 03/10 ] : ridr_pre_get.patch
[ PATCH 04/10 ] : ridr_init.patch
[ PATCH 05/10 ] : ridr_get_new_above.patch
[ PATCH 06/10 ] : ridr_get_new.patch
[ PATCH 07/10 ] : ridr_find.patch
[ PATCH 08/10 ] : ridr_remove.patch
[ PATCH 09/10 ] : ipc_use_ridr.patch
[ PATCH 10/10 ] : remove_ipc_lock_down.patch
Regards,
Nadia
--
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 01/10] Fix idr_remove()
2008-04-29 14:33 [PATCH 00/10] Scalability requirements for sysv ipc - v2 Nadia.Derbey
@ 2008-04-29 14:33 ` Nadia.Derbey
2008-04-29 18:44 ` Andrew Morton
2008-04-29 14:33 ` [PATCH 02/10] Introduce the ridr structure Nadia.Derbey
` (9 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Nadia.Derbey @ 2008-04-29 14:33 UTC (permalink / raw)
To: manfred, paulmck; +Cc: linux-kernel, efault, akpm, Nadia Derbey
[-- Attachment #1: idr_minor_fix.patch --]
[-- Type: text/plain, Size: 709 bytes --]
[PATCH 01/10]
This patch fixes idr_remove(): the return inside the loop makes us free only
a single layer.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
---
lib/idr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux-2.6.25-mm1/lib/idr.c
===================================================================
--- linux-2.6.25-mm1.orig/lib/idr.c 2008-04-25 15:29:00.000000000 +0200
+++ linux-2.6.25-mm1/lib/idr.c 2008-04-25 15:48:34.000000000 +0200
@@ -385,8 +385,8 @@ void idr_remove(struct idr *idp, int id)
while (idp->id_free_cnt >= IDR_FREE_MAX) {
p = alloc_layer(idp);
kmem_cache_free(idr_layer_cache, p);
- return;
}
+ return;
}
EXPORT_SYMBOL(idr_remove);
--
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 02/10] Introduce the ridr structure
2008-04-29 14:33 [PATCH 00/10] Scalability requirements for sysv ipc - v2 Nadia.Derbey
2008-04-29 14:33 ` [PATCH 01/10] Fix idr_remove() Nadia.Derbey
@ 2008-04-29 14:33 ` Nadia.Derbey
2008-05-01 4:30 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 03/10] Introduce ridr_pre_get() Nadia.Derbey
` (8 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Nadia.Derbey @ 2008-04-29 14:33 UTC (permalink / raw)
To: manfred, paulmck; +Cc: linux-kernel, efault, akpm, Nadia Derbey
[-- Attachment #1: ridr_structure.patch --]
[-- Type: text/plain, Size: 4985 bytes --]
[PATCH 02/10]
This patch introduces the ridr structure as an RCU safe idr structure: a layer
is actually made of the existing idr layer, followed by the rcu head.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
---
include/linux/idr.h | 6 +++---
include/linux/ridr.h | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
init/main.c | 2 ++
lib/Makefile | 2 +-
lib/idr.c | 2 +-
lib/ridr.c | 26 ++++++++++++++++++++++++++
6 files changed, 83 insertions(+), 5 deletions(-)
Index: linux-2.6.25-mm1/include/linux/ridr.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.25-mm1/include/linux/ridr.h 2008-04-29 13:08:00.000000000 +0200
@@ -0,0 +1,50 @@
+/*
+ * include/linux/ridr.h
+ *
+ * Small id to pointer translation service avoiding fixed sized
+ * tables. RCU-based implmentation of IDRs.
+ */
+
+#ifndef _RIDR_H_
+#define _RIDR_H_
+
+#include <linux/idr.h>
+#include <linux/rcupdate.h>
+
+/*
+ * The idr_layer part should stay at the beginning of the structure
+ */
+struct ridr_layer {
+ struct idr_layer idr;
+ struct rcu_head rcu_head;
+};
+
+struct ridr {
+ struct ridr_layer *top;
+ struct ridr_layer *id_free;
+ int layers;
+ int id_free_cnt;
+ spinlock_t lock;
+};
+
+#define RIDR_INIT(name) \
+{ \
+ .top = NULL, \
+ .id_free = NULL, \
+ .layers = 0, \
+ .id_free_cnt = 0, \
+ .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
+}
+#define DEFINE_RIDR(name) struct ridr name = RIDR_INIT(name)
+
+#define idr_to_ridr(p) container_of((p), struct ridr_layer, idr)
+#define ridr_to_idr(p) (&((p)->idr))
+
+/*
+ * This is what we export.
+ */
+
+
+void __init ridr_init_cache(void);
+
+#endif /* _RIDR_H_ */
Index: linux-2.6.25-mm1/lib/ridr.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.25-mm1/lib/ridr.c 2008-04-29 13:08:05.000000000 +0200
@@ -0,0 +1,26 @@
+/*
+ * RCU-based idr API
+ */
+
+
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/ridr.h>
+
+static struct kmem_cache *ridr_layer_cache;
+
+
+static void ridr_cache_ctor(struct kmem_cache *ridr_layer_cache,
+ void *ridr_layer)
+{
+ memset(ridr_layer, 0, sizeof(struct ridr_layer));
+}
+
+void __init ridr_init_cache(void)
+{
+ ridr_layer_cache = kmem_cache_create("ridr_layer_cache",
+ sizeof(struct ridr_layer), 0, SLAB_PANIC,
+ ridr_cache_ctor);
+}
+
Index: linux-2.6.25-mm1/lib/Makefile
===================================================================
--- linux-2.6.25-mm1.orig/lib/Makefile 2008-04-25 15:29:00.000000000 +0200
+++ linux-2.6.25-mm1/lib/Makefile 2008-04-25 15:57:39.000000000 +0200
@@ -6,7 +6,7 @@ lib-y := ctype.o string.o vsprintf.o cmd
rbtree.o radix-tree.o dump_stack.o \
idr.o int_sqrt.o extable.o prio_tree.o \
sha1.o irq_regs.o reciprocal_div.o argv_split.o \
- proportions.o prio_heap.o ratelimit.o
+ proportions.o prio_heap.o ratelimit.o ridr.o
ifdef CONFIG_FTRACE
# Do not profile string.o, since it may be used in early boot or vdso
Index: linux-2.6.25-mm1/init/main.c
===================================================================
--- linux-2.6.25-mm1.orig/init/main.c 2008-04-25 15:29:13.000000000 +0200
+++ linux-2.6.25-mm1/init/main.c 2008-04-25 15:58:15.000000000 +0200
@@ -61,6 +61,7 @@
#include <linux/sched.h>
#include <linux/signal.h>
#include <linux/idr.h>
+#include <linux/ridr.h>
#include <asm/io.h>
#include <asm/bugs.h>
@@ -639,6 +640,7 @@ asmlinkage void __init start_kernel(void
kmem_cache_init();
debug_objects_mem_init();
idr_init_cache();
+ ridr_init_cache();
setup_per_cpu_pageset();
numa_policy_init();
if (late_time_init)
Index: linux-2.6.25-mm1/lib/idr.c
===================================================================
--- linux-2.6.25-mm1.orig/lib/idr.c 2008-04-25 17:22:43.000000000 +0200
+++ linux-2.6.25-mm1/lib/idr.c 2008-04-29 13:08:05.000000000 +0200
@@ -342,7 +342,7 @@ static void sub_remove(struct idr *idp,
while ((shift > 0) && p) {
n = (id >> shift) & IDR_MASK;
__clear_bit(n, &p->bitmap);
- *++paa = &p->ary[n];
+ *++paa = (struct idr_layer **) &p->ary[n];
p = p->ary[n];
shift -= IDR_BITS;
}
Index: linux-2.6.25-mm1/include/linux/idr.h
===================================================================
--- linux-2.6.25-mm1.orig/include/linux/idr.h 2008-04-25 17:22:43.000000000 +0200
+++ linux-2.6.25-mm1/include/linux/idr.h 2008-04-29 13:08:00.000000000 +0200
@@ -48,9 +48,9 @@
#define IDR_FREE_MAX MAX_LEVEL + MAX_LEVEL
struct idr_layer {
- unsigned long bitmap; /* A zero bit means "space here" */
- struct idr_layer *ary[1<<IDR_BITS];
- int count; /* When zero, we can release it */
+ unsigned long bitmap; /* A zero bit means "space here" */
+ void *ary[1<<IDR_BITS];
+ int count; /* When zero, we can release it */
};
struct idr {
--
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 03/10] Introduce ridr_pre_get()
2008-04-29 14:33 [PATCH 00/10] Scalability requirements for sysv ipc - v2 Nadia.Derbey
2008-04-29 14:33 ` [PATCH 01/10] Fix idr_remove() Nadia.Derbey
2008-04-29 14:33 ` [PATCH 02/10] Introduce the ridr structure Nadia.Derbey
@ 2008-04-29 14:33 ` Nadia.Derbey
2008-05-01 4:31 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 04/10] Introduce ridr_init() Nadia.Derbey
` (7 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Nadia.Derbey @ 2008-04-29 14:33 UTC (permalink / raw)
To: manfred, paulmck; +Cc: linux-kernel, efault, akpm, Nadia Derbey
[-- Attachment #1: ridr_pre_get.patch --]
[-- Type: text/plain, Size: 2312 bytes --]
[PATCH 03/10]
This patch introduces the ridr_pre_get() routine.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
---
include/linux/ridr.h | 1 +
lib/ridr.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 47 insertions(+)
Index: linux-2.6.25-mm1/include/linux/ridr.h
===================================================================
--- linux-2.6.25-mm1.orig/include/linux/ridr.h 2008-04-29 13:08:00.000000000 +0200
+++ linux-2.6.25-mm1/include/linux/ridr.h 2008-04-29 13:14:56.000000000 +0200
@@ -43,6 +43,7 @@ struct ridr {
/*
* This is what we export.
*/
+int ridr_pre_get(struct ridr *, gfp_t);
void __init ridr_init_cache(void);
Index: linux-2.6.25-mm1/lib/ridr.c
===================================================================
--- linux-2.6.25-mm1.orig/lib/ridr.c 2008-04-29 13:08:05.000000000 +0200
+++ linux-2.6.25-mm1/lib/ridr.c 2008-04-29 13:17:59.000000000 +0200
@@ -11,6 +11,52 @@
static struct kmem_cache *ridr_layer_cache;
+/* only called when idp->lock is held */
+/* moves an ridr_layer to the free list */
+static void __move_to_free_list(struct ridr *idp, struct ridr_layer *p)
+{
+ p->idr.ary[0] = idp->id_free;
+ idp->id_free = p;
+ idp->id_free_cnt++;
+}
+
+static void move_to_free_list(struct ridr *idp, struct ridr_layer *p)
+{
+ unsigned long flags;
+
+ /*
+ * Depends on the return element being zeroed.
+ */
+ spin_lock_irqsave(&idp->lock, flags);
+ __move_to_free_list(idp, p);
+ spin_unlock_irqrestore(&idp->lock, flags);
+}
+
+/**
+ * ridr_pre_get - reserver resources for ridr allocation
+ * @idp: ridr handle
+ * @gfp_mask: memory allocation flags
+ *
+ * This function should be called prior to locking and calling the
+ * ridr_get_new* functions. It preallocates enough memory to satisfy
+ * the worst possible allocation.
+ *
+ * If the system is REALLY out of memory this function returns 0,
+ * otherwise 1.
+ */
+int ridr_pre_get(struct ridr *idp, gfp_t gfp_mask)
+{
+ while (idp->id_free_cnt < IDR_FREE_MAX) {
+ struct ridr_layer *new;
+ new = kmem_cache_alloc(ridr_layer_cache, gfp_mask);
+ if (new == NULL)
+ return (0);
+ move_to_free_list(idp, new);
+ }
+ return 1;
+}
+EXPORT_SYMBOL(ridr_pre_get);
+
static void ridr_cache_ctor(struct kmem_cache *ridr_layer_cache,
void *ridr_layer)
{
--
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 04/10] Introduce ridr_init()
2008-04-29 14:33 [PATCH 00/10] Scalability requirements for sysv ipc - v2 Nadia.Derbey
` (2 preceding siblings ...)
2008-04-29 14:33 ` [PATCH 03/10] Introduce ridr_pre_get() Nadia.Derbey
@ 2008-04-29 14:33 ` Nadia.Derbey
2008-05-01 4:31 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 05/10] Introduce ridr_get_new_above() Nadia.Derbey
` (6 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Nadia.Derbey @ 2008-04-29 14:33 UTC (permalink / raw)
To: manfred, paulmck; +Cc: linux-kernel, efault, akpm, Nadia Derbey
[-- Attachment #1: ridr_init.patch --]
[-- Type: text/plain, Size: 1332 bytes --]
[PATCH 04/10]
This patch introduces the ridr_init() routine.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
---
include/linux/ridr.h | 1 +
lib/ridr.c | 14 ++++++++++++++
2 files changed, 15 insertions(+)
Index: linux-2.6.25-mm1/include/linux/ridr.h
===================================================================
--- linux-2.6.25-mm1.orig/include/linux/ridr.h 2008-04-29 13:14:56.000000000 +0200
+++ linux-2.6.25-mm1/include/linux/ridr.h 2008-04-29 13:21:56.000000000 +0200
@@ -44,6 +44,7 @@ struct ridr {
* This is what we export.
*/
int ridr_pre_get(struct ridr *, gfp_t);
+void ridr_init(struct ridr *);
void __init ridr_init_cache(void);
Index: linux-2.6.25-mm1/lib/ridr.c
===================================================================
--- linux-2.6.25-mm1.orig/lib/ridr.c 2008-04-29 13:17:59.000000000 +0200
+++ linux-2.6.25-mm1/lib/ridr.c 2008-04-29 13:23:17.000000000 +0200
@@ -70,3 +70,17 @@ void __init ridr_init_cache(void)
ridr_cache_ctor);
}
+/**
+ * ridr_init - initialize ridr handle
+ * @idp: ridr handle
+ *
+ * This function is used to set up the handle (@idp) that you will pass
+ * to the rest of the functions.
+ */
+void ridr_init(struct ridr *idp)
+{
+ memset(idp, 0, sizeof(struct ridr));
+ spin_lock_init(&idp->lock);
+}
+EXPORT_SYMBOL(ridr_init);
+
--
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 05/10] Introduce ridr_get_new_above()
2008-04-29 14:33 [PATCH 00/10] Scalability requirements for sysv ipc - v2 Nadia.Derbey
` (3 preceding siblings ...)
2008-04-29 14:33 ` [PATCH 04/10] Introduce ridr_init() Nadia.Derbey
@ 2008-04-29 14:33 ` Nadia.Derbey
2008-05-01 4:32 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 06/10] Introduce ridr_get_new() Nadia.Derbey
` (5 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Nadia.Derbey @ 2008-04-29 14:33 UTC (permalink / raw)
To: manfred, paulmck; +Cc: linux-kernel, efault, akpm, Nadia Derbey
[-- Attachment #1: ridr_get_new_above.patch --]
[-- Type: text/plain, Size: 12016 bytes --]
[PATCH 05/10]
This patch introduces the ridr_get_new_above() routine, and some common code
between the idr an ridr API's.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
---
include/linux/idr.h | 21 ++++++
include/linux/ridr.h | 1
lib/idr.c | 151 ++++++++++++++++++++++++++------------------
lib/ridr.c | 175 +++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 287 insertions(+), 61 deletions(-)
Index: linux-2.6.25-mm1/include/linux/ridr.h
===================================================================
--- linux-2.6.25-mm1.orig/include/linux/ridr.h 2008-04-29 13:21:56.000000000 +0200
+++ linux-2.6.25-mm1/include/linux/ridr.h 2008-04-29 14:03:35.000000000 +0200
@@ -44,6 +44,7 @@ struct ridr {
* This is what we export.
*/
int ridr_pre_get(struct ridr *, gfp_t);
+int ridr_get_new_above(struct ridr *, void *, int, int *);
void ridr_init(struct ridr *);
Index: linux-2.6.25-mm1/include/linux/idr.h
===================================================================
--- linux-2.6.25-mm1.orig/include/linux/idr.h 2008-04-29 13:08:00.000000000 +0200
+++ linux-2.6.25-mm1/include/linux/idr.h 2008-04-29 14:08:47.000000000 +0200
@@ -71,6 +71,27 @@ struct idr {
}
#define DEFINE_IDR(name) struct idr name = IDR_INIT(name)
+/* Actions to be taken after a call to _idr_sub_alloc */
+#define IDR_DONE -1
+#define IDR_NEED_TO_GROW -2
+#define IDR_NOMORE_SPACE -3
+#define IDR_GO_TOP -4
+#define IDR_GO_UP -5
+
+#define _idr_rc_to_errno(rc) ((rc) == -1 ? -EAGAIN : -ENOSPC)
+
+static inline void _idr_set_new_slot(struct idr_layer *new,
+ struct idr_layer *p)
+{
+ new->ary[0] = p;
+ new->count = 1;
+ if (p->bitmap == IDR_FULL)
+ __set_bit(0, &new->bitmap);
+}
+
+void _idr_mark_full(struct idr_layer **, int);
+int _idr_sub_alloc(int *, int *, struct idr_layer **, struct idr_layer **);
+
/*
* This is what we export.
*/
Index: linux-2.6.25-mm1/lib/idr.c
===================================================================
--- linux-2.6.25-mm1.orig/lib/idr.c 2008-04-29 13:08:05.000000000 +0200
+++ linux-2.6.25-mm1/lib/idr.c 2008-04-29 14:06:20.000000000 +0200
@@ -70,7 +70,7 @@ static void free_layer(struct idr *idp,
spin_unlock_irqrestore(&idp->lock, flags);
}
-static void idr_mark_full(struct idr_layer **pa, int id)
+void _idr_mark_full(struct idr_layer **pa, int id)
{
struct idr_layer *p = pa[0];
int l = 0;
@@ -115,12 +115,71 @@ int idr_pre_get(struct idr *idp, gfp_t g
}
EXPORT_SYMBOL(idr_pre_get);
-static int sub_alloc(struct idr *idp, int *starting_id, struct idr_layer **pa)
+int _idr_sub_alloc(int *cur_id, int *layer, struct idr_layer **cur_ptr,
+ struct idr_layer **pa)
{
int n, m, sh;
- struct idr_layer *p, *new;
- int l, id, oid;
+ int l = *layer;
+ int id = *cur_id;
unsigned long bm;
+ struct idr_layer *p = *cur_ptr;
+ int rc;
+
+ n = (id >> (IDR_BITS * l)) & IDR_MASK;
+ bm = ~p->bitmap;
+ m = find_next_bit(&bm, IDR_SIZE, n);
+ if (m == IDR_SIZE) {
+ int oid;
+
+ /* no space available go back to previous layer. */
+ l++;
+ oid = id;
+ id = (id | ((1 << (IDR_BITS * l)) - 1)) + 1;
+
+ /* if already at the top layer, we need to grow */
+ p = pa[l];
+ if (!p) {
+ *cur_id = id;
+ return IDR_NEED_TO_GROW;
+ }
+
+ /* If we need to go up one layer, continue the
+ * loop; otherwise, restart from the top.
+ */
+ *cur_id = id;
+ *layer = l;
+ *cur_ptr = p;
+ sh = IDR_BITS * (l + 1);
+ if (oid >> sh == id >> sh)
+ rc = IDR_GO_UP;
+ else
+ rc = IDR_GO_TOP;
+ goto end_sub_alloc;
+ }
+ if (m != n) {
+ sh = IDR_BITS * l;
+ id = ((id >> sh) ^ n ^ m) << sh;
+ }
+ if ((id >= MAX_ID_BIT) || (id < 0))
+ return IDR_NOMORE_SPACE;
+
+ if (l == 0)
+ rc = IDR_DONE;
+ else
+ rc = m;
+end_sub_alloc:
+ *cur_id = id;
+ *layer = l;
+ *cur_ptr = p;
+
+ return rc;
+}
+
+static int sub_alloc(struct idr *idp, int *starting_id, struct idr_layer **pa)
+{
+ int m;
+ struct idr_layer *p, *new;
+ int l, id;
id = *starting_id;
restart:
@@ -131,38 +190,23 @@ static int sub_alloc(struct idr *idp, in
/*
* We run around this while until we reach the leaf node...
*/
- n = (id >> (IDR_BITS*l)) & IDR_MASK;
- bm = ~p->bitmap;
- m = find_next_bit(&bm, IDR_SIZE, n);
- if (m == IDR_SIZE) {
- /* no space available go back to previous layer. */
- l++;
- oid = id;
- id = (id | ((1 << (IDR_BITS * l)) - 1)) + 1;
-
- /* if already at the top layer, we need to grow */
- if (!(p = pa[l])) {
- *starting_id = id;
- return -2;
- }
-
- /* If we need to go up one layer, continue the
- * loop; otherwise, restart from the top.
- */
- sh = IDR_BITS * (l + 1);
- if (oid >> sh == id >> sh)
- continue;
- else
- goto restart;
- }
- if (m != n) {
- sh = IDR_BITS*l;
- id = ((id >> sh) ^ n ^ m) << sh;
- }
- if ((id >= MAX_ID_BIT) || (id < 0))
- return -3;
- if (l == 0)
+ int action = _idr_sub_alloc(&id, &l, &p, pa);
+ switch (action) {
+ case IDR_NEED_TO_GROW:
+ *starting_id = id;
+ case IDR_NOMORE_SPACE:
+ return action;
+ case IDR_DONE:
+ goto end_loop;
+ case IDR_GO_UP:
+ continue;
+ case IDR_GO_TOP:
+ goto restart;
+ default:
+ m = action;
break;
+ }
+ BUG_ON(m < 0);
/*
* Create the layer below if it is missing.
*/
@@ -175,7 +219,7 @@ static int sub_alloc(struct idr *idp, in
pa[l--] = p;
p = p->ary[m];
}
-
+end_loop:
pa[l] = p;
return id;
}
@@ -219,16 +263,13 @@ build_up:
spin_unlock_irqrestore(&idp->lock, flags);
return -1;
}
- new->ary[0] = p;
- new->count = 1;
- if (p->bitmap == IDR_FULL)
- __set_bit(0, &new->bitmap);
+ _idr_set_new_slot(new, p);
p = new;
}
idp->top = p;
idp->layers = layers;
v = sub_alloc(idp, &id, pa);
- if (v == -2)
+ if (v == IDR_NEED_TO_GROW)
goto build_up;
return(v);
}
@@ -246,7 +287,7 @@ static int idr_get_new_above_int(struct
*/
pa[0]->ary[id & IDR_MASK] = (struct idr_layer *)ptr;
pa[0]->count++;
- idr_mark_full(pa, id);
+ _idr_mark_full(pa, id);
}
return id;
@@ -277,12 +318,8 @@ int idr_get_new_above(struct idr *idp, v
* This is a cheap hack until the IDR code can be fixed to
* return proper error values.
*/
- if (rv < 0) {
- if (rv == -1)
- return -EAGAIN;
- else /* Will be -3 */
- return -ENOSPC;
- }
+ if (rv < 0)
+ return _idr_rc_to_errno(rv);
*id = rv;
return 0;
}
@@ -312,12 +349,8 @@ int idr_get_new(struct idr *idp, void *p
* This is a cheap hack until the IDR code can be fixed to
* return proper error values.
*/
- if (rv < 0) {
- if (rv == -1)
- return -EAGAIN;
- else /* Will be -3 */
- return -ENOSPC;
- }
+ if (rv < 0)
+ return _idr_rc_to_errno(rv);
*id = rv;
return 0;
}
@@ -694,12 +727,8 @@ int ida_get_new_above(struct ida *ida, i
restart:
/* get vacant slot */
t = idr_get_empty_slot(&ida->idr, idr_id, pa);
- if (t < 0) {
- if (t == -1)
- return -EAGAIN;
- else /* will be -3 */
- return -ENOSPC;
- }
+ if (t < 0)
+ return _idr_rc_to_errno(t);
if (t * IDA_BITMAP_BITS >= MAX_ID_BIT)
return -ENOSPC;
@@ -739,7 +768,7 @@ int ida_get_new_above(struct ida *ida, i
__set_bit(t, bitmap->bitmap);
if (++bitmap->nr_busy == IDA_BITMAP_BITS)
- idr_mark_full(pa, idr_id);
+ _idr_mark_full(pa, idr_id);
*p_id = id;
Index: linux-2.6.25-mm1/lib/ridr.c
===================================================================
--- linux-2.6.25-mm1.orig/lib/ridr.c 2008-04-29 13:23:17.000000000 +0200
+++ linux-2.6.25-mm1/lib/ridr.c 2008-04-29 14:03:35.000000000 +0200
@@ -11,6 +11,23 @@
static struct kmem_cache *ridr_layer_cache;
+static struct ridr_layer *get_from_free_list(struct ridr *idp)
+{
+ struct ridr_layer *q;
+ struct idr_layer *p;
+ unsigned long flags;
+
+ spin_lock_irqsave(&idp->lock, flags);
+ if ((q = idp->id_free)) {
+ p = ridr_to_idr(q);
+ idp->id_free = p->ary[0];
+ idp->id_free_cnt--;
+ p->ary[0] = NULL;
+ }
+ spin_unlock_irqrestore(&idp->lock, flags);
+ return(q);
+}
+
/* only called when idp->lock is held */
/* moves an ridr_layer to the free list */
static void __move_to_free_list(struct ridr *idp, struct ridr_layer *p)
@@ -57,6 +74,164 @@ int ridr_pre_get(struct ridr *idp, gfp_t
}
EXPORT_SYMBOL(ridr_pre_get);
+static int sub_alloc(struct ridr *idp, int *starting_id,
+ struct ridr_layer **rpa, struct idr_layer **pa)
+{
+ int m;
+ struct ridr_layer *q, *new;
+ struct idr_layer *p;
+ int l, id;
+
+ id = *starting_id;
+ restart:
+ q = idp->top;
+ p = ridr_to_idr(q);
+ l = idp->layers;
+ pa[l] = NULL;
+ rpa[l--] = NULL;
+ while (1) {
+ /*
+ * We run around this while until we reach the leaf node...
+ */
+ int action = _idr_sub_alloc(&id, &l, &p, pa);
+ switch (action) {
+ case IDR_NEED_TO_GROW:
+ *starting_id = id;
+ case IDR_NOMORE_SPACE:
+ return action;
+ case IDR_DONE:
+ goto end_loop;
+ case IDR_GO_UP:
+ continue;
+ case IDR_GO_TOP:
+ goto restart;
+ default:
+ m = action;
+ break;
+ }
+ BUG_ON(m < 0);
+ /*
+ * Create the layer below if it is missing.
+ */
+ if (!p->ary[m]) {
+ new = get_from_free_list(idp);
+ if (!new)
+ return -1;
+ rcu_assign_pointer(p->ary[m], new);
+ p->count++;
+ }
+ pa[l] = p;
+ rpa[l--] = idr_to_ridr(p);
+ p = p->ary[m];
+ }
+
+end_loop:
+ pa[l] = p;
+ rpa[l] = idr_to_ridr(p);
+ return id;
+}
+
+static int ridr_get_empty_slot(struct ridr *idp, int starting_id,
+ struct ridr_layer **rpa, struct idr_layer **pa)
+{
+ struct ridr_layer *p, *rnew;
+ int layers, v, id;
+ unsigned long flags;
+
+ id = starting_id;
+build_up:
+ p = idp->top;
+ layers = idp->layers;
+ if (unlikely(!p)) {
+ p = get_from_free_list(idp);
+ if (!p)
+ return -1;
+ layers = 1;
+ }
+ /*
+ * Add a new layer to the top of the tree if the requested
+ * id is larger than the currently allocated space.
+ */
+ while (layers < MAX_LEVEL - 1 && id >= (1 << (layers * IDR_BITS))) {
+ layers++;
+ if (!p->idr.count)
+ continue;
+ rnew = get_from_free_list(idp);
+ if (!rnew) {
+ /*
+ * The allocation failed. If we built part of
+ * the structure tear it down.
+ */
+ spin_lock_irqsave(&idp->lock, flags);
+ for (rnew = p; p && p != idp->top; rnew = p) {
+ p = p->idr.ary[0];
+ rnew->idr.ary[0] = NULL;
+ rnew->idr.bitmap = rnew->idr.count = 0;
+ __move_to_free_list(idp, rnew);
+ }
+ spin_unlock_irqrestore(&idp->lock, flags);
+ return -1;
+ }
+ _idr_set_new_slot(ridr_to_idr(rnew), ridr_to_idr(p));
+ p = rnew;
+ }
+ rcu_assign_pointer(idp->top, p);
+ idp->layers = layers;
+ v = sub_alloc(idp, &id, rpa, pa);
+ if (v == IDR_NEED_TO_GROW)
+ goto build_up;
+ return(v);
+}
+
+static int ridr_get_new_above_int(struct ridr *idp, void *ptr, int starting_id)
+{
+ struct ridr_layer *rpa[MAX_LEVEL];
+ struct idr_layer *pa[MAX_LEVEL];
+ int id;
+
+ id = ridr_get_empty_slot(idp, starting_id, rpa, pa);
+ if (id >= 0) {
+ /*
+ * Successfully found an empty slot. Install the user
+ * pointer and mark the slot full.
+ */
+ rcu_assign_pointer(pa[0]->ary[id & IDR_MASK],
+ (struct ridr_layer *)ptr);
+ pa[0]->count++;
+ _idr_mark_full(pa, id);
+ }
+
+ return id;
+}
+
+/**
+ * ridr_get_new_above - allocate new ridr entry above or equal to a start id
+ * @idp: ridr handle
+ * @ptr: pointer you want associated with the ide
+ * @start_id: id to start search at
+ * @id: pointer to the allocated handle
+ *
+ * This is the allocate id function. It should be called with any
+ * required locks.
+ *
+ * If memory is required, it will return -EAGAIN, you should unlock
+ * and go back to the ridr_pre_get() call. If the ridr is full, it will
+ * return -ENOSPC.
+ *
+ * @id returns a value in the range 0 ... 0x7fffffff
+ */
+int ridr_get_new_above(struct ridr *idp, void *ptr, int starting_id, int *id)
+{
+ int rv;
+
+ rv = ridr_get_new_above_int(idp, ptr, starting_id);
+ if (rv < 0)
+ return _idr_rc_to_errno(rv);
+ *id = rv;
+ return 0;
+}
+EXPORT_SYMBOL(ridr_get_new_above);
+
static void ridr_cache_ctor(struct kmem_cache *ridr_layer_cache,
void *ridr_layer)
{
--
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 06/10] Introduce ridr_get_new()
2008-04-29 14:33 [PATCH 00/10] Scalability requirements for sysv ipc - v2 Nadia.Derbey
` (4 preceding siblings ...)
2008-04-29 14:33 ` [PATCH 05/10] Introduce ridr_get_new_above() Nadia.Derbey
@ 2008-04-29 14:33 ` Nadia.Derbey
2008-05-01 4:32 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 07/10] Introduce ridr_find() Nadia.Derbey
` (4 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Nadia.Derbey @ 2008-04-29 14:33 UTC (permalink / raw)
To: manfred, paulmck; +Cc: linux-kernel, efault, akpm, Nadia Derbey
[-- Attachment #1: ridr_get_new.patch --]
[-- Type: text/plain, Size: 1916 bytes --]
[PATCH 06/10]
This patch introduces the ridr_get_new() routine.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
---
include/linux/ridr.h | 1 +
lib/ridr.c | 27 +++++++++++++++++++++++++++
2 files changed, 28 insertions(+)
Index: linux-2.6.25-mm1/include/linux/ridr.h
===================================================================
--- linux-2.6.25-mm1.orig/include/linux/ridr.h 2008-04-29 14:03:35.000000000 +0200
+++ linux-2.6.25-mm1/include/linux/ridr.h 2008-04-29 14:16:09.000000000 +0200
@@ -44,6 +44,7 @@ struct ridr {
* This is what we export.
*/
int ridr_pre_get(struct ridr *, gfp_t);
+int ridr_get_new(struct ridr *, void *, int *);
int ridr_get_new_above(struct ridr *, void *, int, int *);
void ridr_init(struct ridr *);
Index: linux-2.6.25-mm1/lib/ridr.c
===================================================================
--- linux-2.6.25-mm1.orig/lib/ridr.c 2008-04-29 14:03:35.000000000 +0200
+++ linux-2.6.25-mm1/lib/ridr.c 2008-04-29 14:17:05.000000000 +0200
@@ -232,6 +232,33 @@ int ridr_get_new_above(struct ridr *idp,
}
EXPORT_SYMBOL(ridr_get_new_above);
+/**
+ * ridr_get_new - allocate new ridr entry
+ * @idp: ridr handle
+ * @ptr: pointer you want associated with the ide
+ * @id: pointer to the allocated handle
+ *
+ * This is the allocate id function. It should be called with any
+ * required locks.
+ *
+ * If memory is required, it will return -EAGAIN, you should unlock
+ * and go back to the ridr_pre_get() call. If the ridr is full, it will
+ * return -ENOSPC.
+ *
+ * @id returns a value in the range 0 ... 0x7fffffff
+ */
+int ridr_get_new(struct ridr *idp, void *ptr, int *id)
+{
+ int rv;
+
+ rv = ridr_get_new_above_int(idp, ptr, 0);
+ if (rv < 0)
+ return _idr_rc_to_errno(rv);
+ *id = rv;
+ return 0;
+}
+EXPORT_SYMBOL(ridr_get_new);
+
static void ridr_cache_ctor(struct kmem_cache *ridr_layer_cache,
void *ridr_layer)
{
--
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 07/10] Introduce ridr_find()
2008-04-29 14:33 [PATCH 00/10] Scalability requirements for sysv ipc - v2 Nadia.Derbey
` (5 preceding siblings ...)
2008-04-29 14:33 ` [PATCH 06/10] Introduce ridr_get_new() Nadia.Derbey
@ 2008-04-29 14:33 ` Nadia.Derbey
2008-05-01 4:32 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 08/10] Introduce ridr_remove() Nadia.Derbey
` (3 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Nadia.Derbey @ 2008-04-29 14:33 UTC (permalink / raw)
To: manfred, paulmck; +Cc: linux-kernel, efault, akpm, Nadia Derbey
[-- Attachment #1: ridr_find.patch --]
[-- Type: text/plain, Size: 2848 bytes --]
[PATCH 07/10]
This patch introduces the ridr_find() routine.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
---
include/linux/ridr.h | 17 +++++++++++++++++
lib/ridr.c | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 51 insertions(+)
Index: linux-2.6.25-mm1/include/linux/ridr.h
===================================================================
--- linux-2.6.25-mm1.orig/include/linux/ridr.h 2008-04-29 14:16:09.000000000 +0200
+++ linux-2.6.25-mm1/include/linux/ridr.h 2008-04-29 14:31:43.000000000 +0200
@@ -40,9 +40,26 @@ struct ridr {
#define idr_to_ridr(p) container_of((p), struct ridr_layer, idr)
#define ridr_to_idr(p) (&((p)->idr))
+/**
+ * Ridr synchronization (see radix-tree.h)
+ *
+ * ridr_find() is able to be called locklessly, using RCU. The caller must
+ * ensure calls to this function are made within rcu_read_lock() regions.
+ * Other readers (lock-free or otherwise) and modifications may be running
+ * concurrently.
+ *
+ * It is still required that the caller manage the synchronization and
+ * lifetimes of the items. So if RCU lock-free lookups are used, typically
+ * this would mean that the items have their own locks, or are amenable to
+ * lock-free access; and that the items are freed by RCU (or only freed after
+ * having been deleted from the ridr tree *and* a synchronize_rcu() grace
+ * period).
+ */
+
/*
* This is what we export.
*/
+void *ridr_find(struct ridr *, int);
int ridr_pre_get(struct ridr *, gfp_t);
int ridr_get_new(struct ridr *, void *, int *);
int ridr_get_new_above(struct ridr *, void *, int, int *);
Index: linux-2.6.25-mm1/lib/ridr.c
===================================================================
--- linux-2.6.25-mm1.orig/lib/ridr.c 2008-04-29 14:17:05.000000000 +0200
+++ linux-2.6.25-mm1/lib/ridr.c 2008-04-29 14:32:26.000000000 +0200
@@ -259,6 +259,40 @@ int ridr_get_new(struct ridr *idp, void
}
EXPORT_SYMBOL(ridr_get_new);
+/**
+ * ridr_find - return pointer for given id
+ * @idp: ridr handle
+ * @id: lookup key
+ *
+ * Return the pointer given the id it has been registered with. A %NULL
+ * return indicates that @id is not valid or you passed %NULL in
+ * ridr_get_new().
+ *
+ * This function can be called under rcu_read_lock(), given that the leaf
+ * pointers lifetimes are correctly managed.
+ */
+void *ridr_find(struct ridr *idp, int id)
+{
+ int n;
+ struct ridr_layer *p;
+
+ n = idp->layers * IDR_BITS;
+ p = rcu_dereference(idp->top);
+
+ /* Mask off upper bits we don't use for the search. */
+ id &= MAX_ID_MASK;
+
+ if (id >= (1 << n))
+ return NULL;
+
+ while (n > 0 && p) {
+ n -= IDR_BITS;
+ p = rcu_dereference(p->idr.ary[(id >> n) & IDR_MASK]);
+ }
+ return((void *)p);
+}
+EXPORT_SYMBOL(ridr_find);
+
static void ridr_cache_ctor(struct kmem_cache *ridr_layer_cache,
void *ridr_layer)
{
--
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 08/10] Introduce ridr_remove()
2008-04-29 14:33 [PATCH 00/10] Scalability requirements for sysv ipc - v2 Nadia.Derbey
` (6 preceding siblings ...)
2008-04-29 14:33 ` [PATCH 07/10] Introduce ridr_find() Nadia.Derbey
@ 2008-04-29 14:33 ` Nadia.Derbey
2008-05-01 4:32 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 09/10] Integrate the ridr code into IPC code Nadia.Derbey
` (2 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Nadia.Derbey @ 2008-04-29 14:33 UTC (permalink / raw)
To: manfred, paulmck; +Cc: linux-kernel, efault, akpm, Nadia Derbey
[-- Attachment #1: ridr_remove.patch --]
[-- Type: text/plain, Size: 4993 bytes --]
[PATCH 08/10]
This patch introduces the ridr_remove() routine.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
---
include/linux/idr.h | 1
include/linux/ridr.h | 1
lib/idr.c | 7 ++--
lib/ridr.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 89 insertions(+), 3 deletions(-)
Index: linux-2.6.25-mm1/include/linux/ridr.h
===================================================================
--- linux-2.6.25-mm1.orig/include/linux/ridr.h 2008-04-29 14:31:43.000000000 +0200
+++ linux-2.6.25-mm1/include/linux/ridr.h 2008-04-29 14:34:53.000000000 +0200
@@ -63,6 +63,7 @@ void *ridr_find(struct ridr *, int);
int ridr_pre_get(struct ridr *, gfp_t);
int ridr_get_new(struct ridr *, void *, int *);
int ridr_get_new_above(struct ridr *, void *, int, int *);
+void ridr_remove(struct ridr *, int);
void ridr_init(struct ridr *);
Index: linux-2.6.25-mm1/include/linux/idr.h
===================================================================
--- linux-2.6.25-mm1.orig/include/linux/idr.h 2008-04-29 14:08:47.000000000 +0200
+++ linux-2.6.25-mm1/include/linux/idr.h 2008-04-29 14:35:37.000000000 +0200
@@ -91,6 +91,7 @@ static inline void _idr_set_new_slot(str
void _idr_mark_full(struct idr_layer **, int);
int _idr_sub_alloc(int *, int *, struct idr_layer **, struct idr_layer **);
+void _idr_remove_warning(const char *, int);
/*
* This is what we export.
Index: linux-2.6.25-mm1/lib/idr.c
===================================================================
--- linux-2.6.25-mm1.orig/lib/idr.c 2008-04-29 14:06:20.000000000 +0200
+++ linux-2.6.25-mm1/lib/idr.c 2008-04-29 14:40:12.000000000 +0200
@@ -356,9 +356,10 @@ int idr_get_new(struct idr *idp, void *p
}
EXPORT_SYMBOL(idr_get_new);
-static void idr_remove_warning(int id)
+void _idr_remove_warning(const char *caller, int id)
{
- printk("idr_remove called for id=%d which is not allocated.\n", id);
+ printk(KERN_INFO "%s called for id=%d which is not allocated.\n",
+ caller, id);
dump_stack();
}
@@ -390,7 +391,7 @@ static void sub_remove(struct idr *idp,
if (!*paa)
idp->layers = 0;
} else
- idr_remove_warning(id);
+ _idr_remove_warning("idr_remove", id);
}
/**
Index: linux-2.6.25-mm1/lib/ridr.c
===================================================================
--- linux-2.6.25-mm1.orig/lib/ridr.c 2008-04-29 14:32:26.000000000 +0200
+++ linux-2.6.25-mm1/lib/ridr.c 2008-04-29 14:41:57.000000000 +0200
@@ -28,6 +28,19 @@ static struct ridr_layer *get_from_free_
return(q);
}
+static void ridr_layer_rcu_free(struct rcu_head *head)
+{
+ struct ridr_layer *layer;
+
+ layer = container_of(head, struct ridr_layer, rcu_head);
+ kmem_cache_free(ridr_layer_cache, layer);
+}
+
+static inline void free_layer(struct ridr_layer *p)
+{
+ call_rcu(&p->rcu_head, ridr_layer_rcu_free);
+}
+
/* only called when idp->lock is held */
/* moves an ridr_layer to the free list */
static void __move_to_free_list(struct ridr *idp, struct ridr_layer *p)
@@ -259,6 +272,76 @@ int ridr_get_new(struct ridr *idp, void
}
EXPORT_SYMBOL(ridr_get_new);
+static void sub_remove(struct ridr *idp, int shift, int id)
+{
+ struct ridr_layer *p = idp->top;
+ struct ridr_layer **pa[MAX_LEVEL];
+ struct ridr_layer ***paa = &pa[0];
+ struct ridr_layer *to_free;
+ int n;
+
+ *paa = NULL;
+ *++paa = &idp->top;
+
+ while ((shift > 0) && p) {
+ n = (id >> shift) & IDR_MASK;
+ __clear_bit(n, &p->idr.bitmap);
+ *++paa = (struct ridr_layer **) &(p->idr.ary[n]);
+ p = p->idr.ary[n];
+ shift -= IDR_BITS;
+ }
+ n = id & IDR_MASK;
+ if (likely(p != NULL && test_bit(n, &p->idr.bitmap))) {
+ __clear_bit(n, &p->idr.bitmap);
+ rcu_assign_pointer(p->idr.ary[n], NULL);
+ to_free = NULL;
+ while (*paa && !--((**paa)->idr.count)) {
+ if (to_free)
+ free_layer(to_free);
+ to_free = **paa;
+ **paa-- = NULL;
+ }
+ if (!*paa)
+ idp->layers = 0;
+ if (to_free)
+ free_layer(to_free);
+ } else
+ _idr_remove_warning("ridr_remove", id);
+}
+/**
+ * ridr_remove - remove the given id and free it's slot
+ * @idp: ridr handle
+ * @id: unique key
+ */
+void ridr_remove(struct ridr *idp, int id)
+{
+ struct ridr_layer *p;
+ struct ridr_layer *to_free;
+
+ /* Mask off upper bits we don't use for the search. */
+ id &= MAX_ID_MASK;
+
+ sub_remove(idp, (idp->layers - 1) * IDR_BITS, id);
+ if (idp->top && idp->top->idr.count == 1 && (idp->layers > 1) &&
+ idp->top->idr.ary[0]) {
+ /*
+ * Single child at leftmost slot: we can shrink the tree.
+ * This level is not needed anymore since when layers are
+ * inserted, they are inserted at the top of the existing
+ * tree.
+ */
+ to_free = idp->top;
+ p = idp->top->idr.ary[0];
+ rcu_assign_pointer(idp->top, p);
+ --idp->layers;
+ to_free->idr.bitmap = to_free->idr.count = 0;
+ to_free->idr.ary[0] = NULL;
+ free_layer(to_free);
+ }
+ return;
+}
+EXPORT_SYMBOL(ridr_remove);
+
/**
* ridr_find - return pointer for given id
* @idp: ridr handle
--
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 09/10] Integrate the ridr code into IPC code
2008-04-29 14:33 [PATCH 00/10] Scalability requirements for sysv ipc - v2 Nadia.Derbey
` (7 preceding siblings ...)
2008-04-29 14:33 ` [PATCH 08/10] Introduce ridr_remove() Nadia.Derbey
@ 2008-04-29 14:33 ` Nadia.Derbey
2008-05-01 4:32 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 10/10] Get rid of ipc_lock_down() Nadia.Derbey
2008-04-29 18:54 ` [PATCH 00/10] Scalability requirements for sysv ipc - v2 Andrew Morton
10 siblings, 1 reply; 26+ messages in thread
From: Nadia.Derbey @ 2008-04-29 14:33 UTC (permalink / raw)
To: manfred, paulmck; +Cc: linux-kernel, efault, akpm, Nadia Derbey
[-- Attachment #1: ipc_use_ridr.patch --]
[-- Type: text/plain, Size: 7179 bytes --]
[PATCH 09/10]
This patche makes the ipc ode use the ridr API.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
---
include/linux/ipc_namespace.h | 4 ++--
ipc/namespace.c | 2 +-
ipc/shm.c | 2 +-
ipc/util.c | 41 ++++++++++++++++-------------------------
4 files changed, 20 insertions(+), 29 deletions(-)
Index: linux-2.6.25-mm1/include/linux/ipc_namespace.h
===================================================================
--- linux-2.6.25-mm1.orig/include/linux/ipc_namespace.h 2008-04-25 15:29:00.000000000 +0200
+++ linux-2.6.25-mm1/include/linux/ipc_namespace.h 2008-04-29 14:46:34.000000000 +0200
@@ -2,7 +2,7 @@
#define __IPC_NAMESPACE_H__
#include <linux/err.h>
-#include <linux/idr.h>
+#include <linux/ridr.h>
#include <linux/rwsem.h>
#include <linux/notifier.h>
@@ -21,7 +21,7 @@ struct ipc_ids {
unsigned short seq;
unsigned short seq_max;
struct rw_semaphore rw_mutex;
- struct idr ipcs_idr;
+ struct ridr ipcs_ridr;
};
struct ipc_namespace {
Index: linux-2.6.25-mm1/ipc/namespace.c
===================================================================
--- linux-2.6.25-mm1.orig/ipc/namespace.c 2008-04-25 15:29:13.000000000 +0200
+++ linux-2.6.25-mm1/ipc/namespace.c 2008-04-29 14:47:25.000000000 +0200
@@ -74,7 +74,7 @@ void free_ipcs(struct ipc_namespace *ns,
in_use = ids->in_use;
for (total = 0, next_id = 0; total < in_use; next_id++) {
- perm = idr_find(&ids->ipcs_idr, next_id);
+ perm = ridr_find(&ids->ipcs_ridr, next_id);
if (perm == NULL)
continue;
ipc_lock_by_ptr(perm);
Index: linux-2.6.25-mm1/ipc/shm.c
===================================================================
--- linux-2.6.25-mm1.orig/ipc/shm.c 2008-04-25 15:29:13.000000000 +0200
+++ linux-2.6.25-mm1/ipc/shm.c 2008-04-29 14:48:26.000000000 +0200
@@ -569,7 +569,7 @@ static void shm_get_stat(struct ipc_name
struct shmid_kernel *shp;
struct inode *inode;
- shp = idr_find(&shm_ids(ns).ipcs_idr, next_id);
+ shp = ridr_find(&shm_ids(ns).ipcs_ridr, next_id);
if (shp == NULL)
continue;
Index: linux-2.6.25-mm1/ipc/util.c
===================================================================
--- linux-2.6.25-mm1.orig/ipc/util.c 2008-04-25 15:29:13.000000000 +0200
+++ linux-2.6.25-mm1/ipc/util.c 2008-04-29 14:53:14.000000000 +0200
@@ -122,7 +122,7 @@ __initcall(ipc_init);
* @ids: Identifier set
*
* Set up the sequence range to use for the ipc identifier range (limited
- * below IPCMNI) then initialise the ids idr.
+ * below IPCMNI) then initialise the ids ridr.
*/
void ipc_init_ids(struct ipc_ids *ids)
@@ -139,7 +139,7 @@ void ipc_init_ids(struct ipc_ids *ids)
ids->seq_max = seq_limit;
}
- idr_init(&ids->ipcs_idr);
+ ridr_init(&ids->ipcs_ridr);
}
#ifdef CONFIG_PROC_FS
@@ -194,7 +194,7 @@ static struct kern_ipc_perm *ipc_findkey
int total;
for (total = 0, next_id = 0; total < ids->in_use; next_id++) {
- ipc = idr_find(&ids->ipcs_idr, next_id);
+ ipc = ridr_find(&ids->ipcs_ridr, next_id);
if (ipc == NULL)
continue;
@@ -233,7 +233,7 @@ int ipc_get_maxid(struct ipc_ids *ids)
/* Look for the last assigned id */
total = 0;
for (id = 0; id < IPCMNI && total < ids->in_use; id++) {
- ipc = idr_find(&ids->ipcs_idr, id);
+ ipc = ridr_find(&ids->ipcs_ridr, id);
if (ipc != NULL) {
max_id = id;
total++;
@@ -248,7 +248,7 @@ int ipc_get_maxid(struct ipc_ids *ids)
* @new: new IPC permission set
* @size: limit for the number of used ids
*
- * Add an entry 'new' to the IPC ids idr. The permissions object is
+ * Add an entry 'new' to the IPC ids ridr. The permissions object is
* initialised and the first free entry is set up and the id assigned
* is returned. The 'new' entry is returned in a locked state on success.
* On failure the entry is not locked and a negative err-code is returned.
@@ -266,7 +266,7 @@ int ipc_addid(struct ipc_ids* ids, struc
if (ids->in_use >= size)
return -ENOSPC;
- err = idr_get_new(&ids->ipcs_idr, new, &id);
+ err = ridr_get_new(&ids->ipcs_ridr, new, &id);
if (err)
return err;
@@ -302,7 +302,7 @@ static int ipcget_new(struct ipc_namespa
{
int err;
retry:
- err = idr_pre_get(&ids->ipcs_idr, GFP_KERNEL);
+ err = ridr_pre_get(&ids->ipcs_ridr, GFP_KERNEL);
if (!err)
return -ENOMEM;
@@ -368,7 +368,7 @@ static int ipcget_public(struct ipc_name
int flg = params->flg;
int err;
retry:
- err = idr_pre_get(&ids->ipcs_idr, GFP_KERNEL);
+ err = ridr_pre_get(&ids->ipcs_ridr, GFP_KERNEL);
/*
* Take the lock as a writer since we are potentially going to add
@@ -424,7 +424,7 @@ void ipc_rmid(struct ipc_ids *ids, struc
{
int lid = ipcid_to_idx(ipcp->id);
- idr_remove(&ids->ipcs_idr, lid);
+ ridr_remove(&ids->ipcs_ridr, lid);
ids->in_use--;
@@ -685,13 +685,9 @@ void ipc64_perm_to_ipc_perm (struct ipc6
* @ids: IPC identifier set
* @id: ipc id to look for
*
- * Look for an id in the ipc ids idr and lock the associated ipc object.
+ * Look for an id in the ipc ids ridr and lock the associated ipc object.
*
* The ipc object is locked on exit.
- *
- * This is the routine that should be called when the rw_mutex is not already
- * held, i.e. idr tree not protected: it protects the idr tree in read mode
- * during the idr_find().
*/
struct kern_ipc_perm *ipc_lock(struct ipc_ids *ids, int id)
@@ -699,18 +695,13 @@ struct kern_ipc_perm *ipc_lock(struct ip
struct kern_ipc_perm *out;
int lid = ipcid_to_idx(id);
- down_read(&ids->rw_mutex);
-
rcu_read_lock();
- out = idr_find(&ids->ipcs_idr, lid);
+ out = ridr_find(&ids->ipcs_ridr, lid);
if (out == NULL) {
rcu_read_unlock();
- up_read(&ids->rw_mutex);
return ERR_PTR(-EINVAL);
}
- up_read(&ids->rw_mutex);
-
spin_lock(&out->lock);
/* ipc_rmid() may have already freed the ID while ipc_lock
@@ -730,12 +721,12 @@ struct kern_ipc_perm *ipc_lock(struct ip
* @ids: IPC identifier set
* @id: ipc id to look for
*
- * Look for an id in the ipc ids idr and lock the associated ipc object.
+ * Look for an id in the ipc ids ridr and lock the associated ipc object.
*
* The ipc object is locked on exit.
*
* This is the routine that should be called when the rw_mutex is already
- * held, i.e. idr tree protected.
+ * held, i.e. ridr tree protected.
*/
struct kern_ipc_perm *ipc_lock_down(struct ipc_ids *ids, int id)
@@ -744,7 +735,7 @@ struct kern_ipc_perm *ipc_lock_down(stru
int lid = ipcid_to_idx(id);
rcu_read_lock();
- out = idr_find(&ids->ipcs_idr, lid);
+ out = ridr_find(&ids->ipcs_ridr, lid);
if (out == NULL) {
rcu_read_unlock();
return ERR_PTR(-EINVAL);
@@ -915,7 +906,7 @@ static struct kern_ipc_perm *sysvipc_fin
total = 0;
for (id = 0; id < pos && total < ids->in_use; id++) {
- ipc = idr_find(&ids->ipcs_idr, id);
+ ipc = ridr_find(&ids->ipcs_ridr, id);
if (ipc != NULL)
total++;
}
@@ -924,7 +915,7 @@ static struct kern_ipc_perm *sysvipc_fin
return NULL;
for ( ; pos < IPCMNI; pos++) {
- ipc = idr_find(&ids->ipcs_idr, pos);
+ ipc = ridr_find(&ids->ipcs_ridr, pos);
if (ipc != NULL) {
*new_pos = pos + 1;
ipc_lock_by_ptr(ipc);
--
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 10/10] Get rid of ipc_lock_down()
2008-04-29 14:33 [PATCH 00/10] Scalability requirements for sysv ipc - v2 Nadia.Derbey
` (8 preceding siblings ...)
2008-04-29 14:33 ` [PATCH 09/10] Integrate the ridr code into IPC code Nadia.Derbey
@ 2008-04-29 14:33 ` Nadia.Derbey
2008-05-01 4:33 ` Tim Pepper
2008-04-29 18:54 ` [PATCH 00/10] Scalability requirements for sysv ipc - v2 Andrew Morton
10 siblings, 1 reply; 26+ messages in thread
From: Nadia.Derbey @ 2008-04-29 14:33 UTC (permalink / raw)
To: manfred, paulmck; +Cc: linux-kernel, efault, akpm, Nadia Derbey
[-- Attachment #1: remove_ipc_lock_down.patch --]
[-- Type: text/plain, Size: 4569 bytes --]
[PATCH 10/10]
With the previous patch ipc_lock() becomes lockless.
So there is no need anymore for ipc_lock_down()-like routines.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
---
ipc/shm.c | 21 +++------------------
ipc/util.c | 52 +---------------------------------------------------
ipc/util.h | 6 ------
3 files changed, 4 insertions(+), 75 deletions(-)
Index: linux-2.6.25-mm1/ipc/util.c
===================================================================
--- linux-2.6.25-mm1.orig/ipc/util.c 2008-04-29 14:53:14.000000000 +0200
+++ linux-2.6.25-mm1/ipc/util.c 2008-04-29 14:59:06.000000000 +0200
@@ -716,56 +716,6 @@ struct kern_ipc_perm *ipc_lock(struct ip
return out;
}
-/**
- * ipc_lock_down - Lock an ipc structure with rw_sem held
- * @ids: IPC identifier set
- * @id: ipc id to look for
- *
- * Look for an id in the ipc ids ridr and lock the associated ipc object.
- *
- * The ipc object is locked on exit.
- *
- * This is the routine that should be called when the rw_mutex is already
- * held, i.e. ridr tree protected.
- */
-
-struct kern_ipc_perm *ipc_lock_down(struct ipc_ids *ids, int id)
-{
- struct kern_ipc_perm *out;
- int lid = ipcid_to_idx(id);
-
- rcu_read_lock();
- out = ridr_find(&ids->ipcs_ridr, lid);
- if (out == NULL) {
- rcu_read_unlock();
- return ERR_PTR(-EINVAL);
- }
-
- spin_lock(&out->lock);
-
- /*
- * No need to verify that the structure is still valid since the
- * rw_mutex is held.
- */
- return out;
-}
-
-struct kern_ipc_perm *ipc_lock_check_down(struct ipc_ids *ids, int id)
-{
- struct kern_ipc_perm *out;
-
- out = ipc_lock_down(ids, id);
- if (IS_ERR(out))
- return out;
-
- if (ipc_checkid(out, id)) {
- ipc_unlock(out);
- return ERR_PTR(-EIDRM);
- }
-
- return out;
-}
-
struct kern_ipc_perm *ipc_lock_check(struct ipc_ids *ids, int id)
{
struct kern_ipc_perm *out;
@@ -837,7 +787,7 @@ struct kern_ipc_perm *ipcctl_pre_down(st
int err;
down_write(&ids->rw_mutex);
- ipcp = ipc_lock_check_down(ids, id);
+ ipcp = ipc_lock_check(ids, id);
if (IS_ERR(ipcp)) {
err = PTR_ERR(ipcp);
goto out_up;
Index: linux-2.6.25-mm1/ipc/util.h
===================================================================
--- linux-2.6.25-mm1.orig/ipc/util.h 2008-04-25 15:29:13.000000000 +0200
+++ linux-2.6.25-mm1/ipc/util.h 2008-04-29 14:59:49.000000000 +0200
@@ -102,11 +102,6 @@ void* ipc_rcu_alloc(int size);
void ipc_rcu_getref(void *ptr);
void ipc_rcu_putref(void *ptr);
-/*
- * ipc_lock_down: called with rw_mutex held
- * ipc_lock: called without that lock held
- */
-struct kern_ipc_perm *ipc_lock_down(struct ipc_ids *, int);
struct kern_ipc_perm *ipc_lock(struct ipc_ids *, int);
void kernel_to_ipc64_perm(struct kern_ipc_perm *in, struct ipc64_perm *out);
@@ -155,7 +150,6 @@ static inline void ipc_unlock(struct ker
rcu_read_unlock();
}
-struct kern_ipc_perm *ipc_lock_check_down(struct ipc_ids *ids, int id);
struct kern_ipc_perm *ipc_lock_check(struct ipc_ids *ids, int id);
int ipcget(struct ipc_namespace *ns, struct ipc_ids *ids,
struct ipc_ops *ops, struct ipc_params *params);
Index: linux-2.6.25-mm1/ipc/shm.c
===================================================================
--- linux-2.6.25-mm1.orig/ipc/shm.c 2008-04-29 14:48:26.000000000 +0200
+++ linux-2.6.25-mm1/ipc/shm.c 2008-04-29 15:01:24.000000000 +0200
@@ -112,23 +112,8 @@ void __init shm_init (void)
}
/*
- * shm_lock_(check_)down routines are called in the paths where the rw_mutex
- * is held to protect access to the idr tree.
- */
-static inline struct shmid_kernel *shm_lock_down(struct ipc_namespace *ns,
- int id)
-{
- struct kern_ipc_perm *ipcp = ipc_lock_down(&shm_ids(ns), id);
-
- if (IS_ERR(ipcp))
- return (struct shmid_kernel *)ipcp;
-
- return container_of(ipcp, struct shmid_kernel, shm_perm);
-}
-
-/*
* shm_lock_(check_) routines are called in the paths where the rw_mutex
- * is not held.
+ * is not necessarily held.
*/
static inline struct shmid_kernel *shm_lock(struct ipc_namespace *ns, int id)
{
@@ -211,7 +196,7 @@ static void shm_close(struct vm_area_str
down_write(&shm_ids(ns).rw_mutex);
/* remove from the list of attaches of the shm segment */
- shp = shm_lock_down(ns, sfd->id);
+ shp = shm_lock(ns, sfd->id);
BUG_ON(IS_ERR(shp));
shp->shm_lprid = task_tgid_vnr(current);
shp->shm_dtim = get_seconds();
@@ -933,7 +918,7 @@ invalid:
out_nattch:
down_write(&shm_ids(ns).rw_mutex);
- shp = shm_lock_down(ns, shmid);
+ shp = shm_lock(ns, shmid);
BUG_ON(IS_ERR(shp));
shp->shm_nattch--;
if(shp->shm_nattch == 0 &&
--
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 01/10] Fix idr_remove()
2008-04-29 14:33 ` [PATCH 01/10] Fix idr_remove() Nadia.Derbey
@ 2008-04-29 18:44 ` Andrew Morton
2008-05-05 9:26 ` Nadia Derbey
0 siblings, 1 reply; 26+ messages in thread
From: Andrew Morton @ 2008-04-29 18:44 UTC (permalink / raw)
To: Nadia.Derbey; +Cc: manfred, paulmck, linux-kernel, efault, Nadia.Derbey
On Tue, 29 Apr 2008 16:33:05 +0200
Nadia.Derbey@bull.net wrote:
> [PATCH 01/10]
>
> This patch fixes idr_remove(): the return inside the loop makes us free only
> a single layer.
>
> Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
>
> ---
> lib/idr.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-2.6.25-mm1/lib/idr.c
> ===================================================================
> --- linux-2.6.25-mm1.orig/lib/idr.c 2008-04-25 15:29:00.000000000 +0200
> +++ linux-2.6.25-mm1/lib/idr.c 2008-04-25 15:48:34.000000000 +0200
> @@ -385,8 +385,8 @@ void idr_remove(struct idr *idp, int id)
> while (idp->id_free_cnt >= IDR_FREE_MAX) {
> p = alloc_layer(idp);
> kmem_cache_free(idr_layer_cache, p);
> - return;
> }
> + return;
> }
> EXPORT_SYMBOL(idr_remove);
erk, ancient bug.
I _think_ the implications of this are that an idr tree will grow fatter
than it needs to be, but there is no permanent leak: idr_destroy() will
still free everything, yes?
And a consequence of the fix is that idr manipulations will now result in
more allocs and frees, but the amount of memory which a tree uses will be
less?
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 00/10] Scalability requirements for sysv ipc - v2
2008-04-29 14:33 [PATCH 00/10] Scalability requirements for sysv ipc - v2 Nadia.Derbey
` (9 preceding siblings ...)
2008-04-29 14:33 ` [PATCH 10/10] Get rid of ipc_lock_down() Nadia.Derbey
@ 2008-04-29 18:54 ` Andrew Morton
2008-05-05 16:52 ` Nadia Derbey
10 siblings, 1 reply; 26+ messages in thread
From: Andrew Morton @ 2008-04-29 18:54 UTC (permalink / raw)
To: Nadia.Derbey; +Cc: manfred, paulmck, linux-kernel, efault
On Tue, 29 Apr 2008 16:33:04 +0200
Nadia.Derbey@bull.net wrote:
> After scalability problems had been detected when using the sysV ipcs, I
> have proposed to use an RCU based implementation of the IDR api instead
> (see thread http://lkml.org/lkml/2008/4/11/212).
>
> Resending the patch series after
> . integrating Paul's remarks
> . fixing a bug I had introduced
> . porting it to 2.6.25-mm1.
>
> Reviewers are still welcome!
>
> Patch 1 can be taken alone: it fixes a problem in the existing IDR API.
>
> Patches should be applied on linux-2.6.25-mm1, in the following order:
>
> [ PATCH 01/10 ] : idr_minor_fix.patch
> [ PATCH 02/10 ] : ridr_structure.patch
> [ PATCH 03/10 ] : ridr_pre_get.patch
> [ PATCH 04/10 ] : ridr_init.patch
> [ PATCH 05/10 ] : ridr_get_new_above.patch
> [ PATCH 06/10 ] : ridr_get_new.patch
> [ PATCH 07/10 ] : ridr_find.patch
> [ PATCH 08/10 ] : ridr_remove.patch
> [ PATCH 09/10 ] : ipc_use_ridr.patch
> [ PATCH 10/10 ] : remove_ipc_lock_down.patch
>
hm, you've been busy.
As all this work is entirely for performance reasons, the changelog should
have (lots of) performance testing results! Please.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 02/10] Introduce the ridr structure
2008-04-29 14:33 ` [PATCH 02/10] Introduce the ridr structure Nadia.Derbey
@ 2008-05-01 4:30 ` Tim Pepper
0 siblings, 0 replies; 26+ messages in thread
From: Tim Pepper @ 2008-05-01 4:30 UTC (permalink / raw)
To: Nadia.Derbey; +Cc: manfred, paulmck, linux-kernel, efault, akpm
On Tue 29 Apr at 16:33:06 +0200 Nadia.Derbey@bull.net said:
> Index: linux-2.6.25-mm1/include/linux/ridr.h
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6.25-mm1/include/linux/ridr.h 2008-04-29 13:08:00.000000000 +0200
> @@ -0,0 +1,50 @@
> +/*
> + * include/linux/ridr.h
> + *
> + * Small id to pointer translation service avoiding fixed sized
> + * tables. RCU-based implmentation of IDRs.
> + */
^^^^^^^
s/implmentation/reimplementation
Additional comment here might be nice here since this amounts to a mostly
copy of idr.h and its backing implementation, with all the associated
maintenance headaches if both are in tree.
> +/*
> + * The idr_layer part should stay at the beginning of the structure
> + */
> +struct ridr_layer {
> + struct idr_layer idr;
> + struct rcu_head rcu_head;
> +};
Why does it have to be at the beginning? So an ridr_layer can be used as
an idr_layer? This doesn't help review. As previous commenters noted,
review would be a lot easier if this was incremental on idr.
> Index: linux-2.6.25-mm1/lib/Makefile
> ===================================================================
> --- linux-2.6.25-mm1.orig/lib/Makefile 2008-04-25 15:29:00.000000000 +0200
> +++ linux-2.6.25-mm1/lib/Makefile 2008-04-25 15:57:39.000000000 +0200
> @@ -6,7 +6,7 @@ lib-y := ctype.o string.o vsprintf.o cmd
> rbtree.o radix-tree.o dump_stack.o \
> idr.o int_sqrt.o extable.o prio_tree.o \
> sha1.o irq_regs.o reciprocal_div.o argv_split.o \
> - proportions.o prio_heap.o ratelimit.o
> + proportions.o prio_heap.o ratelimit.o ridr.o
I vaguely recall past discussion of preferred insertion point, but
lacking something in CodingStyle and especially given the redundancy
here it could be nice to add ridr.o next to idr.o.
> ===================================================================
> --- linux-2.6.25-mm1.orig/lib/idr.c 2008-04-25 17:22:43.000000000 +0200
> +++ linux-2.6.25-mm1/lib/idr.c 2008-04-29 13:08:05.000000000 +0200
> @@ -342,7 +342,7 @@ static void sub_remove(struct idr *idp,
> while ((shift > 0) && p) {
> n = (id >> shift) & IDR_MASK;
> __clear_bit(n, &p->bitmap);
> - *++paa = &p->ary[n];
> + *++paa = (struct idr_layer **) &p->ary[n];
> p = p->ary[n];
> shift -= IDR_BITS;
> }
For shifty use of an ridr_layer as an idr_layer?
> Index: linux-2.6.25-mm1/include/linux/idr.h
> ===================================================================
> --- linux-2.6.25-mm1.orig/include/linux/idr.h 2008-04-25 17:22:43.000000000 +0200
> +++ linux-2.6.25-mm1/include/linux/idr.h 2008-04-29 13:08:00.000000000 +0200
> @@ -48,9 +48,9 @@
> #define IDR_FREE_MAX MAX_LEVEL + MAX_LEVEL
>
> struct idr_layer {
> - unsigned long bitmap; /* A zero bit means "space here" */
> - struct idr_layer *ary[1<<IDR_BITS];
> - int count; /* When zero, we can release it */
> + unsigned long bitmap; /* A zero bit means "space here" */
> + void *ary[1<<IDR_BITS];
> + int count; /* When zero, we can release it */
> };
>
> struct idr {
>
This hunk is white space only...
--
Tim Pepper <lnxninja@linux.vnet.ibm.com>
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 03/10] Introduce ridr_pre_get()
2008-04-29 14:33 ` [PATCH 03/10] Introduce ridr_pre_get() Nadia.Derbey
@ 2008-05-01 4:31 ` Tim Pepper
0 siblings, 0 replies; 26+ messages in thread
From: Tim Pepper @ 2008-05-01 4:31 UTC (permalink / raw)
To: Nadia.Derbey; +Cc: manfred, paulmck, linux-kernel, efault, akpm
On Tue 29 Apr at 16:33:07 +0200 Nadia.Derbey@bull.net said:
> [PATCH 03/10]
>
> This patch introduces the ridr_pre_get() routine.
There's almost nothing here that isn't 1:1 duplication of idr.
--
Tim Pepper <lnxninja@linux.vnet.ibm.com>
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 04/10] Introduce ridr_init()
2008-04-29 14:33 ` [PATCH 04/10] Introduce ridr_init() Nadia.Derbey
@ 2008-05-01 4:31 ` Tim Pepper
0 siblings, 0 replies; 26+ messages in thread
From: Tim Pepper @ 2008-05-01 4:31 UTC (permalink / raw)
To: Nadia.Derbey; +Cc: manfred, paulmck, linux-kernel, efault, akpm
On Tue 29 Apr at 16:33:08 +0200 Nadia.Derbey@bull.net said:
> [PATCH 04/10]
>
> This patch introduces the ridr_init() routine.
Again nothing added.
--
Tim Pepper <lnxninja@linux.vnet.ibm.com>
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 05/10] Introduce ridr_get_new_above()
2008-04-29 14:33 ` [PATCH 05/10] Introduce ridr_get_new_above() Nadia.Derbey
@ 2008-05-01 4:32 ` Tim Pepper
2008-05-05 10:33 ` Nadia Derbey
0 siblings, 1 reply; 26+ messages in thread
From: Tim Pepper @ 2008-05-01 4:32 UTC (permalink / raw)
To: Nadia.Derbey; +Cc: manfred, paulmck, linux-kernel, efault, akpm
On Tue 29 Apr at 16:33:09 +0200 Nadia.Derbey@bull.net said:
> [PATCH 05/10]
>
> This patch introduces the ridr_get_new_above() routine, and some common code
> between the idr an ridr API's.
The ridr_get_new_above() is the first place we see something really
different compared to idr (an RCU addition). This is a lot of patching
so far for what would be a small incremental change otherwise.
> Index: linux-2.6.25-mm1/include/linux/idr.h
> ===================================================================
> --- linux-2.6.25-mm1.orig/include/linux/idr.h 2008-04-29 13:08:00.000000000 +0200
> +++ linux-2.6.25-mm1/include/linux/idr.h 2008-04-29 14:08:47.000000000 +0200
> @@ -71,6 +71,27 @@ struct idr {
> }
> #define DEFINE_IDR(name) struct idr name = IDR_INIT(name)
>
> +/* Actions to be taken after a call to _idr_sub_alloc */
> +#define IDR_DONE -1
> +#define IDR_NEED_TO_GROW -2
> +#define IDR_NOMORE_SPACE -3
> +#define IDR_GO_TOP -4
> +#define IDR_GO_UP -5
This stuff is useful on its own in as much as it improves code
readability. A lot of this patch (the "some common code between the
idr and ridr API's" part) could be a standalone patch distinct from the
ridr series and then be at the head of your stack.
> + return action;
> + case IDR_DONE:
> + goto end_loop;
> + case IDR_GO_UP:
> + continue;
> + case IDR_GO_TOP:
> + goto restart;
> + default:
> + m = action;
> break;
> + }
> + BUG_ON(m < 0);
Why the added BUG_ON()? These couple hunks are a bit muddled, so maybe I
missed something. But I don't see anything different in how m or bm are
being manipulated such that m<0 is anymore likely after this patch.
> Index: linux-2.6.25-mm1/lib/ridr.c
> ===================================================================
> --- linux-2.6.25-mm1.orig/lib/ridr.c 2008-04-29 13:23:17.000000000 +0200
> +++ linux-2.6.25-mm1/lib/ridr.c 2008-04-29 14:03:35.000000000 +0200
> @@ -11,6 +11,23 @@
> static struct kmem_cache *ridr_layer_cache;
>
>
> +static struct ridr_layer *get_from_free_list(struct ridr *idp)
> +{
> + struct ridr_layer *q;
> + struct idr_layer *p;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&idp->lock, flags);
> + if ((q = idp->id_free)) {
> + p = ridr_to_idr(q);
> + idp->id_free = p->ary[0];
> + idp->id_free_cnt--;
> + p->ary[0] = NULL;
> + }
> + spin_unlock_irqrestore(&idp->lock, flags);
> + return(q);
> +}
> +
idr's alloc_layer() in disguise?
> +static int sub_alloc(struct ridr *idp, int *starting_id,
> + struct ridr_layer **rpa, struct idr_layer **pa)
...
More or less duplication
...
> + * Create the layer below if it is missing.
> + */
> + if (!p->ary[m]) {
> + new = get_from_free_list(idp);
> + if (!new)
> + return -1;
> + rcu_assign_pointer(p->ary[m], new);
> + p->count++;
> + }
> + pa[l] = p;
> + rpa[l--] = idr_to_ridr(p);
> + p = p->ary[m];
> + }
> +
> +end_loop:
> + pa[l] = p;
> + rpa[l] = idr_to_ridr(p);
> + return id;
> +}
Oh but wait..there's some RCU-ness tucked in there.
> +
> +static int ridr_get_empty_slot(struct ridr *idp, int starting_id,
> + struct ridr_layer **rpa, struct idr_layer **pa)
> +{
> + struct ridr_layer *p, *rnew;
> + int layers, v, id;
> + unsigned long flags;
> +
> + id = starting_id;
> +build_up:
> + p = idp->top;
> + layers = idp->layers;
> + if (unlikely(!p)) {
> + p = get_from_free_list(idp);
> + if (!p)
> + return -1;
> + layers = 1;
> + }
> + /*
> + * Add a new layer to the top of the tree if the requested
> + * id is larger than the currently allocated space.
> + */
> + while (layers < MAX_LEVEL - 1 && id >= (1 << (layers * IDR_BITS))) {
^^ ^^
Dropped some parens. Otherwise more duplication...
> + rnew->idr.ary[0] = NULL;
> + rnew->idr.bitmap = rnew->idr.count = 0;
> + __move_to_free_list(idp, rnew);
> + }
> + spin_unlock_irqrestore(&idp->lock, flags);
> + return -1;
> + }
> + _idr_set_new_slot(ridr_to_idr(rnew), ridr_to_idr(p));
> + p = rnew;
> + }
> + rcu_assign_pointer(idp->top, p);
> + idp->layers = layers;
> + v = sub_alloc(idp, &id, rpa, pa);
> + if (v == IDR_NEED_TO_GROW)
> + goto build_up;
> + return(v);
> +}
Some more RCU.
> +static int ridr_get_new_above_int(struct ridr *idp, void *ptr, int starting_id)
> +{
> + struct ridr_layer *rpa[MAX_LEVEL];
> + struct idr_layer *pa[MAX_LEVEL];
> + int id;
> +
> + id = ridr_get_empty_slot(idp, starting_id, rpa, pa);
> + if (id >= 0) {
> + /*
> + * Successfully found an empty slot. Install the user
> + * pointer and mark the slot full.
> + */
> + rcu_assign_pointer(pa[0]->ary[id & IDR_MASK],
> + (struct ridr_layer *)ptr);
> + pa[0]->count++;
> + _idr_mark_full(pa, id);
> + }
> +
> + return id;
> +}
And other line of RCU.
OK. So at this point in patch 5/10 we've got 3 lines of new code and
hundreds of lines of duplicated code?
A while more looking through the rest of the patches for the rest of the
context and I might be able to actually think about the implications of
these three lines being where they are.
Locking changes are complicated enough without all this obfuscation!
I understand the desire to not break IDR, but...
--
Tim Pepper <lnxninja@linux.vnet.ibm.com>
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 06/10] Introduce ridr_get_new()
2008-04-29 14:33 ` [PATCH 06/10] Introduce ridr_get_new() Nadia.Derbey
@ 2008-05-01 4:32 ` Tim Pepper
0 siblings, 0 replies; 26+ messages in thread
From: Tim Pepper @ 2008-05-01 4:32 UTC (permalink / raw)
To: Nadia.Derbey; +Cc: manfred, paulmck, linux-kernel, efault, akpm
On Tue 29 Apr at 16:33:10 +0200 Nadia.Derbey@bull.net said:
> [PATCH 06/10]
>
> This patch introduces the ridr_get_new() routine.
Duplication + small useful cleanup that could be standalone.
--
Tim Pepper <lnxninja@linux.vnet.ibm.com>
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 07/10] Introduce ridr_find()
2008-04-29 14:33 ` [PATCH 07/10] Introduce ridr_find() Nadia.Derbey
@ 2008-05-01 4:32 ` Tim Pepper
0 siblings, 0 replies; 26+ messages in thread
From: Tim Pepper @ 2008-05-01 4:32 UTC (permalink / raw)
To: Nadia.Derbey; +Cc: manfred, paulmck, linux-kernel, efault, akpm
On Tue 29 Apr at 16:33:11 +0200 Nadia.Derbey@bull.net said:
> [PATCH 07/10]
>
> This patch introduces the ridr_find() routine.
Finally this one has some meat. But, besides the comments, this could be
around two lines added to idr instead of many more duplicated.
--
Tim Pepper <lnxninja@linux.vnet.ibm.com>
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 08/10] Introduce ridr_remove()
2008-04-29 14:33 ` [PATCH 08/10] Introduce ridr_remove() Nadia.Derbey
@ 2008-05-01 4:32 ` Tim Pepper
0 siblings, 0 replies; 26+ messages in thread
From: Tim Pepper @ 2008-05-01 4:32 UTC (permalink / raw)
To: Nadia.Derbey; +Cc: manfred, paulmck, linux-kernel, efault, akpm
On Tue 29 Apr at 16:33:12 +0200 Nadia.Derbey@bull.net said:
> [PATCH 08/10]
>
> This patch introduces the ridr_remove() routine.
A bit of cleanup and roughly 2 lines RCU.
--
Tim Pepper <lnxninja@linux.vnet.ibm.com>
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 09/10] Integrate the ridr code into IPC code
2008-04-29 14:33 ` [PATCH 09/10] Integrate the ridr code into IPC code Nadia.Derbey
@ 2008-05-01 4:32 ` Tim Pepper
0 siblings, 0 replies; 26+ messages in thread
From: Tim Pepper @ 2008-05-01 4:32 UTC (permalink / raw)
To: Nadia.Derbey; +Cc: manfred, paulmck, linux-kernel, efault, akpm
On Tue 29 Apr at 16:33:13 +0200 Nadia.Derbey@bull.net said:
> [PATCH 09/10]
>
> This patche makes the ipc ode use the ridr API.
Superfluous except...
> @@ -699,18 +695,13 @@ struct kern_ipc_perm *ipc_lock(struct ip
> struct kern_ipc_perm *out;
> int lid = ipcid_to_idx(id);
>
> - down_read(&ids->rw_mutex);
> -
> rcu_read_lock();
> - out = idr_find(&ids->ipcs_idr, lid);
> + out = ridr_find(&ids->ipcs_ridr, lid);
> if (out == NULL) {
> rcu_read_unlock();
> - up_read(&ids->rw_mutex);
> return ERR_PTR(-EINVAL);
> }
>
> - up_read(&ids->rw_mutex);
> -
...the removal of the rw_mutex down/up_read().
--
Tim Pepper <lnxninja@linux.vnet.ibm.com>
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 10/10] Get rid of ipc_lock_down()
2008-04-29 14:33 ` [PATCH 10/10] Get rid of ipc_lock_down() Nadia.Derbey
@ 2008-05-01 4:33 ` Tim Pepper
0 siblings, 0 replies; 26+ messages in thread
From: Tim Pepper @ 2008-05-01 4:33 UTC (permalink / raw)
To: Nadia.Derbey; +Cc: manfred, paulmck, linux-kernel, efault, akpm
On Tue 29 Apr at 16:33:14 +0200 Nadia.Derbey@bull.net said:
> [PATCH 10/10]
>
> With the previous patch ipc_lock() becomes lockless.
> So there is no need anymore for ipc_lock_down()-like routines.
...cleanup post conversion to RCU.
Could we get a version of the series without all the duplication and
without the cleanups (separate into standalone pre- and post- patches)?
If you're too busy, now that I've sifted through these I could make a
simpler series that does this.
It will be a looooot easier to discuss the locking when all we're looking
at is the locking changes.
--
Tim Pepper <lnxninja@linux.vnet.ibm.com>
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 01/10] Fix idr_remove()
2008-04-29 18:44 ` Andrew Morton
@ 2008-05-05 9:26 ` Nadia Derbey
0 siblings, 0 replies; 26+ messages in thread
From: Nadia Derbey @ 2008-05-05 9:26 UTC (permalink / raw)
To: Andrew Morton; +Cc: manfred, paulmck, linux-kernel, efault
Andrew Morton wrote:
> On Tue, 29 Apr 2008 16:33:05 +0200
> Nadia.Derbey@bull.net wrote:
>
>
>>[PATCH 01/10]
>>
>>This patch fixes idr_remove(): the return inside the loop makes us free only
>>a single layer.
>>
>>Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
>>
>>---
>> lib/idr.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>>Index: linux-2.6.25-mm1/lib/idr.c
>>===================================================================
>>--- linux-2.6.25-mm1.orig/lib/idr.c 2008-04-25 15:29:00.000000000 +0200
>>+++ linux-2.6.25-mm1/lib/idr.c 2008-04-25 15:48:34.000000000 +0200
>>@@ -385,8 +385,8 @@ void idr_remove(struct idr *idp, int id)
>> while (idp->id_free_cnt >= IDR_FREE_MAX) {
>> p = alloc_layer(idp);
>> kmem_cache_free(idr_layer_cache, p);
>>- return;
>> }
>>+ return;
>> }
>> EXPORT_SYMBOL(idr_remove);
>
>
> erk, ancient bug.
>
> I _think_ the implications of this are that an idr tree will grow fatter
> than it needs to be, but there is no permanent leak: idr_destroy() will
> still free everything, yes?
Yes, exactly. Actually, I've not checked whether all the kernel
components call idr_destroy() when needed.
>
> And a consequence of the fix is that idr manipulations will now result in
> more allocs and frees,
Not necessarily more allocs: this loop keeps IDR_FREE_MAX layers in the
free list. So idr_pre_get() should be a noop.
> but the amount of memory which a tree uses will be
> less?
>
>
>
Regards,
Nadia
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 05/10] Introduce ridr_get_new_above()
2008-05-01 4:32 ` Tim Pepper
@ 2008-05-05 10:33 ` Nadia Derbey
0 siblings, 0 replies; 26+ messages in thread
From: Nadia Derbey @ 2008-05-05 10:33 UTC (permalink / raw)
To: Tim Pepper; +Cc: manfred, paulmck, linux-kernel, efault, akpm
Tim Pepper wrote:
> On Tue 29 Apr at 16:33:09 +0200 Nadia.Derbey@bull.net said:
>
>>[PATCH 05/10]
>>
>>This patch introduces the ridr_get_new_above() routine, and some common code
>>between the idr an ridr API's.
>
>
> The ridr_get_new_above() is the first place we see something really
> different compared to idr (an RCU addition). This is a lot of patching
> so far for what would be a small incremental change otherwise.
>
>
>>Index: linux-2.6.25-mm1/include/linux/idr.h
>>===================================================================
>>--- linux-2.6.25-mm1.orig/include/linux/idr.h 2008-04-29 13:08:00.000000000 +0200
>>+++ linux-2.6.25-mm1/include/linux/idr.h 2008-04-29 14:08:47.000000000 +0200
>>@@ -71,6 +71,27 @@ struct idr {
>> }
>> #define DEFINE_IDR(name) struct idr name = IDR_INIT(name)
>>
>>+/* Actions to be taken after a call to _idr_sub_alloc */
>>+#define IDR_DONE -1
>>+#define IDR_NEED_TO_GROW -2
>>+#define IDR_NOMORE_SPACE -3
>>+#define IDR_GO_TOP -4
>>+#define IDR_GO_UP -5
>
>
> This stuff is useful on its own in as much as it improves code
> readability. A lot of this patch (the "some common code between the
> idr and ridr API's" part) could be a standalone patch distinct from the
> ridr series and then be at the head of your stack.
>
>
>>+ return action;
>>+ case IDR_DONE:
>>+ goto end_loop;
>>+ case IDR_GO_UP:
>>+ continue;
>>+ case IDR_GO_TOP:
>>+ goto restart;
>>+ default:
>>+ m = action;
>> break;
>>+ }
>>+ BUG_ON(m < 0);
>
>
> Why the added BUG_ON()? These couple hunks are a bit muddled, so maybe I
> missed something. But I don't see anything different in how m or bm are
> being manipulated such that m<0 is anymore likely after this patch.
>
>
>>Index: linux-2.6.25-mm1/lib/ridr.c
>>===================================================================
>>--- linux-2.6.25-mm1.orig/lib/ridr.c 2008-04-29 13:23:17.000000000 +0200
>>+++ linux-2.6.25-mm1/lib/ridr.c 2008-04-29 14:03:35.000000000 +0200
>>@@ -11,6 +11,23 @@
>> static struct kmem_cache *ridr_layer_cache;
>>
>>
>>+static struct ridr_layer *get_from_free_list(struct ridr *idp)
>>+{
>>+ struct ridr_layer *q;
>>+ struct idr_layer *p;
>>+ unsigned long flags;
>>+
>>+ spin_lock_irqsave(&idp->lock, flags);
>>+ if ((q = idp->id_free)) {
>>+ p = ridr_to_idr(q);
>>+ idp->id_free = p->ary[0];
>>+ idp->id_free_cnt--;
>>+ p->ary[0] = NULL;
>>+ }
>>+ spin_unlock_irqrestore(&idp->lock, flags);
>>+ return(q);
>>+}
>>+
>
>
> idr's alloc_layer() in disguise?
>
>
>>+static int sub_alloc(struct ridr *idp, int *starting_id,
>>+ struct ridr_layer **rpa, struct idr_layer **pa)
>
>
> ...
> More or less duplication
> ...
>
>
>>+ * Create the layer below if it is missing.
>>+ */
>>+ if (!p->ary[m]) {
>>+ new = get_from_free_list(idp);
>>+ if (!new)
>>+ return -1;
>>+ rcu_assign_pointer(p->ary[m], new);
>>+ p->count++;
>>+ }
>>+ pa[l] = p;
>>+ rpa[l--] = idr_to_ridr(p);
>>+ p = p->ary[m];
>>+ }
>>+
>>+end_loop:
>>+ pa[l] = p;
>>+ rpa[l] = idr_to_ridr(p);
>>+ return id;
>>+}
>
>
> Oh but wait..there's some RCU-ness tucked in there.
>
>
>>+
>>+static int ridr_get_empty_slot(struct ridr *idp, int starting_id,
>>+ struct ridr_layer **rpa, struct idr_layer **pa)
>>+{
>>+ struct ridr_layer *p, *rnew;
>>+ int layers, v, id;
>>+ unsigned long flags;
>>+
>>+ id = starting_id;
>>+build_up:
>>+ p = idp->top;
>>+ layers = idp->layers;
>>+ if (unlikely(!p)) {
>>+ p = get_from_free_list(idp);
>>+ if (!p)
>>+ return -1;
>>+ layers = 1;
>>+ }
>>+ /*
>>+ * Add a new layer to the top of the tree if the requested
>>+ * id is larger than the currently allocated space.
>>+ */
>>+ while (layers < MAX_LEVEL - 1 && id >= (1 << (layers * IDR_BITS))) {
>
> ^^ ^^
> Dropped some parens. Otherwise more duplication...
>
>
>>+ rnew->idr.ary[0] = NULL;
>>+ rnew->idr.bitmap = rnew->idr.count = 0;
>>+ __move_to_free_list(idp, rnew);
>>+ }
>>+ spin_unlock_irqrestore(&idp->lock, flags);
>>+ return -1;
>>+ }
>>+ _idr_set_new_slot(ridr_to_idr(rnew), ridr_to_idr(p));
>>+ p = rnew;
>>+ }
>>+ rcu_assign_pointer(idp->top, p);
>>+ idp->layers = layers;
>>+ v = sub_alloc(idp, &id, rpa, pa);
>>+ if (v == IDR_NEED_TO_GROW)
>>+ goto build_up;
>>+ return(v);
>>+}
>
>
> Some more RCU.
>
>
>>+static int ridr_get_new_above_int(struct ridr *idp, void *ptr, int starting_id)
>>+{
>>+ struct ridr_layer *rpa[MAX_LEVEL];
>>+ struct idr_layer *pa[MAX_LEVEL];
>>+ int id;
>>+
>>+ id = ridr_get_empty_slot(idp, starting_id, rpa, pa);
>>+ if (id >= 0) {
>>+ /*
>>+ * Successfully found an empty slot. Install the user
>>+ * pointer and mark the slot full.
>>+ */
>>+ rcu_assign_pointer(pa[0]->ary[id & IDR_MASK],
>>+ (struct ridr_layer *)ptr);
>>+ pa[0]->count++;
>>+ _idr_mark_full(pa, id);
>>+ }
>>+
>>+ return id;
>>+}
>
>
> And other line of RCU.
>
> OK. So at this point in patch 5/10 we've got 3 lines of new code and
> hundreds of lines of duplicated code?
>
> A while more looking through the rest of the patches for the rest of the
> context and I might be able to actually think about the implications of
> these three lines being where they are.
>
> Locking changes are complicated enough without all this obfuscation!
> I understand the desire to not break IDR, but...
>
OK, you convinced me. I'll re-send a new patchset that is incremental on
top of idr. Sorry (and thanks a lot) for the painful review!
Regards,
Nadia
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 00/10] Scalability requirements for sysv ipc - v2
2008-04-29 18:54 ` [PATCH 00/10] Scalability requirements for sysv ipc - v2 Andrew Morton
@ 2008-05-05 16:52 ` Nadia Derbey
2008-05-05 17:07 ` Andrew Morton
0 siblings, 1 reply; 26+ messages in thread
From: Nadia Derbey @ 2008-05-05 16:52 UTC (permalink / raw)
To: Andrew Morton; +Cc: manfred, paulmck, linux-kernel, efault
[-- Attachment #1: Type: text/plain, Size: 1879 bytes --]
Andrew Morton wrote:
> On Tue, 29 Apr 2008 16:33:04 +0200
> Nadia.Derbey@bull.net wrote:
>
>
>>After scalability problems had been detected when using the sysV ipcs, I
>>have proposed to use an RCU based implementation of the IDR api instead
>>(see thread http://lkml.org/lkml/2008/4/11/212).
>>
>>Resending the patch series after
>> . integrating Paul's remarks
>> . fixing a bug I had introduced
>> . porting it to 2.6.25-mm1.
>>
>>Reviewers are still welcome!
>>
>>Patch 1 can be taken alone: it fixes a problem in the existing IDR API.
>>
>>Patches should be applied on linux-2.6.25-mm1, in the following order:
>>
>>[ PATCH 01/10 ] : idr_minor_fix.patch
>>[ PATCH 02/10 ] : ridr_structure.patch
>>[ PATCH 03/10 ] : ridr_pre_get.patch
>>[ PATCH 04/10 ] : ridr_init.patch
>>[ PATCH 05/10 ] : ridr_get_new_above.patch
>>[ PATCH 06/10 ] : ridr_get_new.patch
>>[ PATCH 07/10 ] : ridr_find.patch
>>[ PATCH 08/10 ] : ridr_remove.patch
>>[ PATCH 09/10 ] : ipc_use_ridr.patch
>>[ PATCH 10/10 ] : remove_ipc_lock_down.patch
>>
>
>
> hm, you've been busy.
>
> As all this work is entirely for performance reasons, the changelog should
> have (lots of) performance testing results! Please.
>
>
Here are the results I've got for the pmsg test sent by Manfred:
for i in 1 2 3 4 5 6 7 8;do ./pmsg $i 5;done > output.25_mm1.ref.8
for i in 1 2 3 4 5 6 7 8;do ./pmsg $i 5;done > output.25_mm1.patch.8
2.6.25-mm1 (ref) 2.6.25-mm1(patched)
Total (1 cpu): 876000 985496
Total (2 cpus): 1549592 1740591
Total (3 cpus): 1694370 2327468
Total (4 cpus): 404553 2465787
Total (5 cpus): 391283 3215676
Total (6 cpus): 263249 3849121
Total (7 cpus): 191742 4209866
Total (8 cpus): 141722 4255585
Full output files attached.
Will send the rest tomorrow.
Regards
Nadia
[-- Attachment #2: output.25_mm1.ref.8 --]
[-- Type: text/x-troff-man, Size: 5846 bytes --]
pmsg [nr queues] [timeout]
Using 1 queues/cpus (2 threads) for 5 seconds.
thread 1: sysvmsg 0 type 0 bound to 0001h
thread 0: sysvmsg 0 type 1 bound to 0001h
Result matrix:
Thread 0: 438000 1: 438000
Total: 876000
pmsg [nr queues] [timeout]
Using 2 queues/cpus (4 threads) for 5 seconds.
thread 0: sysvmsg 32768 type 1 bound to 0001h
thread 1: sysvmsg 65537 type 1 bound to 0002h
thread 3: sysvmsg 65537 type 0 bound to 0002h
thread 2: sysvmsg 32768 type 0 bound to 0001h
Result matrix:
Thread 0: 385675 2: 385675
Thread 1: 389121 3: 389121
Total: 1549592
pmsg [nr queues] [timeout]
Using 3 queues/cpus (6 threads) for 5 seconds.
thread 0: sysvmsg 98304 type 1 bound to 0001h
thread 3: sysvmsg 98304 type 0 bound to 0001h
thread 1: sysvmsg 131073 type 1 bound to 0002h
thread 2: sysvmsg 163842 type 1 bound to 0004h
thread 5: sysvmsg 163842 type 0 bound to 0004h
thread 4: sysvmsg 131073 type 0 bound to 0002h
Result matrix:
Thread 0: 279690 3: 279690
Thread 1: 283228 4: 283229
Thread 2: 284266 5: 284267
Total: 1694370
pmsg [nr queues] [timeout]
Using 4 queues/cpus (8 threads) for 5 seconds.
thread 0: sysvmsg 196608 type 1 bound to 0001h
thread 1: sysvmsg 229377 type 1 bound to 0002h
thread 2: sysvmsg 262146 type 1 bound to 0004h
thread 3: sysvmsg 294915 type 1 bound to 0008h
thread 4: sysvmsg 196608 type 0 bound to 0001h
thread 7: sysvmsg 294915 type 0 bound to 0008h
thread 6: sysvmsg 262146 type 0 bound to 0004h
thread 5: sysvmsg 229377 type 0 bound to 0002h
Result matrix:
Thread 0: 50305 4: 50305
Thread 1: 50168 5: 50168
Thread 2: 50922 6: 50922
Thread 3: 50881 7: 50882
Total: 404553
pmsg [nr queues] [timeout]
Using 5 queues/cpus (10 threads) for 5 seconds.
thread 0: sysvmsg 327680 type 1 bound to 0001h
thread 1: sysvmsg 360449 type 1 bound to 0002h
thread 2: sysvmsg 393218 type 1 bound to 0004h
thread 3: sysvmsg 425987 type 1 bound to 0008h
thread 4: sysvmsg 458756 type 1 bound to 0010h
thread 8: sysvmsg 425987 type 0 bound to 0008h
thread 7: sysvmsg 393218 type 0 bound to 0004h
thread 6: sysvmsg 360449 type 0 bound to 0002h
thread 5: sysvmsg 327680 type 0 bound to 0001h
thread 9: sysvmsg 458756 type 0 bound to 0010h
Result matrix:
Thread 0: 29684 5: 29684
Thread 1: 30236 6: 30237
Thread 2: 30902 7: 30902
Thread 3: 30699 8: 30700
Thread 4: 74119 9: 74120
Total: 391283
pmsg [nr queues] [timeout]
Using 6 queues/cpus (12 threads) for 5 seconds.
thread 0: sysvmsg 491520 type 1 bound to 0001h
thread 2: sysvmsg 557058 type 1 bound to 0004h
thread 7: sysvmsg 524289 type 0 bound to 0002h
thread 1: sysvmsg 524289 type 1 bound to 0002h
thread 8: sysvmsg 557058 type 0 bound to 0004h
thread 3: sysvmsg 589827 type 1 bound to 0008h
thread 6: sysvmsg 491520 type 0 bound to 0001h
thread 4: sysvmsg 622596 type 1 bound to 0010h
thread 11: sysvmsg 655365 type 0 bound to 0020h
thread 5: sysvmsg 655365 type 1 bound to 0020h
thread 10: sysvmsg 622596 type 0 bound to 0010h
thread 9: sysvmsg 589827 type 0 bound to 0008h
Result matrix:
Thread 0: 17229 6: 17230
Thread 1: 17442 7: 17442
Thread 2: 17337 8: 17337
Thread 3: 17420 9: 17421
Thread 4: 31020 10: 31020
Thread 5: 31175 11: 31176
Total: 263249
pmsg [nr queues] [timeout]
Using 7 queues/cpus (14 threads) for 5 seconds.
thread 0: sysvmsg 688128 type 1 bound to 0001h
thread 1: sysvmsg 720897 type 1 bound to 0002h
thread 2: sysvmsg 753666 type 1 bound to 0004h
thread 3: sysvmsg 786435 type 1 bound to 0008h
thread 4: sysvmsg 819204 type 1 bound to 0010h
thread 5: sysvmsg 851973 type 1 bound to 0020h
thread 6: sysvmsg 884742 type 1 bound to 0040h
thread 8: sysvmsg 720897 type 0 bound to 0002h
thread 10: sysvmsg 786435 type 0 bound to 0008h
thread 9: sysvmsg 753666 type 0 bound to 0004h
thread 11: sysvmsg 819204 type 0 bound to 0010h
thread 7: sysvmsg 688128 type 0 bound to 0001h
thread 13: sysvmsg 884742 type 0 bound to 0040h
thread 12: sysvmsg 851973 type 0 bound to 0020h
Result matrix:
Thread 0: 12058 7: 12058
Thread 1: 12054 8: 12055
Thread 2: 12023 9: 12023
Thread 3: 12062 10: 12062
Thread 4: 15940 11: 15940
Thread 5: 15725 12: 15726
Thread 6: 16008 13: 16008
Total: 191742
pmsg [nr queues] [timeout]
Using 8 queues/cpus (16 threads) for 5 seconds.
thread 1: sysvmsg 950273 type 1 bound to 0002h
thread 2: sysvmsg 983042 type 1 bound to 0004h
thread 3: sysvmsg 1015811 type 1 bound to 0008h
thread 4: sysvmsg 1048580 type 1 bound to 0010h
thread 0: sysvmsg 917504 type 1 bound to 0001h
thread 8: sysvmsg 917504 type 0 bound to 0001h
thread 10: sysvmsg 983042 type 0 bound to 0004h
thread 9: sysvmsg 950273 type 0 bound to 0002h
thread 12: sysvmsg 1048580 type 0 bound to 0010h
thread 11: sysvmsg 1015811 type 0 bound to 0008h
thread 5: sysvmsg 1081349 type 1 bound to 0020h
thread 6: sysvmsg 1114118 type 1 bound to 0040h
thread 13: sysvmsg 1081349 type 0 bound to 0020h
thread 7: sysvmsg 1146887 type 1 bound to 0080h
thread 14: sysvmsg 1114118 type 0 bound to 0040h
thread 15: sysvmsg 1146887 type 0 bound to 0080h
Result matrix:
Thread 0: 8757 8: 8757
Thread 1: 8819 9: 8819
Thread 2: 8848 10: 8849
Thread 3: 8832 11: 8832
Thread 4: 8930 12: 8931
Thread 5: 8893 13: 8894
Thread 6: 8894 14: 8894
Thread 7: 8886 15: 8887
Total: 141722
[-- Attachment #3: output.25_mm1.patch.8 --]
[-- Type: text/x-troff-man, Size: 5851 bytes --]
pmsg [nr queues] [timeout]
Using 1 queues/cpus (2 threads) for 5 seconds.
thread 0: sysvmsg 1507328 type 1 bound to 0001h
thread 1: sysvmsg 1507328 type 0 bound to 0001h
Result matrix:
Thread 0: 492748 1: 492748
Total: 985496
pmsg [nr queues] [timeout]
Using 2 queues/cpus (4 threads) for 5 seconds.
thread 0: sysvmsg 1540096 type 1 bound to 0001h
thread 1: sysvmsg 1572865 type 1 bound to 0002h
thread 3: sysvmsg 1572865 type 0 bound to 0002h
thread 2: sysvmsg 1540096 type 0 bound to 0001h
Result matrix:
Thread 0: 436429 2: 436430
Thread 1: 433866 3: 433866
Total: 1740591
pmsg [nr queues] [timeout]
Using 3 queues/cpus (6 threads) for 5 seconds.
thread 0: sysvmsg 1605632 type 1 bound to 0001h
thread 1: sysvmsg 1638401 type 1 bound to 0002h
thread 2: sysvmsg 1671170 type 1 bound to 0004h
thread 5: sysvmsg 1671170 type 0 bound to 0004h
thread 4: sysvmsg 1638401 type 0 bound to 0002h
thread 3: sysvmsg 1605632 type 0 bound to 0001h
Result matrix:
Thread 0: 388457 3: 388457
Thread 1: 384674 4: 384674
Thread 2: 390603 5: 390603
Total: 2327468
pmsg [nr queues] [timeout]
Using 4 queues/cpus (8 threads) for 5 seconds.
thread 0: sysvmsg 1703936 type 1 bound to 0001h
thread 1: sysvmsg 1736705 type 1 bound to 0002h
thread 2: sysvmsg 1769474 type 1 bound to 0004h
thread 3: sysvmsg 1802243 type 1 bound to 0008h
thread 6: sysvmsg 1769474 type 0 bound to 0004h
thread 5: sysvmsg 1736705 type 0 bound to 0002h
thread 7: sysvmsg 1802243 type 0 bound to 0008h
thread 4: sysvmsg 1703936 type 0 bound to 0001h
Result matrix:
Thread 0: 307122 4: 307122
Thread 1: 308710 5: 308711
Thread 2: 307589 6: 307589
Thread 3: 309472 7: 309472
Total: 2465787
pmsg [nr queues] [timeout]
Using 5 queues/cpus (10 threads) for 5 seconds.
thread 0: sysvmsg 1835008 type 1 bound to 0001h
thread 1: sysvmsg 1867777 type 1 bound to 0002h
thread 2: sysvmsg 1900546 type 1 bound to 0004h
thread 3: sysvmsg 1933315 type 1 bound to 0008h
thread 4: sysvmsg 1966084 type 1 bound to 0010h
thread 7: sysvmsg 1900546 type 0 bound to 0004h
thread 9: sysvmsg 1966084 type 0 bound to 0010h
thread 8: sysvmsg 1933315 type 0 bound to 0008h
thread 6: sysvmsg 1867777 type 0 bound to 0002h
thread 5: sysvmsg 1835008 type 0 bound to 0001h
Result matrix:
Thread 0: 338483 5: 338483
Thread 1: 266569 6: 266569
Thread 2: 285255 7: 285256
Thread 3: 335819 8: 335819
Thread 4: 381711 9: 381712
Total: 3215676
pmsg [nr queues] [timeout]
Using 6 queues/cpus (12 threads) for 5 seconds.
thread 0: sysvmsg 1998848 type 1 bound to 0001h
thread 2: sysvmsg 2064386 type 1 bound to 0004h
thread 7: sysvmsg 2031617 type 0 bound to 0002h
thread 3: sysvmsg 2097155 type 1 bound to 0008h
thread 4: sysvmsg 2129924 type 1 bound to 0010h
thread 5: sysvmsg 2162693 type 1 bound to 0020h
thread 8: sysvmsg 2064386 type 0 bound to 0004h
thread 1: sysvmsg 2031617 type 1 bound to 0002h
thread 6: sysvmsg 1998848 type 0 bound to 0001h
thread 11: sysvmsg 2162693 type 0 bound to 0020h
thread 10: sysvmsg 2129924 type 0 bound to 0010h
thread 9: sysvmsg 2097155 type 0 bound to 0008h
Result matrix:
Thread 0: 318850 6: 318850
Thread 1: 307664 7: 307664
Thread 2: 316157 8: 316157
Thread 3: 274740 9: 274741
Thread 4: 352108 10: 352109
Thread 5: 355040 11: 355041
Total: 3849121
pmsg [nr queues] [timeout]
Using 7 queues/cpus (14 threads) for 5 seconds.
thread 1: sysvmsg 2228225 type 1 bound to 0002h
thread 7: sysvmsg 2195456 type 0 bound to 0001h
thread 2: sysvmsg 2260994 type 1 bound to 0004h
thread 3: sysvmsg 2293763 type 1 bound to 0008h
thread 4: sysvmsg 2326532 type 1 bound to 0010h
thread 5: sysvmsg 2359301 type 1 bound to 0020h
thread 13: sysvmsg 2392070 type 0 bound to 0040h
thread 9: sysvmsg 2260994 type 0 bound to 0004h
thread 8: sysvmsg 2228225 type 0 bound to 0002h
thread 0: sysvmsg 2195456 type 1 bound to 0001h
thread 12: sysvmsg 2359301 type 0 bound to 0020h
thread 6: sysvmsg 2392070 type 1 bound to 0040h
thread 11: sysvmsg 2326532 type 0 bound to 0010h
thread 10: sysvmsg 2293763 type 0 bound to 0008h
Result matrix:
Thread 0: 311401 7: 311402
Thread 1: 255123 8: 255124
Thread 2: 311868 9: 311868
Thread 3: 299011 10: 299011
Thread 4: 280114 11: 280115
Thread 5: 314069 12: 314070
Thread 6: 333345 13: 333345
Total: 4209866
pmsg [nr queues] [timeout]
Using 8 queues/cpus (16 threads) for 5 seconds.
thread 0: sysvmsg 2424832 type 1 bound to 0001h
thread 1: sysvmsg 2457601 type 1 bound to 0002h
thread 2: sysvmsg 2490370 type 1 bound to 0004h
thread 3: sysvmsg 2523139 type 1 bound to 0008h
thread 4: sysvmsg 2555908 type 1 bound to 0010h
thread 5: sysvmsg 2588677 type 1 bound to 0020h
thread 6: sysvmsg 2621446 type 1 bound to 0040h
thread 7: sysvmsg 2654215 type 1 bound to 0080h
thread 10: sysvmsg 2490370 type 0 bound to 0004h
thread 13: sysvmsg 2588677 type 0 bound to 0020h
thread 9: sysvmsg 2457601 type 0 bound to 0002h
thread 12: sysvmsg 2555908 type 0 bound to 0010h
thread 11: sysvmsg 2523139 type 0 bound to 0008h
thread 8: sysvmsg 2424832 type 0 bound to 0001h
thread 14: sysvmsg 2621446 type 0 bound to 0040h
thread 15: sysvmsg 2654215 type 0 bound to 0080h
Result matrix:
Thread 0: 234942 8: 234942
Thread 1: 262977 9: 262978
Thread 2: 294207 10: 294208
Thread 3: 288878 11: 288879
Thread 4: 273532 12: 273532
Thread 5: 248904 13: 248904
Thread 6: 260121 14: 260122
Thread 7: 264229 15: 264230
Total: 4255585
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 00/10] Scalability requirements for sysv ipc - v2
2008-05-05 16:52 ` Nadia Derbey
@ 2008-05-05 17:07 ` Andrew Morton
0 siblings, 0 replies; 26+ messages in thread
From: Andrew Morton @ 2008-05-05 17:07 UTC (permalink / raw)
To: Nadia Derbey; +Cc: manfred, paulmck, linux-kernel, efault
On Mon, 05 May 2008 18:52:50 +0200 Nadia Derbey <Nadia.Derbey@bull.net> wrote:
> 2.6.25-mm1 (ref) 2.6.25-mm1(patched)
>
> Total (1 cpu): 876000 985496
> Total (2 cpus): 1549592 1740591
> Total (3 cpus): 1694370 2327468
> Total (4 cpus): 404553 2465787
> Total (5 cpus): 391283 3215676
> Total (6 cpus): 263249 3849121
> Total (7 cpus): 191742 4209866
> Total (8 cpus): 141722 4255585
>
Looks encouraging. A comparison with 2.6.22 (or whichever kernel it was
before we broke it) would be interesting too, please.
>
> Full output files attached.
>
> Will send the rest tomorrow.
Nobody reads the raw data, sorry. And it is inefficient to have everyone
independently pore over the raw data extracting the (same) actual
information.
Please condense the result of your performance testing into a
preferably-less-than-one-page summary and include (and maintain) that within
the patch changelog.
^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2008-05-05 17:08 UTC | newest]
Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-29 14:33 [PATCH 00/10] Scalability requirements for sysv ipc - v2 Nadia.Derbey
2008-04-29 14:33 ` [PATCH 01/10] Fix idr_remove() Nadia.Derbey
2008-04-29 18:44 ` Andrew Morton
2008-05-05 9:26 ` Nadia Derbey
2008-04-29 14:33 ` [PATCH 02/10] Introduce the ridr structure Nadia.Derbey
2008-05-01 4:30 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 03/10] Introduce ridr_pre_get() Nadia.Derbey
2008-05-01 4:31 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 04/10] Introduce ridr_init() Nadia.Derbey
2008-05-01 4:31 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 05/10] Introduce ridr_get_new_above() Nadia.Derbey
2008-05-01 4:32 ` Tim Pepper
2008-05-05 10:33 ` Nadia Derbey
2008-04-29 14:33 ` [PATCH 06/10] Introduce ridr_get_new() Nadia.Derbey
2008-05-01 4:32 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 07/10] Introduce ridr_find() Nadia.Derbey
2008-05-01 4:32 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 08/10] Introduce ridr_remove() Nadia.Derbey
2008-05-01 4:32 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 09/10] Integrate the ridr code into IPC code Nadia.Derbey
2008-05-01 4:32 ` Tim Pepper
2008-04-29 14:33 ` [PATCH 10/10] Get rid of ipc_lock_down() Nadia.Derbey
2008-05-01 4:33 ` Tim Pepper
2008-04-29 18:54 ` [PATCH 00/10] Scalability requirements for sysv ipc - v2 Andrew Morton
2008-05-05 16:52 ` Nadia Derbey
2008-05-05 17:07 ` Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox