From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Nadia.Derbey@bull.net
Cc: efault@gmx.de, manfred@colorfullife.com,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	peterz@infradead.org, xemul@openvz.org
Subject: Re: [PATCH 00/13] Re: Scalability requirements for sysv ipc
Date: Sat, 19 Apr 2008 16:24:13 -0700
Message-ID: <20080419232413.GD20138@linux.vnet.ibm.com>
In-Reply-To: <20080411161702.460410000@bull.net>

On Fri, Apr 11, 2008 at 06:17:02PM +0200, Nadia.Derbey@bull.net wrote:
> 
> 
> Here is finally the ipc ridr-based implementation I was talking about last
> week (see http://lkml.org/lkml/2008/4/4/208).
> I couldn't avoid much of the code duplication, but at least made things
> incremental.
> 
> Does somebody know of a test suite for the idr API that I could
> run on this new API?
> 
> Mike, can you try to run it on your victim: I had such a hard time building
> this patch, that I couldn't re-run the test on my 8-core with this new
> version. So the last results I have are for 2.6.25-rc3-mm1.
> 
> Also, I think a careful review should be done to avoid introducing yet other
> problems :-(
> 
> *WARNING*: this patch contains a fix for idr.c
>            I know, I'm doing things bad, but I only saw the problem this
>            afternoon.
> 
> It should be applied on linux-2.6.25-rc8-mm1, in the following order:
> 
> [ PATCH 01/13 ] : copy_idr_code.patch
> [ PATCH 02/13 ] : change_ridr_struct.patch
> [ PATCH 03/13 ] : ridr_pre_get.patch
> [ PATCH 04/13 ] : ridr_alloc_layer.patch
> [ PATCH 05/13 ] : ridr_free_layer.patch
> [ PATCH 06/13 ] : ridr_sub_alloc.patch
> [ PATCH 07/13 ] : ridr_get_empty_slot.patch
> [ PATCH 08/13 ] : ridr_get_new.patch
> [ PATCH 09/13 ] : ridr_remove.patch
> [ PATCH 10/13 ] : ridr_find.patch
> [ PATCH 11/13 ] : ridr_integrate.patch
> [ PATCH 12/13 ] : ipc_use_ridr.patch
> [ PATCH 13/13 ] : remove_ipc_lock_down.patch

Some comments on the resulting ridr.h.

							Thanx, Paul

> /*
>  * include/linux/ridr.h
>  *
>  * Small id to pointer translation service avoiding fixed sized
>  * tables. RCU-based implementation of IDRs.
>  */
> 
> #ifndef _RIDR_H_
> #define _RIDR_H_
> 
> #include <linux/idr.h>
> #include <linux/rcupdate.h>
> 
> struct ridr_layer {
> 	unsigned long		 bitmap; /* A zero bit means "space here" */
> 	struct ridr_layer	*ary[1<<IDR_BITS];
> 	int			 count;	 /* When zero, we can release it */
> 	struct rcu_head		 rcu_head;
> };

Added an rcu_head for freeing.

> struct ridr {
> 	int		  layers;
> 	gfp_t		  gfp_mask;
> 	struct ridr_layer *top;

The id_free and id_free_cnt fields seem to have migrated to the per-CPU
ridr_pregets variable.  The lock field seems to have been exported to
the caller, hence the per-CPU variables.

Other strategies include:

1.	Passing the lock into the ridr_pre_get() function, allowing
	it to safely update the id_free list.  This gets painful given
	the wide variety of locks, semaphores, mutexes, &c.

2.	Having ridr_pre_get() return a reference to the memory, which
	has the disadvantage of overallocation and needing to repeatedly
	allocate and free.

3.	Having a separate leaf lock guarding the freelist cache.
	This might be a better approach, since each ridr structure
	seems to require a single lock held on updates in any case.

Possible disadvantages of the current per-CPU-variable strategy include
the need to disable preemption (thus hurting real-time latency),
over-allocation on a per-CPU basis, and needlessly "spattering" free
entries across all CPUs in the (unlikely) case where there is lots
of preemption.  And there doesn't appear to be a way to free the
"spattered" structures.

So I suggest a global lock for the id_free list, given that there is
another global lock held when updating the ridr structure in any case.
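
A minimal userspace sketch of what such a lock-guarded freelist cache
might look like.  This is only a model under stated assumptions: a
pthread mutex stands in for the kernel lock, malloc()/free() stand in
for the slab cache, and the names freelist_pre_get(), freelist_get(),
freelist_put(), freelist_drain(), struct layer, and FREE_MAX are all
hypothetical and do not come from the patch series.

```c
/*
 * Userspace model only: a pthread mutex stands in for the kernel lock and
 * malloc()/free() for the slab cache.  All names below are hypothetical.
 */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define FREE_MAX 8	/* assumed cap on cached layers */

struct layer {
	struct layer *next;	/* freelist link */
};

static pthread_mutex_t free_lock = PTHREAD_MUTEX_INITIALIZER;
static struct layer *id_free;	/* head of the cached-layer list */
static int id_free_cnt;		/* number of layers on that list */

/* Return a layer to the cache, or to the allocator if the cache is full. */
void freelist_put(struct layer *p)
{
	assert(p != NULL);
	pthread_mutex_lock(&free_lock);
	if (id_free_cnt < FREE_MAX) {
		p->next = id_free;
		id_free = p;
		id_free_cnt++;
		p = NULL;	/* ownership moved to the cache */
	}
	pthread_mutex_unlock(&free_lock);
	free(p);		/* free(NULL) is a no-op */
}

/* Take one cached layer, or return NULL if the cache is empty. */
struct layer *freelist_get(void)
{
	struct layer *p;

	pthread_mutex_lock(&free_lock);
	p = id_free;
	if (p) {
		id_free = p->next;
		id_free_cnt--;
	}
	pthread_mutex_unlock(&free_lock);
	return p;
}

/*
 * Top up the cache to at least "want" layers (clamped to FREE_MAX).
 * Allocation happens outside the lock.  Returns 1 on success, 0 if
 * an allocation fails.
 */
int freelist_pre_get(int want)
{
	if (want > FREE_MAX)
		want = FREE_MAX;
	for (;;) {
		int have;
		struct layer *p;

		pthread_mutex_lock(&free_lock);
		have = id_free_cnt;
		pthread_mutex_unlock(&free_lock);
		if (have >= want)
			return 1;
		p = malloc(sizeof(*p));
		if (!p)
			return 0;
		freelist_put(p);
	}
}

/* Free every cached layer; one list means one place to reclaim from. */
void freelist_drain(void)
{
	struct layer *p;

	while ((p = freelist_get()) != NULL)
		free(p);
}
```

With a single list, no preemption disabling is needed around the
allocation, and freelist_drain() gives a way to release cached
structures, which the per-CPU scheme does not appear to offer.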

> };
> 
> #define RIDR_INIT(mask)						\
> {								\
> 	.layers 	= 0,					\
> 	.gfp_mask 	= (mask),				\
> 	.top		= NULL,					\
> }
> #define DEFINE_RIDR(name, mask)	struct ridr name = RIDR_INIT(mask)
> 
> #define INIT_RIDR(name, mask)						\
> do {									\
> 	(name)->layers		= 0;					\
> 	(name)->gfp_mask 	= (mask);				\
> 	(name)->top		= NULL;					\
> } while (0)
> 
> 
> /**
>  * Ridr synchronization (see radix-tree.h)
>  *
>  * ridr_find() is able to be called locklessly, using RCU. The caller must
>  * ensure calls to this function are made within rcu_read_lock() regions.
>  * Other readers (lock-free or otherwise) and modifications may be running
>  * concurrently.
>  *
>  * It is still required that the caller manage the synchronization and
>  * lifetimes of the items. So if RCU lock-free lookups are used, typically
>  * this would mean that the items have their own locks, or are amenable to
>  * lock-free access; and that the items are freed by RCU (or only freed after
>  * having been deleted from the ridr tree *and* a synchronize_rcu() grace
>  * period).
>  */
> 
> /*
>  * This is what we export.
>  */
> 
> void *ridr_find(struct ridr *idp, int id);
> int ridr_pre_get(gfp_t gfp_mask);
> int ridr_get_new(struct ridr *idp, void *ptr, int *id);
> void ridr_remove(struct ridr *idp, int id);
> 
> void __init ridr_init_cache(void);
> 
> static inline void ridr_pre_get_end(void)
> {
> 	preempt_enable();
> }
> 
> #endif /* _RIDR_H_ */
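
For reference, a small userspace model of the calling convention that
this header appears to imply: ridr_pre_get() is paired with
ridr_pre_get_end(), and ridr_get_new() runs between them with
preemption disabled.  This is a sketch under assumptions, not the
patch's code.  The model_*() stubs, MODEL_SLOTS, preempt_depth, and
add_item() are hypothetical, and the stub assumes that a successful
pre_get returns with preemption disabled.

```c
/*
 * Userspace model of the calling convention only.  The stubs below are
 * hypothetical stand-ins, NOT the code from the patch series.
 */
#include <assert.h>
#include <stddef.h>

#define MODEL_SLOTS 4			/* assumed tiny table, for illustration */

static int preempt_depth;		/* models the preemption-disable count */
static void *slots[MODEL_SLOTS];	/* models the ridr tree */

/* Assumption: success (nonzero) leaves preemption disabled. */
int model_pre_get(void)
{
	preempt_depth++;
	return 1;
}

/* Mirrors ridr_pre_get_end(), which just re-enables preemption. */
void model_pre_get_end(void)
{
	assert(preempt_depth > 0);
	preempt_depth--;
}

/* Store ptr in the first empty slot; 0 on success, -1 if full. */
int model_get_new(void *ptr, int *id)
{
	assert(preempt_depth > 0);	/* only legal inside the pre_get window */
	for (int i = 0; i < MODEL_SLOTS; i++) {
		if (!slots[i]) {
			slots[i] = ptr;
			*id = i;
			return 0;
		}
	}
	return -1;
}

/* Lookup; in the kernel this would run under rcu_read_lock(). */
void *model_find(int id)
{
	if (id < 0 || id >= MODEL_SLOTS)
		return NULL;
	return slots[id];
}

/* Caller-side pattern: everything up to pre_get_end is non-preemptible. */
int add_item(void *item, int *id)
{
	int err;

	if (!model_pre_get())
		return -1;
	/* the caller's own update-side lock would be taken here */
	err = model_get_new(item, id);
	/* ... and released here */
	model_pre_get_end();
	return err;
}
```

The window between the two calls is where the real-time latency
concern comes from: nothing in it may sleep or be preempted.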

