* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
[not found] ` <1152226204.21787.2093.camel@stark>
@ 2006-07-06 23:39 ` Paul E. McKenney
[not found] ` <Pine.LNX.4.44L0.0607071051430.17135-100000@iolanthe.rowland.org>
0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-06 23:39 UTC (permalink / raw)
To: Matt Helsley
Cc: Alan Stern, linux-kernel, Andrew Morton, dipankar, Ingo Molnar,
tytso, Darren Hart, oleg, Jes Sorensen
On Thu, Jul 06, 2006 at 03:50:03PM -0700, Matt Helsley wrote:
> On Thu, 2006-07-06 at 16:28 -0400, Alan Stern wrote:
> > I've been trying to come up with a way to allow SRCU structures to be
> > initialized statically rather than dynamically. The per-cpu data makes it
> > quite hard. Not only do you have to use different routines to access
> > static vs. dynamic per-cpu data, there's just no good way to write a
> > static initializer. This is because the per-cpu data requires its own
> > separate definition, and there's no way to call DEFINE_PER_CPU from within
> > an initializer.
> >
> > Here, in outline, is the best I've been able to come up with. It uses a
> > function pointer member to select the appropriate sort of per-cpu data
> > access. You would use it like this:
> >
> > PREDEFINE_SRCU(s);
> > static DEFINE_SRCU(s);
> > ...
> > idx = srcu_read_lock(&s);
> > ... etc ...
> >
> > Alternative possibilities involve an entire parallel implementation for
> > statically-initialized structures (which seems excessive) or using a
> > runtime test instead of a function pointer to select the dereferencing
> > mechanism.
> >
> > Can anybody suggest anything better?
> >
> > Alan Stern
>
> I started to come up with something similar but did not get as far. I
> suspect the runtime test you're suggesting would look like:
>
> #include <asm/sections.h>
>
> ...
> if ((per_cpu_ptr >= __per_cpu_start) && (per_cpu_ptr < __per_cpu_end)) {
> /* statically-allocated per-cpu data */
> ...
> } else {
> /* dynamically-allocated per-cpu data */
> ...
> }
> ...
>
> I think that's easier to read and understand than following a function
> pointer.
Is this what the two of you are getting at?
#define DEFINE_SRCU_STRUCT(name) \
DEFINE_PER_CPU(struct srcu_struct_array, name) = { 0, 0 }; \
struct srcu_struct name = { \
.completed = 0, \
.per_cpu_ref = NULL, \
.mutex = __MUTEX_INITIALIZER(name.mutex) \
}
#define srcu_read_lock(ss) \
({ \
if ((ss).per_cpu_ref != NULL) \
srcu_read_lock_dynamic(&ss); \
else { \
int ret; \
\
preempt_disable(); \
ret = srcu_read_lock_static(&ss, &__get_cpu_var(ss)); \
preempt_enable(); \
ret; \
} \
})
int srcu_read_lock_dynamic(struct srcu_struct *sp)
{
int idx;
preempt_disable();
idx = sp->completed & 0x1;
barrier(); /* ensure compiler looks -once- at sp->completed. */
per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]++;
srcu_barrier(); /* ensure compiler won't misorder critical section. */
preempt_enable();
return idx;
}
int srcu_read_lock_static(struct srcu_struct *sp, struct srcu_struct_array *cp)
{
int idx;
idx = sp->completed & 0x1;
barrier(); /* ensure compiler looks -once- at sp->completed. */
cp->c[idx]++;
srcu_barrier(); /* ensure compiler won't misorder critical section. */
return idx;
}
And similarly for srcu_read_unlock()?
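To make the counter handling concrete, here is a minimal user-space sketch of the lock/unlock bookkeeping. It is only a model under stated assumptions: the cpu[] array stands in for real per-CPU data, the caller passes the CPU number explicitly, preemption control and barriers are left out, and the model_* names are made up for illustration.

```c
#include <assert.h>

#define NR_CPUS 4	/* illustrative CPU count, not a kernel value */

struct srcu_struct_array {
	int c[2];	/* one counter per grace-period parity */
};

/* Hypothetical stand-in for srcu_struct with its per-CPU data inlined. */
struct srcu_model {
	int completed;				/* grace periods completed */
	struct srcu_struct_array cpu[NR_CPUS];	/* simulated per-CPU data */
};

/* Enter a read-side critical section on the given (simulated) CPU. */
static int model_read_lock(struct srcu_model *sp, int cpu)
{
	int idx = sp->completed & 0x1;

	assert(cpu >= 0 && cpu < NR_CPUS);
	sp->cpu[cpu].c[idx]++;
	return idx;
}

/* Leave the critical section; the reader may have moved to another CPU. */
static void model_read_unlock(struct srcu_model *sp, int cpu, int idx)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	sp->cpu[cpu].c[idx]--;
}

/* Readers still using counter set idx, summed over all CPUs. */
static int model_readers_active_idx(const struct srcu_model *sp, int idx)
{
	int cpu, sum = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += sp->cpu[cpu].c[idx];
	return sum;
}
```

The unlock side decrements c[idx] on whatever CPU the reader happens to be running on, so an individual counter can go negative; only the sum over all CPUs for a given idx is meaningful.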
I sure hope that there is a better way!!! For one thing, you cannot pass
a pointer in to srcu_read_lock(), since __get_cpu_var's name mangling would
fail in that case...
Thanx, Paul
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
[not found] ` <Pine.LNX.4.44L0.0607071051430.17135-100000@iolanthe.rowland.org>
@ 2006-07-07 16:33 ` Paul E. McKenney
[not found] ` <Pine.LNX.4.44L0.0607071345270.6793-100000@iolanthe.rowland.org>
0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-07 16:33 UTC (permalink / raw)
To: Alan Stern
Cc: Matt Helsley, linux-kernel, Andrew Morton, dipankar, Ingo Molnar,
tytso, Darren Hart, oleg, Jes Sorensen
On Fri, Jul 07, 2006 at 10:58:28AM -0400, Alan Stern wrote:
> On Thu, 6 Jul 2006, Paul E. McKenney wrote:
>
> > Is this what the two of you are getting at?
> >
> > #define DEFINE_SRCU_STRUCT(name) \
> > DEFINE_PER_CPU(struct srcu_struct_array, name) = { 0, 0 }; \
> > struct srcu_struct name = { \
> > .completed = 0, \
> > .per_cpu_ref = NULL, \
> > .mutex = __MUTEX_INITIALIZER(name.mutex) \
> > }
>
> Note that this approach won't work when you need to do something like:
>
> struct xyz {
> struct srcu_struct s;
> } the_xyz = {
> .s = /* What goes here? */
> };
Yep, this is the same issue leading to my complaint below about not being
able to pass a pointer to the resulting srcu_struct.
> > #define srcu_read_lock(ss) \
> > ({ \
> > if ((ss).per_cpu_ref != NULL) \
> > srcu_read_lock_dynamic(&ss); \
> > else { \
> > int ret; \
> > \
> > preempt_disable(); \
> > ret = srcu_read_lock_static(&ss, &__get_cpu_var(ss)); \
> > preempt_enable(); \
> > ret; \
> > } \
> > })
> >
> > int srcu_read_lock_dynamic(struct srcu_struct *sp)
> > {
> > int idx;
> >
> > preempt_disable();
> > idx = sp->completed & 0x1;
> > barrier(); /* ensure compiler looks -once- at sp->completed. */
> > per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]++;
> > srcu_barrier(); /* ensure compiler won't misorder critical section. */
> > preempt_enable();
> > return idx;
> > }
> >
> > int srcu_read_lock_static(struct srcu_struct *sp, struct srcu_struct_array *cp)
> > {
> > int idx;
> >
> > idx = sp->completed & 0x1;
> > barrier(); /* ensure compiler looks -once- at sp->completed. */
> > cp->c[idx]++;
> > srcu_barrier(); /* ensure compiler won't misorder critical section. */
> > return idx;
> > }
> >
> > And similarly for srcu_read_unlock()?
> >
> > I sure hope that there is a better way!!! For one thing, you cannot pass
> > a pointer in to srcu_read_lock(), since __get_cpu_var's name mangling would
> > fail in that case...
>
> No, that's not what we had in mind.
Another approach I looked at was statically allocating a struct
percpu_data, but initializing it seems to be problematic.
So here are the three approaches that seem to have some chance
of working:
1. Your approach of dynamically selecting between the
per_cpu_ptr() and per_cpu() APIs based on a flag
within the structure.
2. Creating a pair of SRCU APIs, reflecting the two
underlying per-CPU APIs (one for statically allocated
per-CPU variables, the other for dynamically allocated
per-CPU variables).
3. A compile-time translation layer, making use of
two different structure types and a bit of gcc
type comparison. The idea would be to create
a srcu_struct_static and a srcu_struct_dynamic
structure that contained a pointer to the corresponding
per-CPU variable and an srcu_struct, and to have
a set of macros that did a typeof comparison, selecting
the appropriate underlying primitive from the set
of two.
This is essentially #2, but with some cpp/typeof
magic to make it look to the user of SRCU that there
is but one API.
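For the sake of discussion, approach 3 might be sketched in user space roughly as follows. This is an illustration rather than a proposed implementation: the two wrapper layouts, the plain int counters standing in for per-CPU data, and the single-CPU simplification are all assumptions, and the dispatch relies on the gcc builtins __builtin_choose_expr() and __builtin_types_compatible_p().

```c
#include <assert.h>

struct srcu_struct {
	int completed;
};

/* Hypothetical wrapper for the statically allocated flavor. */
struct srcu_struct_static {
	struct srcu_struct srcu;
	int c[2];	/* stand-in for a DEFINE_PER_CPU() variable */
};

/* Hypothetical wrapper for the dynamically allocated flavor. */
struct srcu_struct_dynamic {
	struct srcu_struct srcu;
	int *c;		/* stand-in for alloc_percpu() storage */
};

static int srcu_read_lock_static(struct srcu_struct_static *sp)
{
	int idx = sp->srcu.completed & 0x1;

	sp->c[idx]++;
	return idx;
}

static int srcu_read_lock_dynamic(struct srcu_struct_dynamic *sp)
{
	int idx = sp->srcu.completed & 0x1;

	assert(sp->c != 0);	/* dynamic storage must have been set up */
	sp->c[idx]++;
	return idx;
}

/*
 * One user-visible name.  The primitive is chosen at compile time from
 * the pointed-to type, so no runtime test is needed.  The (void *) casts
 * keep the branch that is not selected from drawing a type diagnostic.
 */
#define srcu_read_lock(sp)						\
	__builtin_choose_expr(						\
		__builtin_types_compatible_p(__typeof__(*(sp)),		\
					     struct srcu_struct_static),	\
		srcu_read_lock_static((void *)(sp)),			\
		srcu_read_lock_dynamic((void *)(sp)))
```

A caller would write srcu_read_lock(&s) for either kind of structure, and a pointer to any other type would silently take the dynamic branch, which is one of the hazards of this style.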
The goals I believe we are trying to attain with SRCU include:
a. Minimal read-side overhead. This goal favors 2 and 3.
(Yes, blocking is so expensive that the extra check is
"in the noise" if we block on the read side -- but I
expect uses where blocking can happen but is extremely
rare.)
b. Minimal API expansion. This goal favors 1 and 3.
c. Simple and straightforward use of well-understood and
timeworn features of gcc. This goal favors 1 and 2.
Based on this breakdown, we have a three-way tie. I tend to pay less
much attention to (c), which would lead me to choose #2.
Thoughts? Other important goals? Better yet, other approaches?
Thanx, Paul
* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
[not found] ` <Pine.LNX.4.44L0.0607071345270.6793-100000@iolanthe.rowland.org>
@ 2006-07-07 18:59 ` Paul E. McKenney
2006-07-07 19:59 ` Alan Stern
0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-07 18:59 UTC (permalink / raw)
To: Alan Stern
Cc: Matt Helsley, Andrew Morton, dipankar, Ingo Molnar, tytso,
Darren Hart, oleg, Jes Sorensen, linux-kernel
On Fri, Jul 07, 2006 at 02:02:27PM -0400, Alan Stern wrote:
> On Fri, 7 Jul 2006, Paul E. McKenney wrote:
>
> > > Note that this approach won't work when you need to do something like:
> > >
> > > struct xyz {
> > > struct srcu_struct s;
> > > } the_xyz = {
> > > .s = /* What goes here? */
> > > };
> >
> > Yep, this is the same issue leading to my complaint below about not being
> > able to pass a pointer to the resulting srcu_struct.
>
> No, not really. The problem here is that you can't use DEFINE_PER_CPU
> inside the initializer for the_xyz. The problem about not being able to
> pass a pointer is easily fixed; this problem is not so easy.
Both symptoms of the same problem in my view, but I agree that other
perspectives are possible and perhaps even useful. ;-)
We agree on the important thing, which is that the approach I was
calling out in the earlier email has some severe shortcomings, and
that we therefore need to do something different.
> > Another approach I looked at was statically allocating a struct
> > percpu_data, but initializing it seems to be problematic.
> >
> > So here are the three approaches that seem to have some chance
> > of working:
> >
> > 1. Your approach of dynamically selecting between the
> > per_cpu_ptr() and per_cpu() APIs based on a flag
> > within the structure.
>
> Or a function pointer within the structure.
Agreed, either a function pointer or a flag.
> > 2. Creating a pair of SRCU APIs, reflecting the two
> > underlying per-CPU APIs (one for statically allocated
> > per-CPU variables, the other for dynamically allocated
> > per-CPU variables).
>
> This seems ridiculous. It would be much better IMO to come up with a
> least-common-multiple API that would apply to both sorts of variables.
> For example, per-cpu data could be represented by _both_ a pointer and a
> table instead of just a pointer (static) or just a table (dynamic).
No argument here.
> > 3. A compile-time translation layer, making use of
> > two different structure types and a bit of gcc
> > type comparison. The idea would be to create
> > a srcu_struct_static and a srcu_struct_dynamic
> > structure that contained a pointer to the corresponding
> > per-CPU variable and an srcu_struct, and to have
> > a set of macros that did a typeof comparison, selecting
> > the appropriate underlying primitive from the set
> > of two.
> >
> > This is essentially #2, but with some cpp/typeof
> > magic to make it look to the user of SRCU that there
> > is but one API.
>
> This would add tremendous complexity, in terms of how the API is
> implemented, for no very good reason. Programming is hard enough
> already...
Leaving out the "tremendous", yes, there would be some machinations.
It would certainly be OK by me if this can be avoided. ;-)
> > The goals I believe we are trying to attain with SRCU include:
> >
> > a. Minimal read-side overhead. This goal favors 2 and 3.
> > (Yes, blocking is so expensive that the extra check is
> > "in the noise" if we block on the read side -- but I
> > expect uses where blocking can happen but is extremely
> > rare.)
> >
> > b. Minimal API expansion. This goal favors 1 and 3.
> >
> > c. Simple and straightforward use of well-understood and
> > timeworn features of gcc. This goal favors 1 and 2.
> >
> > Based on this breakdown, we have a three-way tie. I tend to pay less
> > much attention to (c), which would lead me to choose #2.
> >
> > Thoughts? Other important goals? Better yet, other approaches?
>
> I think it's foolish for us to waste a tremendous amount of time on this
> when the real problem is the poor design of the per-cpu API. Fix that,
> and most of the difficulties will be gone.
If the per-CPU API was reasonably unifiable, I expect that it would
already be unified. The problem is that the easy ways to unify it hit
some extremely hot code paths with extra cache misses -- for example, one
could add a struct percpu_data to each and every static DEFINE_PER_CPU(),
but at the cost of an extra cache line touched and extra indirection
-- which I believe was deemed unacceptable -- and would introduce
initialization difficulties for the static case.
So, a fourth possibility -- can a call from start_kernel() invoke some
function in your and Matt's code that invokes init_srcu_struct() to get a
statically allocated srcu_struct initialized? Or, if this is part of
a module, can the module initialization function do this work?
(Hey, I had to ask!)
Thanx, Paul
* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
2006-07-07 18:59 ` Paul E. McKenney
@ 2006-07-07 19:59 ` Alan Stern
2006-07-07 21:11 ` Matt Helsley
0 siblings, 1 reply; 13+ messages in thread
From: Alan Stern @ 2006-07-07 19:59 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Matt Helsley, Andrew Morton, dipankar, Ingo Molnar, tytso,
Darren Hart, oleg, Jes Sorensen, linux-kernel
On Fri, 7 Jul 2006, Paul E. McKenney wrote:
> > I think it's foolish for us to waste a tremendous amount of time on this
> > when the real problem is the poor design of the per-cpu API. Fix that,
> > and most of the difficulties will be gone.
>
> If the per-CPU API was reasonably unifiable, I expect that it would
> already be unified. The problem is that the easy ways to unify it hit
> some extremely hot code paths with extra cache misses -- for example, one
> could add a struct percpu_data to each and every static DEFINE_PER_CPU(),
> but at the cost of an extra cache line touched and extra indirection
> -- which I believe was deemed unacceptable -- and would introduce
> initialization difficulties for the static case.
Here's a sketch of a possible approach. Generalizing it and making it
look pretty are left as exercises for the reader. :-)
In srcu_struct, along with
struct srcu_struct_array *per_cpu_ref;
add
struct percpu_data *per_cpu_table;
Dynamic initialization does:
sp->per_cpu_ref = NULL;
sp->per_cpu_table = alloc_percpu(...);
Static initialization does:
sp->per_cpu_ref = PER_CPU_ADDRESS(...); /* My macro
from before; gives the address of the static variable */
sp->per_cpu_table = (struct percpu_data *)
~(unsigned long) __per_cpu_offset;
Then the unified_per_cpu_ptr(ref, table, cpu) macro would expand to
something like this:
({
struct percpu_data *t = (struct percpu_data *)~(unsigned long)(table);
RELOC_HIDE(ref, t->ptrs[cpu]);
})
Making this work right would of course require knowledge of the intimate
details of both include/linux/percpu.h and include/asmXXX/percpu.h.
There's some ambiguity about what t above points to: a structure
containing an array of pointers to void, or an array of unsigned longs.
Fortunately I think it doesn't matter.
Doing it this way would not incur any extra cache misses, except for the
need to store an extra member in srcu_struct.
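A user-space model of the idea, for anyone who wants to experiment with it. This is a sketch under several assumptions: arrays stand in for the static per-CPU area and for alloc_percpu() storage, the table is a plain array of uintptr_t, the pointer complementing is left out for clarity, and the result relies on pointer/integer conversions behaving in the usual flat-address-space way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4	/* illustrative CPU count */

struct srcu_struct_array {
	int c[2];
};

/* Simulated static per-CPU area: one copy of the variable per CPU. */
static struct srcu_struct_array static_area[NR_CPUS];

/* Simulated dynamic per-CPU allocation: separately placed copies. */
static struct srcu_struct_array dynamic_copy[NR_CPUS];

/* Unified lookup: the result is always ref + table[cpu]. */
static struct srcu_struct_array *
unified_per_cpu_ptr(struct srcu_struct_array *ref,
		    const uintptr_t *table, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return (struct srcu_struct_array *)((uintptr_t)ref + table[cpu]);
}

/* Static case: ref is CPU 0's copy, table holds byte offsets per CPU. */
static void fill_static_table(uintptr_t *table)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		table[cpu] = (uintptr_t)&static_area[cpu] -
			     (uintptr_t)&static_area[0];
}

/* Dynamic case: ref is NULL, table holds the address of each copy. */
static void fill_dynamic_table(uintptr_t *table)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		table[cpu] = (uintptr_t)&dynamic_copy[cpu];
}
```

In both cases the lookup is one table load and one addition, which is the point of the scheme: the reader never tests which kind of structure it was handed.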
> So, a fourth possibility -- can a call from start_kernel() invoke some
> function in your and Matt's code that invokes init_srcu_struct() to get a
> statically allocated srcu_struct initialized? Or, if this is part of
> a module, can the module initialization function do this work?
>
> (Hey, I had to ask!)
That is certainly a viable approach: just force everyone to use dynamic
initialization. Changes to existing code would be relatively few.
I'm not sure where the right place would be to add these initialization
calls. After kmalloc is working but before the relevant notifier chains
get used at all. Is there such a place? I guess it depends on which
notifier chains we convert.
We might want to leave some chains using the existing rw-semaphore API.
It's more appropriate when there's a high frequency of write-locking
(i.e., things registering or unregistering on the notifier chain). The
SRCU approach is more appropriate when the chain is called a lot and
needs to have low overhead, but (un)registration is uncommon. Matt's task
notifiers are a good example.
Alan Stern
* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
2006-07-07 19:59 ` Alan Stern
@ 2006-07-07 21:11 ` Matt Helsley
2006-07-07 21:47 ` Paul E. McKenney
2006-07-10 19:11 ` SRCU-based notifier chains Alan Stern
0 siblings, 2 replies; 13+ messages in thread
From: Matt Helsley @ 2006-07-07 21:11 UTC (permalink / raw)
To: Alan Stern
Cc: Paul E. McKenney, Andrew Morton, dipankar, Ingo Molnar, tytso,
Darren Hart, oleg, Jes Sorensen, LKML
On Fri, 2006-07-07 at 15:59 -0400, Alan Stern wrote:
> On Fri, 7 Jul 2006, Paul E. McKenney wrote:
<snip>
> > So, a fourth possibility -- can a call from start_kernel() invoke some
> > function in your and Matt's code that invokes init_srcu_struct() to get a
> > statically allocated srcu_struct initialized? Or, if this is part of
> > a module, can the module initialization function do this work?
> >
> > (Hey, I had to ask!)
>
> That is certainly a viable approach: just force everyone to use dynamic
> initialization. Changes to existing code would be relatively few.
Works for me. I've been working on patches for Andrew's multi-chain
proposal and I could use an init function there anyway. Should be faster
too -- dynamically-allocated per-cpu memory can take advantage of
node-local memory whereas, to my knowledge, statically-allocated cannot.
> I'm not sure where the right place would be to add these initialization
> calls. After kmalloc is working but before the relevant notifier chains
> get used at all. Is there such a place? I guess it depends on which
> notifier chains we convert.
>
> We might want to leave some chains using the existing rw-semaphore API.
> It's more appropriate when there's a high frequency of write-locking
> (i.e., things registering or unregistering on the notifier chain). The
> SRCU approach is more appropriate when the chain is called a lot and
> needs to have low overhead, but (un)registration is uncommon. Matt's task
> notifiers are a good example.
Yes, it is an excellent example.
> Alan Stern
Cheers,
-Matt Helsley
* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
2006-07-07 21:11 ` Matt Helsley
@ 2006-07-07 21:47 ` Paul E. McKenney
2006-07-10 19:11 ` SRCU-based notifier chains Alan Stern
1 sibling, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-07 21:47 UTC (permalink / raw)
To: Matt Helsley
Cc: Alan Stern, Andrew Morton, dipankar, Ingo Molnar, tytso,
Darren Hart, oleg, Jes Sorensen, LKML
On Fri, Jul 07, 2006 at 02:11:26PM -0700, Matt Helsley wrote:
> On Fri, 2006-07-07 at 15:59 -0400, Alan Stern wrote:
> > On Fri, 7 Jul 2006, Paul E. McKenney wrote:
>
> <snip>
>
> > > So, a fourth possibility -- can a call from start_kernel() invoke some
> > > function in your and Matt's code that invokes init_srcu_struct() to get a
> > > statically allocated srcu_struct initialized? Or, if this is part of
> > > a module, can the module initialization function do this work?
> > >
> > > (Hey, I had to ask!)
> >
> > That is certainly a viable approach: just force everyone to use dynamic
> > initialization. Changes to existing code would be relatively few.
>
> Works for me. I've been working on patches for Andrew's multi-chain
> proposal and I could use an init function there anyway. Should be faster
> too -- dynamically-allocated per-cpu memory can take advantage of
> node-local memory whereas, to my knowledge, statically-allocated cannot.
Sounds very good to me! ;-)
> > I'm not sure where the right place would be to add these initialization
> > calls. After kmalloc is working but before the relevant notifier chains
> > get used at all. Is there such a place? I guess it depends on which
> > notifier chains we convert.
> >
> > We might want to leave some chains using the existing rw-semaphore API.
> > It's more appropriate when there's a high frequency of write-locking
> > (i.e., things registering or unregistering on the notifier chain). The
> > SRCU approach is more appropriate when the chain is called a lot and
> > needs to have low overhead, but (un)registration is uncommon. Matt's task
> > notifiers are a good example.
>
> Yes, it is an excellent example.
Good!!! Please let me know how it goes. I will shelve the idea of
statically allocated per-CPU data for srcu_struct for the moment.
If some other application shows up that needs it, I will revisit.
Thanx, Paul
* SRCU-based notifier chains
2006-07-07 21:11 ` Matt Helsley
2006-07-07 21:47 ` Paul E. McKenney
@ 2006-07-10 19:11 ` Alan Stern
2006-07-11 17:39 ` Paul E. McKenney
1 sibling, 1 reply; 13+ messages in thread
From: Alan Stern @ 2006-07-10 19:11 UTC (permalink / raw)
To: Matt Helsley
Cc: Paul E. McKenney, Andrew Morton, dipankar, Ingo Molnar, tytso,
Darren Hart, oleg, Jes Sorensen, LKML
On Fri, 7 Jul 2006, Matt Helsley wrote:
> On Fri, 2006-07-07 at 15:59 -0400, Alan Stern wrote:
> > On Fri, 7 Jul 2006, Paul E. McKenney wrote:
>
> <snip>
>
> > > So, a fourth possibility -- can a call from start_kernel() invoke some
> > > function in your and Matt's code that invokes init_srcu_struct() to get a
> > > statically allocated srcu_struct initialized? Or, if this is part of
> > > a module, can the module initialization function do this work?
> > >
> > > (Hey, I had to ask!)
> >
> > That is certainly a viable approach: just force everyone to use dynamic
> > initialization. Changes to existing code would be relatively few.
>
> Works for me. I've been working on patches for Andrew's multi-chain
> proposal and I could use an init function there anyway. Should be faster
> too -- dynamically-allocated per-cpu memory can take advantage of
> node-local memory whereas, to my knowledge, statically-allocated cannot.
>
> > I'm not sure where the right place would be to add these initialization
> > calls. After kmalloc is working but before the relevant notifier chains
> > get used at all. Is there such a place? I guess it depends on which
> > notifier chains we convert.
> >
> > We might want to leave some chains using the existing rw-semaphore API.
> > It's more appropriate when there's a high frequency of write-locking
> > (i.e., things registering or unregistering on the notifier chain). The
> > SRCU approach is more appropriate when the chain is called a lot and
> > needs to have low overhead, but (un)registration is uncommon. Matt's task
> > notifiers are a good example.
>
> Yes, it is an excellent example.
Okay, here is a patch -- completely untested but it compiles -- that adds
a new kind of notifier head, using SRCU to manage the list consistency.
At the moment I don't have any good candidates for blocking notifier
chains that should be converted to SRCU notifier chains, although some of
the things in the networking core probably qualify.
Anyway, you can try this out with your task notifiers to make sure it
works as desired.
Alan Stern
P.S.: For this to work, the patch had to add "#ifndef _LINUX_SRCU_H"
guards to include/linux/srcu.h. They undoubtedly belong there regardless.
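For readers who have not looked at the underlying helpers, here is a user-space model of the chain semantics that all the notifier variants share: priority-ordered registration, removal that reports -ENOENT when the entry is absent, and a call loop that stops once a callback sets NOTIFY_STOP_MASK. It is a sketch only. The model_* functions and the two sample callbacks are invented for illustration, and the mutex and SRCU protection that the new head type adds are deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NOTIFY_DONE		0x0000
#define NOTIFY_OK		0x0001
#define NOTIFY_STOP_MASK	0x8000
#define NOTIFY_STOP		(NOTIFY_OK | NOTIFY_STOP_MASK)

struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long val, void *v);
	struct notifier_block *next;
	int priority;
};

/* Insert n ahead of the first entry with lower priority. */
static int model_chain_register(struct notifier_block **nl,
				struct notifier_block *n)
{
	while (*nl != NULL && n->priority <= (*nl)->priority)
		nl = &(*nl)->next;
	n->next = *nl;
	*nl = n;
	return 0;
}

/* Unlink n, or return -ENOENT if it is not on the chain. */
static int model_chain_unregister(struct notifier_block **nl,
				  struct notifier_block *n)
{
	while (*nl != NULL) {
		if (*nl == n) {
			*nl = n->next;
			return 0;
		}
		nl = &(*nl)->next;
	}
	return -ENOENT;
}

/* Call each entry in turn, stopping early if NOTIFY_STOP_MASK is set. */
static int model_call_chain(struct notifier_block **nl,
			    unsigned long val, void *v)
{
	int ret = NOTIFY_DONE;
	struct notifier_block *nb = *nl;

	while (nb != NULL) {
		ret = nb->notifier_call(nb, val, v);
		if ((ret & NOTIFY_STOP_MASK) == NOTIFY_STOP_MASK)
			break;
		nb = nb->next;
	}
	return ret;
}

/* Sample callback: counts the invocation through v and continues. */
static int count_and_continue(struct notifier_block *nb,
			      unsigned long val, void *v)
{
	(void)nb;
	(void)val;
	++*(int *)v;
	return NOTIFY_OK;
}

/* Sample callback: counts the invocation through v and halts the chain. */
static int count_and_stop(struct notifier_block *nb,
			  unsigned long val, void *v)
{
	(void)nb;
	(void)val;
	++*(int *)v;
	return NOTIFY_STOP;
}
```

In the SRCU variant, the call loop would sit between srcu_read_lock() and srcu_read_unlock(), and the unregister path would finish with synchronize_srcu() so that a removed block is no longer in use by the time the caller frees it.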
Index: usb-2.6/kernel/sys.c
===================================================================
--- usb-2.6.orig/kernel/sys.c
+++ usb-2.6/kernel/sys.c
@@ -151,7 +151,7 @@ static int __kprobes notifier_call_chain
/*
* Atomic notifier chain routines. Registration and unregistration
- * use a mutex, and call_chain is synchronized by RCU (no locks).
+ * use a spinlock, and call_chain is synchronized by RCU (no locks).
*/
/**
@@ -399,6 +399,128 @@ int raw_notifier_call_chain(struct raw_n
EXPORT_SYMBOL_GPL(raw_notifier_call_chain);
+/*
+ * SRCU notifier chain routines. Registration and unregistration
+ * use a mutex, and call_chain is synchronized by SRCU (no locks).
+ */
+
+/**
+ * srcu_notifier_chain_register - Add notifier to an SRCU notifier chain
+ * @nh: Pointer to head of the SRCU notifier chain
+ * @n: New entry in notifier chain
+ *
+ * Adds a notifier to an SRCU notifier chain.
+ * Must be called in process context.
+ *
+ * Currently always returns zero.
+ */
+
+int srcu_notifier_chain_register(struct srcu_notifier_head *nh,
+ struct notifier_block *n)
+{
+ int ret;
+
+ /*
+ * This code gets used during boot-up, when task switching is
+ * not yet working and interrupts must remain disabled. At
+ * such times we must not call mutex_lock().
+ */
+ if (unlikely(system_state == SYSTEM_BOOTING))
+ return notifier_chain_register(&nh->head, n);
+
+ mutex_lock(&nh->mutex);
+ ret = notifier_chain_register(&nh->head, n);
+ mutex_unlock(&nh->mutex);
+ return ret;
+}
+
+EXPORT_SYMBOL_GPL(srcu_notifier_chain_register);
+
+/**
+ * srcu_notifier_chain_unregister - Remove notifier from an SRCU notifier chain
+ * @nh: Pointer to head of the SRCU notifier chain
+ * @n: Entry to remove from notifier chain
+ *
+ * Removes a notifier from an SRCU notifier chain.
+ * Must be called from process context.
+ *
+ * Returns zero on success or %-ENOENT on failure.
+ */
+int srcu_notifier_chain_unregister(struct srcu_notifier_head *nh,
+ struct notifier_block *n)
+{
+ int ret;
+
+ /*
+ * This code gets used during boot-up, when task switching is
+ * not yet working and interrupts must remain disabled. At
+ * such times we must not call mutex_lock().
+ */
+ if (unlikely(system_state == SYSTEM_BOOTING))
+ return notifier_chain_unregister(&nh->head, n);
+
+ mutex_lock(&nh->mutex);
+ ret = notifier_chain_unregister(&nh->head, n);
+ mutex_unlock(&nh->mutex);
+ synchronize_srcu(&nh->srcu);
+ return ret;
+}
+
+EXPORT_SYMBOL_GPL(srcu_notifier_chain_unregister);
+
+/**
+ * srcu_notifier_call_chain - Call functions in an SRCU notifier chain
+ * @nh: Pointer to head of the SRCU notifier chain
+ * @val: Value passed unmodified to notifier function
+ * @v: Pointer passed unmodified to notifier function
+ *
+ * Calls each function in a notifier chain in turn. The functions
+ * run in a process context, so they are allowed to block.
+ *
+ * If the return value of the notifier can be and'ed
+ * with %NOTIFY_STOP_MASK then srcu_notifier_call_chain
+ * will return immediately, with the return value of
+ * the notifier function which halted execution.
+ * Otherwise the return value is the return value
+ * of the last notifier function called.
+ */
+
+int srcu_notifier_call_chain(struct srcu_notifier_head *nh,
+ unsigned long val, void *v)
+{
+ int ret;
+ int idx;
+
+ idx = srcu_read_lock(&nh->srcu);
+ ret = notifier_call_chain(&nh->head, val, v);
+ srcu_read_unlock(&nh->srcu, idx);
+ return ret;
+}
+
+EXPORT_SYMBOL_GPL(srcu_notifier_call_chain);
+
+/**
+ * srcu_init_notifier_head - Initialize an SRCU notifier head
+ * @nh: Pointer to head of the srcu notifier chain
+ *
+ * Unlike other sorts of notifier heads, SRCU notifier heads require
+ * dynamic initialization. Be sure to call this routine before
+ * calling any of the other SRCU notifier routines for this head.
+ *
+ * If an SRCU notifier head is deallocated, it must first be cleaned
+ * up by calling srcu_cleanup_notifier_head(). Otherwise the head's
+ * per-cpu data (used by the SRCU mechanism) will leak.
+ */
+
+void srcu_init_notifier_head(struct srcu_notifier_head *nh)
+{
+ mutex_init(&nh->mutex);
+ init_srcu_struct(&nh->srcu);
+ nh->head = NULL;
+}
+
+EXPORT_SYMBOL_GPL(srcu_init_notifier_head);
+
/**
* register_reboot_notifier - Register function to be called at reboot time
* @nb: Info about notifier function to be called
Index: usb-2.6/include/linux/notifier.h
===================================================================
--- usb-2.6.orig/include/linux/notifier.h
+++ usb-2.6/include/linux/notifier.h
@@ -12,9 +12,10 @@
#include <linux/errno.h>
#include <linux/mutex.h>
#include <linux/rwsem.h>
+#include <linux/srcu.h>
/*
- * Notifier chains are of three types:
+ * Notifier chains are of four types:
*
* Atomic notifier chains: Chain callbacks run in interrupt/atomic
* context. Callouts are not allowed to block.
@@ -23,13 +24,27 @@
* Raw notifier chains: There are no restrictions on callbacks,
* registration, or unregistration. All locking and protection
* must be provided by the caller.
+ * SRCU notifier chains: A variant of blocking notifier chains, with
+ * the same restrictions.
*
* atomic_notifier_chain_register() may be called from an atomic context,
- * but blocking_notifier_chain_register() must be called from a process
- * context. Ditto for the corresponding _unregister() routines.
+ * but blocking_notifier_chain_register() and srcu_notifier_chain_register()
+ * must be called from a process context. Ditto for the corresponding
+ * _unregister() routines.
*
- * atomic_notifier_chain_unregister() and blocking_notifier_chain_unregister()
- * _must not_ be called from within the call chain.
+ * atomic_notifier_chain_unregister(), blocking_notifier_chain_unregister(),
+ * and srcu_notifier_chain_unregister() _must not_ be called from within
+ * the call chain.
+ *
+ * SRCU notifier chains are an alternative form of blocking notifier chains.
+ * They use SRCU (Sleepable Read-Copy Update) instead of rw-semaphores for
+ * protection of the chain links. This means there is _very_ low overhead
+ * in srcu_notifier_call_chain(): no cache misses and no memory barriers.
+ * As compensation, srcu_notifier_chain_unregister() is rather expensive.
+ * SRCU notifier chains should be used when the chain will be called very
+ * often but notifier_blocks will seldom be removed. Also, SRCU notifier
+ * chains are slightly more difficult to use because they require dynamic
+ * runtime initialization.
*/
struct notifier_block {
@@ -52,6 +67,12 @@ struct raw_notifier_head {
struct notifier_block *head;
};
+struct srcu_notifier_head {
+ struct mutex mutex;
+ struct srcu_struct srcu;
+ struct notifier_block *head;
+};
+
#define ATOMIC_INIT_NOTIFIER_HEAD(name) do { \
spin_lock_init(&(name)->lock); \
(name)->head = NULL; \
@@ -64,6 +85,11 @@ struct raw_notifier_head {
(name)->head = NULL; \
} while (0)
+/* srcu_notifier_heads must be initialized and cleaned up dynamically */
+extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
+#define srcu_cleanup_notifier_head(name) \
+ cleanup_srcu_struct(&(name)->srcu);
+
#define ATOMIC_NOTIFIER_INIT(name) { \
.lock = __SPIN_LOCK_UNLOCKED(name.lock), \
.head = NULL }
@@ -72,6 +98,7 @@ struct raw_notifier_head {
.head = NULL }
#define RAW_NOTIFIER_INIT(name) { \
.head = NULL }
+/* srcu_notifier_heads cannot be initialized statically */
#define ATOMIC_NOTIFIER_HEAD(name) \
struct atomic_notifier_head name = \
@@ -91,6 +118,8 @@ extern int blocking_notifier_chain_regis
struct notifier_block *);
extern int raw_notifier_chain_register(struct raw_notifier_head *,
struct notifier_block *);
+extern int srcu_notifier_chain_register(struct srcu_notifier_head *,
+ struct notifier_block *);
extern int atomic_notifier_chain_unregister(struct atomic_notifier_head *,
struct notifier_block *);
@@ -98,6 +127,8 @@ extern int blocking_notifier_chain_unreg
struct notifier_block *);
extern int raw_notifier_chain_unregister(struct raw_notifier_head *,
struct notifier_block *);
+extern int srcu_notifier_chain_unregister(struct srcu_notifier_head *,
+ struct notifier_block *);
extern int atomic_notifier_call_chain(struct atomic_notifier_head *,
unsigned long val, void *v);
@@ -105,6 +136,8 @@ extern int blocking_notifier_call_chain(
unsigned long val, void *v);
extern int raw_notifier_call_chain(struct raw_notifier_head *,
unsigned long val, void *v);
+extern int srcu_notifier_call_chain(struct srcu_notifier_head *,
+ unsigned long val, void *v);
#define NOTIFY_DONE 0x0000 /* Don't care */
#define NOTIFY_OK 0x0001 /* Suits me */
Index: usb-2.6/include/linux/srcu.h
===================================================================
--- usb-2.6.orig/include/linux/srcu.h
+++ usb-2.6/include/linux/srcu.h
@@ -24,6 +24,9 @@
*
*/
+#ifndef _LINUX_SRCU_H
+#define _LINUX_SRCU_H
+
struct srcu_struct_array {
int c[2];
};
@@ -47,3 +50,5 @@ void srcu_read_unlock(struct srcu_struct
void synchronize_srcu(struct srcu_struct *sp);
long srcu_batches_completed(struct srcu_struct *sp);
void cleanup_srcu_struct(struct srcu_struct *sp);
+
+#endif
* Re: SRCU-based notifier chains
2006-07-10 19:11 ` SRCU-based notifier chains Alan Stern
@ 2006-07-11 17:39 ` Paul E. McKenney
2006-07-11 18:03 ` Alan Stern
2006-07-11 18:18 ` [PATCH] Add " Alan Stern
0 siblings, 2 replies; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-11 17:39 UTC (permalink / raw)
To: Alan Stern
Cc: Matt Helsley, Andrew Morton, dipankar, Ingo Molnar, tytso,
Darren Hart, oleg, Jes Sorensen, LKML
On Mon, Jul 10, 2006 at 03:11:31PM -0400, Alan Stern wrote:
> On Fri, 7 Jul 2006, Matt Helsley wrote:
> > On Fri, 2006-07-07 at 15:59 -0400, Alan Stern wrote:
[ . . . ]
> > > We might want to leave some chains using the existing rw-semaphore API.
> > > It's more appropriate when there's a high frequency of write-locking
> > > (i.e., things registering or unregistering on the notifier chain). The
> > > SRCU approach is more appropriate when the chain is called a lot and
> > > needs to have low overhead, but (un)registration is uncommon. Matt's task
> > > notifiers are a good example.
> >
> > Yes, it is an excellent example.
>
> Okay, here is a patch -- completely untested but it compiles -- that adds
> a new kind of notifier head, using SRCU to manage the list consistency.
>
> At the moment I don't have any good candidates for blocking notifier
> chains that should be converted to SRCU notifier chains, although some of
> the things in the networking core probably qualify.
>
> Anyway, you can try this out with your task notifiers to make sure it
> works as desired.
>
> Alan Stern
>
> P.S.: For this to work, the patch had to add "#ifndef _LINUX_SRCU_H"
> guards to include/linux/srcu.h. They undoubtedly belong there regardless.
Looks sane to me. A couple of minor comments interspersed.
Thanx, Paul
Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
> Index: usb-2.6/kernel/sys.c
> ===================================================================
> --- usb-2.6.orig/kernel/sys.c
> +++ usb-2.6/kernel/sys.c
> @@ -151,7 +151,7 @@ static int __kprobes notifier_call_chain
>
> /*
> * Atomic notifier chain routines. Registration and unregistration
> - * use a mutex, and call_chain is synchronized by RCU (no locks).
> + * use a spinlock, and call_chain is synchronized by RCU (no locks).
> */
>
> /**
> @@ -399,6 +399,128 @@ int raw_notifier_call_chain(struct raw_n
>
> EXPORT_SYMBOL_GPL(raw_notifier_call_chain);
>
> +/*
> + * SRCU notifier chain routines. Registration and unregistration
> + * use a mutex, and call_chain is synchronized by SRCU (no locks).
> + */
Hmmm... Probably just me failing to pay attention, but I haven't noticed
the double-header-comment style before.
> +/**
> + * srcu_notifier_chain_register - Add notifier to an SRCU notifier chain
> + * @nh: Pointer to head of the SRCU notifier chain
> + * @n: New entry in notifier chain
> + *
> + * Adds a notifier to an SRCU notifier chain.
> + * Must be called in process context.
> + *
> + * Currently always returns zero.
> + */
> +
> +int srcu_notifier_chain_register(struct srcu_notifier_head *nh,
> + struct notifier_block *n)
> +{
> + int ret;
> +
> + /*
> + * This code gets used during boot-up, when task switching is
> + * not yet working and interrupts must remain disabled. At
> + * such times we must not call mutex_lock().
> + */
> + if (unlikely(system_state == SYSTEM_BOOTING))
> + return notifier_chain_register(&nh->head, n);
> +
> + mutex_lock(&nh->mutex);
> + ret = notifier_chain_register(&nh->head, n);
> + mutex_unlock(&nh->mutex);
> + return ret;
> +}
> +
> +EXPORT_SYMBOL_GPL(srcu_notifier_chain_register);
> +
> +/**
> + * srcu_notifier_chain_unregister - Remove notifier from an SRCU notifier chain
> + * @nh: Pointer to head of the SRCU notifier chain
> + * @n: Entry to remove from notifier chain
> + *
> + * Removes a notifier from an SRCU notifier chain.
> + * Must be called from process context.
> + *
> + * Returns zero on success or %-ENOENT on failure.
> + */
> +int srcu_notifier_chain_unregister(struct srcu_notifier_head *nh,
> + struct notifier_block *n)
> +{
> + int ret;
> +
> + /*
> + * This code gets used during boot-up, when task switching is
> + * not yet working and interrupts must remain disabled. At
> + * such times we must not call mutex_lock().
> + */
> + if (unlikely(system_state == SYSTEM_BOOTING))
> + return notifier_chain_unregister(&nh->head, n);
> +
> + mutex_lock(&nh->mutex);
> + ret = notifier_chain_unregister(&nh->head, n);
> + mutex_unlock(&nh->mutex);
> + synchronize_srcu(&nh->srcu);
> + return ret;
> +}
> +
> +EXPORT_SYMBOL_GPL(srcu_notifier_chain_unregister);
> +
> +/**
> + * srcu_notifier_call_chain - Call functions in an SRCU notifier chain
> + * @nh: Pointer to head of the SRCU notifier chain
> + * @val: Value passed unmodified to notifier function
> + * @v: Pointer passed unmodified to notifier function
> + *
> + * Calls each function in a notifier chain in turn. The functions
> + * run in a process context, so they are allowed to block.
> + *
> + * If the return value of the notifier can be and'ed
> + * with %NOTIFY_STOP_MASK then srcu_notifier_call_chain
> + * will return immediately, with the return value of
> + * the notifier function which halted execution.
> + * Otherwise the return value is the return value
> + * of the last notifier function called.
> + */
> +
> +int srcu_notifier_call_chain(struct srcu_notifier_head *nh,
> + unsigned long val, void *v)
> +{
> + int ret;
> + int idx;
> +
> + idx = srcu_read_lock(&nh->srcu);
> + ret = notifier_call_chain(&nh->head, val, v);
> + srcu_read_unlock(&nh->srcu, idx);
> + return ret;
> +}
> +
> +EXPORT_SYMBOL_GPL(srcu_notifier_call_chain);
> +
> +/**
> + * srcu_init_notifier_head - Initialize an SRCU notifier head
> + * @nh: Pointer to head of the srcu notifier chain
> + *
> + * Unlike other sorts of notifier heads, SRCU notifier heads require
> + * dynamic initialization. Be sure to call this routine before
> + * calling any of the other SRCU notifier routines for this head.
> + *
> + * If an SRCU notifier head is deallocated, it must first be cleaned
> + * up by calling srcu_cleanup_notifier_head(). Otherwise the head's
> + * per-cpu data (used by the SRCU mechanism) will leak.
> + */
> +
> +void srcu_init_notifier_head(struct srcu_notifier_head *nh)
> +{
> + mutex_init(&nh->mutex);
> + init_srcu_struct(&nh->srcu);
> + nh->head = NULL;
> +}
> +
> +EXPORT_SYMBOL_GPL(srcu_init_notifier_head);
> +
> /**
> * register_reboot_notifier - Register function to be called at reboot time
> * @nb: Info about notifier function to be called
> Index: usb-2.6/include/linux/notifier.h
> ===================================================================
> --- usb-2.6.orig/include/linux/notifier.h
> +++ usb-2.6/include/linux/notifier.h
> @@ -12,9 +12,10 @@
> #include <linux/errno.h>
> #include <linux/mutex.h>
> #include <linux/rwsem.h>
> +#include <linux/srcu.h>
>
> /*
> - * Notifier chains are of three types:
> + * Notifier chains are of four types:
Is it possible to subsume one of the other three types?
Might not be, but have to ask...
> *
> * Atomic notifier chains: Chain callbacks run in interrupt/atomic
> * context. Callouts are not allowed to block.
> @@ -23,13 +24,27 @@
> * Raw notifier chains: There are no restrictions on callbacks,
> * registration, or unregistration. All locking and protection
> * must be provided by the caller.
> + * SRCU notifier chains: A variant of blocking notifier chains, with
> + * the same restrictions.
> *
> * atomic_notifier_chain_register() may be called from an atomic context,
> - * but blocking_notifier_chain_register() must be called from a process
> - * context. Ditto for the corresponding _unregister() routines.
> + * but blocking_notifier_chain_register() and srcu_notifier_chain_register()
> + * must be called from a process context. Ditto for the corresponding
> + * _unregister() routines.
> *
> - * atomic_notifier_chain_unregister() and blocking_notifier_chain_unregister()
> - * _must not_ be called from within the call chain.
> + * atomic_notifier_chain_unregister(), blocking_notifier_chain_unregister(),
> + * and srcu_notifier_chain_unregister() _must not_ be called from within
> + * the call chain.
> + *
> + * SRCU notifier chains are an alternative form of blocking notifier chains.
> + * They use SRCU (Sleepable Read-Copy Update) instead of rw-semaphores for
> + * protection of the chain links. This means there is _very_ low overhead
> + * in srcu_notifier_call_chain(): no cache misses and no memory barriers.
> + * As compensation, srcu_notifier_chain_unregister() is rather expensive.
> + * SRCU notifier chains should be used when the chain will be called very
> + * often but notifier_blocks will seldom be removed. Also, SRCU notifier
> + * chains are slightly more difficult to use because they require dynamic
> + * runtime initialization.
> */
>
> struct notifier_block {
> @@ -52,6 +67,12 @@ struct raw_notifier_head {
> struct notifier_block *head;
> };
>
> +struct srcu_notifier_head {
> + struct mutex mutex;
> + struct srcu_struct srcu;
> + struct notifier_block *head;
> +};
> +
> #define ATOMIC_INIT_NOTIFIER_HEAD(name) do { \
> spin_lock_init(&(name)->lock); \
> (name)->head = NULL; \
> @@ -64,6 +85,11 @@ struct raw_notifier_head {
> (name)->head = NULL; \
> } while (0)
>
> +/* srcu_notifier_heads must be initialized and cleaned up dynamically */
> +extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
> +#define srcu_cleanup_notifier_head(name) \
> + cleanup_srcu_struct(&(name)->srcu);
> +
> #define ATOMIC_NOTIFIER_INIT(name) { \
> .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
> .head = NULL }
> @@ -72,6 +98,7 @@ struct raw_notifier_head {
> .head = NULL }
> #define RAW_NOTIFIER_INIT(name) { \
> .head = NULL }
> +/* srcu_notifier_heads cannot be initialized statically */
>
> #define ATOMIC_NOTIFIER_HEAD(name) \
> struct atomic_notifier_head name = \
> @@ -91,6 +118,8 @@ extern int blocking_notifier_chain_regis
> struct notifier_block *);
> extern int raw_notifier_chain_register(struct raw_notifier_head *,
> struct notifier_block *);
> +extern int srcu_notifier_chain_register(struct srcu_notifier_head *,
> + struct notifier_block *);
>
> extern int atomic_notifier_chain_unregister(struct atomic_notifier_head *,
> struct notifier_block *);
> @@ -98,6 +127,8 @@ extern int blocking_notifier_chain_unreg
> struct notifier_block *);
> extern int raw_notifier_chain_unregister(struct raw_notifier_head *,
> struct notifier_block *);
> +extern int srcu_notifier_chain_unregister(struct srcu_notifier_head *,
> + struct notifier_block *);
>
> extern int atomic_notifier_call_chain(struct atomic_notifier_head *,
> unsigned long val, void *v);
> @@ -105,6 +136,8 @@ extern int blocking_notifier_call_chain(
> unsigned long val, void *v);
> extern int raw_notifier_call_chain(struct raw_notifier_head *,
> unsigned long val, void *v);
> +extern int srcu_notifier_call_chain(struct srcu_notifier_head *,
> + unsigned long val, void *v);
>
> #define NOTIFY_DONE 0x0000 /* Don't care */
> #define NOTIFY_OK 0x0001 /* Suits me */
> Index: usb-2.6/include/linux/srcu.h
> ===================================================================
> --- usb-2.6.orig/include/linux/srcu.h
> +++ usb-2.6/include/linux/srcu.h
> @@ -24,6 +24,9 @@
> *
> */
>
> +#ifndef _LINUX_SRCU_H
> +#define _LINUX_SRCU_H
> +
> struct srcu_struct_array {
> int c[2];
> };
> @@ -47,3 +50,5 @@ void srcu_read_unlock(struct srcu_struct
> void synchronize_srcu(struct srcu_struct *sp);
> long srcu_batches_completed(struct srcu_struct *sp);
> void cleanup_srcu_struct(struct srcu_struct *sp);
> +
> +#endif
>
* Re: SRCU-based notifier chains
2006-07-11 17:39 ` Paul E. McKenney
@ 2006-07-11 18:03 ` Alan Stern
2006-07-11 18:22 ` Paul E. McKenney
2006-07-11 18:18 ` [PATCH] Add " Alan Stern
1 sibling, 1 reply; 13+ messages in thread
From: Alan Stern @ 2006-07-11 18:03 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Matt Helsley, Andrew Morton, dipankar, Ingo Molnar, tytso,
Darren Hart, oleg, Jes Sorensen, LKML
On Tue, 11 Jul 2006, Paul E. McKenney wrote:
> Looks sane to me. A couple of minor comments interspersed.
Okay, I'll submit it with a proper writeup.
> > +/*
> > + * SRCU notifier chain routines. Registration and unregistration
> > + * use a mutex, and call_chain is synchronized by SRCU (no locks).
> > + */
>
> Hmmm... Probably just me failing to pay attention, but I haven't noticed
> the double-header-comment style before.
As far as I know, I made it up. It seemed appropriate, since the first
header applies to the entire group of three routines that follow whereas
the second header is kerneldoc just for the next function.
> > /*
> > - * Notifier chains are of three types:
> > + * Notifier chains are of four types:
>
> Is it possible to subsume one of the other three types?
>
> Might not be, but have to ask...
In principle we could replace blocking notifiers, but in practice we
can't.
We can't just substitute one for the other for two reasons: SRCU notifiers
need special initialization which the blocking notifiers don't have, and
SRCU notifiers have different time/space tradeoffs which might not be
appropriate for all existing blocking notifiers.
Alan Stern
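To make the "special initialization" point concrete, here is a rough
single-threaded userspace model of the per-cpu counters. It is only a
sketch: struct srcu_struct_array is copied from the patch, but
MODEL_NR_CPUS, struct model_srcu and every model_* function are
hypothetical names, and the real primitives also need memory barriers
and real per-cpu allocation, which this model leaves out.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 4		/* hypothetical CPU count for the model */

struct srcu_struct_array {	/* same shape as in include/linux/srcu.h */
	int c[2];
};

struct model_srcu {		/* hypothetical stand-in for struct srcu_struct */
	int completed;		/* low bit selects the active counter set */
	struct srcu_struct_array *per_cpu_ref;	/* one entry per modelled CPU */
};

/* Dynamic setup: the per-cpu array has to be allocated at run time. */
static int model_init(struct model_srcu *sp)
{
	sp->completed = 0;
	sp->per_cpu_ref = calloc(MODEL_NR_CPUS, sizeof(*sp->per_cpu_ref));
	return sp->per_cpu_ref ? 0 : -1;
}

/* Skipping this step is the leak the kerneldoc warns about. */
static void model_cleanup(struct model_srcu *sp)
{
	free(sp->per_cpu_ref);
	sp->per_cpu_ref = NULL;
}

/* Reader entry: bump this CPU's counter for the active index. */
static int model_read_lock(struct model_srcu *sp, int cpu)
{
	int idx = sp->completed & 0x1;

	sp->per_cpu_ref[cpu].c[idx]++;
	return idx;
}

/* Reader exit: may run on a different CPU than the matching lock. */
static void model_read_unlock(struct model_srcu *sp, int cpu, int idx)
{
	sp->per_cpu_ref[cpu].c[idx]--;
}

/* Sum over all CPUs: nonzero means readers of that index remain. */
static int model_readers_active(const struct model_srcu *sp, int idx)
{
	int cpu, sum = 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += sp->per_cpu_ref[cpu].c[idx];
	return sum;
}

/* One reader locks on CPU 1, sleeps, and unlocks on CPU 3. */
static int model_demo(void)
{
	struct model_srcu s;
	int idx, during, after;

	if (model_init(&s))
		return -1;
	idx = model_read_lock(&s, 1);
	during = model_readers_active(&s, idx);
	model_read_unlock(&s, 3, idx);
	after = model_readers_active(&s, idx);
	model_cleanup(&s);
	assert(during == 1 && after == 0);
	return 0;
}
```

The individual per-cpu counters can go negative when a reader
migrates; only the sum matters. That sum over separately allocated
storage is why a plain static initializer is awkward here.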
* [PATCH] Add SRCU-based notifier chains
2006-07-11 17:39 ` Paul E. McKenney
2006-07-11 18:03 ` Alan Stern
@ 2006-07-11 18:18 ` Alan Stern
2006-07-11 18:30 ` Paul E. McKenney
2006-07-12 0:56 ` Chandra Seetharaman
1 sibling, 2 replies; 13+ messages in thread
From: Alan Stern @ 2006-07-11 18:18 UTC (permalink / raw)
To: Andrew Morton
Cc: Chandra Seetharaman, Paul E. McKenney, Matt Helsley,
Benjamin LaHaise, Kernel development list
This patch (as751) adds a new type of notifier chain, based on the SRCU
(Sleepable Read-Copy Update) primitives recently added to the kernel. An
SRCU notifier chain is much like a blocking notifier chain, in that it
must be called in process context and its callout routines are allowed to
sleep. The difference is that the chain's links are protected by the SRCU
mechanism rather than by an rw-semaphore, so calling the chain has
extremely low overhead: no memory barriers and no cache-line bouncing.
On the other hand, unregistering from the chain is expensive and the chain
head requires special runtime initialization (plus cleanup if it is to be
deallocated).
SRCU notifiers are appropriate for notifiers that will be called very
frequently and for which unregistration occurs very seldom. The proposed
"task notifier" scheme qualifies, as may some of the network notifiers.
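(Illustration only, not part of the patch.) For readers who have not
looked at what "calling the chain" does, the walk can be modelled in
userspace roughly as below. This is a sketch under assumptions:
struct model_nb and the model_* and count_* functions are hypothetical
names, the real struct notifier_block has more fields, and the value
0x8000 for NOTIFY_STOP_MASK is quoted from memory of
include/linux/notifier.h rather than from this patch. No SRCU
protection is modelled.

```c
#include <assert.h>
#include <stddef.h>

#define NOTIFY_DONE		0x0000	/* Don't care */
#define NOTIFY_OK		0x0001	/* Suits me */
#define NOTIFY_STOP_MASK	0x8000	/* assumed value: stop the walk */

struct model_nb {		/* simplified stand-in for notifier_block */
	int (*notifier_call)(struct model_nb *nb, unsigned long val, void *v);
	struct model_nb *next;
};

/*
 * Call each entry in turn.  Stop early if a callback's return value
 * has NOTIFY_STOP_MASK set; otherwise return the last callback's value.
 */
static int model_call_chain(struct model_nb *head, unsigned long val, void *v)
{
	struct model_nb *nb;
	int ret = NOTIFY_DONE;

	for (nb = head; nb; nb = nb->next) {
		ret = nb->notifier_call(nb, val, v);
		if (ret & NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

/* Callback that counts its invocation and lets the walk continue. */
static int count_ok(struct model_nb *nb, unsigned long val, void *v)
{
	(void)nb;
	(void)val;
	++*(int *)v;
	return NOTIFY_OK;
}

/* Callback that counts its invocation and halts the walk. */
static int count_stop(struct model_nb *nb, unsigned long val, void *v)
{
	(void)nb;
	(void)val;
	++*(int *)v;
	return NOTIFY_STOP_MASK | NOTIFY_OK;
}

/* Chain a -> b -> c, where b stops the walk; returns callbacks run. */
static int model_chain_demo(int *ret_out)
{
	struct model_nb c = { count_ok, NULL };
	struct model_nb b = { count_stop, &c };
	struct model_nb a = { count_ok, &b };
	int calls = 0;

	*ret_out = model_call_chain(&a, 0, &calls);
	return calls;
}
```

In the patch itself the same walk is bracketed by srcu_read_lock() and
srcu_read_unlock(), which is what lets the callbacks sleep safely.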
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
---
Index: usb-2.6/kernel/sys.c
===================================================================
--- usb-2.6.orig/kernel/sys.c
+++ usb-2.6/kernel/sys.c
@@ -151,7 +151,7 @@ static int __kprobes notifier_call_chain
/*
* Atomic notifier chain routines. Registration and unregistration
- * use a mutex, and call_chain is synchronized by RCU (no locks).
+ * use a spinlock, and call_chain is synchronized by RCU (no locks).
*/
/**
@@ -399,6 +399,128 @@ int raw_notifier_call_chain(struct raw_n
EXPORT_SYMBOL_GPL(raw_notifier_call_chain);
+/*
+ * SRCU notifier chain routines. Registration and unregistration
+ * use a mutex, and call_chain is synchronized by SRCU (no locks).
+ */
+
+/**
+ * srcu_notifier_chain_register - Add notifier to an SRCU notifier chain
+ * @nh: Pointer to head of the SRCU notifier chain
+ * @n: New entry in notifier chain
+ *
+ * Adds a notifier to an SRCU notifier chain.
+ * Must be called in process context.
+ *
+ * Currently always returns zero.
+ */
+
+int srcu_notifier_chain_register(struct srcu_notifier_head *nh,
+ struct notifier_block *n)
+{
+ int ret;
+
+ /*
+ * This code gets used during boot-up, when task switching is
+ * not yet working and interrupts must remain disabled. At
+ * such times we must not call mutex_lock().
+ */
+ if (unlikely(system_state == SYSTEM_BOOTING))
+ return notifier_chain_register(&nh->head, n);
+
+ mutex_lock(&nh->mutex);
+ ret = notifier_chain_register(&nh->head, n);
+ mutex_unlock(&nh->mutex);
+ return ret;
+}
+
+EXPORT_SYMBOL_GPL(srcu_notifier_chain_register);
+
+/**
+ * srcu_notifier_chain_unregister - Remove notifier from an SRCU notifier chain
+ * @nh: Pointer to head of the SRCU notifier chain
+ * @n: Entry to remove from notifier chain
+ *
+ * Removes a notifier from an SRCU notifier chain.
+ * Must be called from process context.
+ *
+ * Returns zero on success or %-ENOENT on failure.
+ */
+int srcu_notifier_chain_unregister(struct srcu_notifier_head *nh,
+ struct notifier_block *n)
+{
+ int ret;
+
+ /*
+ * This code gets used during boot-up, when task switching is
+ * not yet working and interrupts must remain disabled. At
+ * such times we must not call mutex_lock().
+ */
+ if (unlikely(system_state == SYSTEM_BOOTING))
+ return notifier_chain_unregister(&nh->head, n);
+
+ mutex_lock(&nh->mutex);
+ ret = notifier_chain_unregister(&nh->head, n);
+ mutex_unlock(&nh->mutex);
+ synchronize_srcu(&nh->srcu);
+ return ret;
+}
+
+EXPORT_SYMBOL_GPL(srcu_notifier_chain_unregister);
+
+/**
+ * srcu_notifier_call_chain - Call functions in an SRCU notifier chain
+ * @nh: Pointer to head of the SRCU notifier chain
+ * @val: Value passed unmodified to notifier function
+ * @v: Pointer passed unmodified to notifier function
+ *
+ * Calls each function in a notifier chain in turn. The functions
+ * run in a process context, so they are allowed to block.
+ *
+ * If the return value of the notifier can be and'ed
+ * with %NOTIFY_STOP_MASK then srcu_notifier_call_chain
+ * will return immediately, with the return value of
+ * the notifier function which halted execution.
+ * Otherwise the return value is the return value
+ * of the last notifier function called.
+ */
+
+int srcu_notifier_call_chain(struct srcu_notifier_head *nh,
+ unsigned long val, void *v)
+{
+ int ret;
+ int idx;
+
+ idx = srcu_read_lock(&nh->srcu);
+ ret = notifier_call_chain(&nh->head, val, v);
+ srcu_read_unlock(&nh->srcu, idx);
+ return ret;
+}
+
+EXPORT_SYMBOL_GPL(srcu_notifier_call_chain);
+
+/**
+ * srcu_init_notifier_head - Initialize an SRCU notifier head
+ * @nh: Pointer to head of the srcu notifier chain
+ *
+ * Unlike other sorts of notifier heads, SRCU notifier heads require
+ * dynamic initialization. Be sure to call this routine before
+ * calling any of the other SRCU notifier routines for this head.
+ *
+ * If an SRCU notifier head is deallocated, it must first be cleaned
+ * up by calling srcu_cleanup_notifier_head(). Otherwise the head's
+ * per-cpu data (used by the SRCU mechanism) will leak.
+ */
+
+void srcu_init_notifier_head(struct srcu_notifier_head *nh)
+{
+ mutex_init(&nh->mutex);
+ init_srcu_struct(&nh->srcu);
+ nh->head = NULL;
+}
+
+EXPORT_SYMBOL_GPL(srcu_init_notifier_head);
+
/**
* register_reboot_notifier - Register function to be called at reboot time
* @nb: Info about notifier function to be called
Index: usb-2.6/include/linux/notifier.h
===================================================================
--- usb-2.6.orig/include/linux/notifier.h
+++ usb-2.6/include/linux/notifier.h
@@ -12,9 +12,10 @@
#include <linux/errno.h>
#include <linux/mutex.h>
#include <linux/rwsem.h>
+#include <linux/srcu.h>
/*
- * Notifier chains are of three types:
+ * Notifier chains are of four types:
*
* Atomic notifier chains: Chain callbacks run in interrupt/atomic
* context. Callouts are not allowed to block.
@@ -23,13 +24,27 @@
* Raw notifier chains: There are no restrictions on callbacks,
* registration, or unregistration. All locking and protection
* must be provided by the caller.
+ * SRCU notifier chains: A variant of blocking notifier chains, with
+ * the same restrictions.
*
* atomic_notifier_chain_register() may be called from an atomic context,
- * but blocking_notifier_chain_register() must be called from a process
- * context. Ditto for the corresponding _unregister() routines.
+ * but blocking_notifier_chain_register() and srcu_notifier_chain_register()
+ * must be called from a process context. Ditto for the corresponding
+ * _unregister() routines.
*
- * atomic_notifier_chain_unregister() and blocking_notifier_chain_unregister()
- * _must not_ be called from within the call chain.
+ * atomic_notifier_chain_unregister(), blocking_notifier_chain_unregister(),
+ * and srcu_notifier_chain_unregister() _must not_ be called from within
+ * the call chain.
+ *
+ * SRCU notifier chains are an alternative form of blocking notifier chains.
+ * They use SRCU (Sleepable Read-Copy Update) instead of rw-semaphores for
+ * protection of the chain links. This means there is _very_ low overhead
+ * in srcu_notifier_call_chain(): no cache bounces and no memory barriers.
+ * As compensation, srcu_notifier_chain_unregister() is rather expensive.
+ * SRCU notifier chains should be used when the chain will be called very
+ * often but notifier_blocks will seldom be removed. Also, SRCU notifier
+ * chains are slightly more difficult to use because they require special
+ * runtime initialization.
*/
struct notifier_block {
@@ -52,6 +67,12 @@ struct raw_notifier_head {
struct notifier_block *head;
};
+struct srcu_notifier_head {
+ struct mutex mutex;
+ struct srcu_struct srcu;
+ struct notifier_block *head;
+};
+
#define ATOMIC_INIT_NOTIFIER_HEAD(name) do { \
spin_lock_init(&(name)->lock); \
(name)->head = NULL; \
@@ -64,6 +85,11 @@ struct raw_notifier_head {
(name)->head = NULL; \
} while (0)
+/* srcu_notifier_heads must be initialized and cleaned up dynamically */
+extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
+#define srcu_cleanup_notifier_head(name) \
+ cleanup_srcu_struct(&(name)->srcu);
+
#define ATOMIC_NOTIFIER_INIT(name) { \
.lock = __SPIN_LOCK_UNLOCKED(name.lock), \
.head = NULL }
@@ -72,6 +98,7 @@ struct raw_notifier_head {
.head = NULL }
#define RAW_NOTIFIER_INIT(name) { \
.head = NULL }
+/* srcu_notifier_heads cannot be initialized statically */
#define ATOMIC_NOTIFIER_HEAD(name) \
struct atomic_notifier_head name = \
@@ -91,6 +118,8 @@ extern int blocking_notifier_chain_regis
struct notifier_block *);
extern int raw_notifier_chain_register(struct raw_notifier_head *,
struct notifier_block *);
+extern int srcu_notifier_chain_register(struct srcu_notifier_head *,
+ struct notifier_block *);
extern int atomic_notifier_chain_unregister(struct atomic_notifier_head *,
struct notifier_block *);
@@ -98,6 +127,8 @@ extern int blocking_notifier_chain_unreg
struct notifier_block *);
extern int raw_notifier_chain_unregister(struct raw_notifier_head *,
struct notifier_block *);
+extern int srcu_notifier_chain_unregister(struct srcu_notifier_head *,
+ struct notifier_block *);
extern int atomic_notifier_call_chain(struct atomic_notifier_head *,
unsigned long val, void *v);
@@ -105,6 +136,8 @@ extern int blocking_notifier_call_chain(
unsigned long val, void *v);
extern int raw_notifier_call_chain(struct raw_notifier_head *,
unsigned long val, void *v);
+extern int srcu_notifier_call_chain(struct srcu_notifier_head *,
+ unsigned long val, void *v);
#define NOTIFY_DONE 0x0000 /* Don't care */
#define NOTIFY_OK 0x0001 /* Suits me */
Index: usb-2.6/include/linux/srcu.h
===================================================================
--- usb-2.6.orig/include/linux/srcu.h
+++ usb-2.6/include/linux/srcu.h
@@ -24,6 +24,9 @@
*
*/
+#ifndef _LINUX_SRCU_H
+#define _LINUX_SRCU_H
+
struct srcu_struct_array {
int c[2];
};
@@ -47,3 +50,5 @@ void srcu_read_unlock(struct srcu_struct
void synchronize_srcu(struct srcu_struct *sp);
long srcu_batches_completed(struct srcu_struct *sp);
void cleanup_srcu_struct(struct srcu_struct *sp);
+
+#endif
* Re: SRCU-based notifier chains
2006-07-11 18:03 ` Alan Stern
@ 2006-07-11 18:22 ` Paul E. McKenney
0 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-11 18:22 UTC (permalink / raw)
To: Alan Stern
Cc: Matt Helsley, Andrew Morton, dipankar, Ingo Molnar, tytso,
Darren Hart, oleg, Jes Sorensen, LKML
On Tue, Jul 11, 2006 at 02:03:50PM -0400, Alan Stern wrote:
> On Tue, 11 Jul 2006, Paul E. McKenney wrote:
>
> > Looks sane to me. A couple of minor comments interspersed.
>
> Okay, I'll submit it with a proper writeup.
>
> > > +/*
> > > + * SRCU notifier chain routines. Registration and unregistration
> > > + * use a mutex, and call_chain is synchronized by SRCU (no locks).
> > > + */
> >
> > Hmmm... Probably just me failing to pay attention, but I haven't noticed
> > the double-header-comment style before.
>
> As far as I know, I made it up. It seemed appropriate, since the first
> header applies to the entire group of three routines that follow whereas
> the second header is kerneldoc just for the next function.
Fair enough -- I missed the fact that the first header applies to
all three functions.
> > > /*
> > > - * Notifier chains are of three types:
> > > + * Notifier chains are of four types:
> >
> > Is it possible to subsume one of the other three types?
> >
> > Might not be, but have to ask...
>
> In principle we could replace blocking notifiers, but in practice we
> can't.
>
> We can't just substitute one for the other for two reasons: SRCU notifiers
> need special initialization which the blocking notifiers don't have, and
> SRCU notifiers have different time/space tradeoffs which might not be
> appropriate for all existing blocking notifiers.
Again, fair enough!
Thanx, Paul
* Re: [PATCH] Add SRCU-based notifier chains
2006-07-11 18:18 ` [PATCH] Add " Alan Stern
@ 2006-07-11 18:30 ` Paul E. McKenney
2006-07-12 0:56 ` Chandra Seetharaman
1 sibling, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-11 18:30 UTC (permalink / raw)
To: Alan Stern
Cc: Andrew Morton, Chandra Seetharaman, Matt Helsley,
Benjamin LaHaise, Kernel development list
On Tue, Jul 11, 2006 at 02:18:53PM -0400, Alan Stern wrote:
> This patch (as751) adds a new type of notifier chain, based on the SRCU
> (Sleepable Read-Copy Update) primitives recently added to the kernel. An
> SRCU notifier chain is much like a blocking notifier chain, in that it
> must be called in process context and its callout routines are allowed to
> sleep. The difference is that the chain's links are protected by the SRCU
> mechanism rather than by an rw-semaphore, so calling the chain has
> extremely low overhead: no memory barriers and no cache-line bouncing.
> On the other hand, unregistering from the chain is expensive and the chain
> head requires special runtime initialization (plus cleanup if it is to be
> deallocated).
>
> SRCU notifiers are appropriate for notifiers that will be called very
> frequently and for which unregistration occurs very seldom. The proposed
> "task notifier" scheme qualifies, as may some of the network notifiers.
Looks good from an SRCU perspective! Looks like notifier_chain_register()
already contains the required rcu_assign_pointer(), and
notifier_call_chain() already contains the required rcu_dereference().
Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
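For anyone unfamiliar with that publish/subscribe pairing, a userspace
analogue using C11 atomics might look like the sketch below. It is an
analogy, not the kernel code: struct model_node, model_head and the
model_* functions are hypothetical names, a release store stands in
for rcu_assign_pointer(), and an acquire load stands in for
rcu_dereference() (acquire is stronger than the kernel primitive
needs). A single updater is assumed, as the mutex guarantees in the
patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct model_node {
	int payload;
	struct model_node *next;	/* set before publication only */
};

static _Atomic(struct model_node *) model_head;

/*
 * Updater side: fill in the node completely, then publish it with a
 * release store so readers never see a half-initialized entry.
 */
static void model_publish(struct model_node *n, int payload)
{
	n->payload = payload;
	n->next = atomic_load_explicit(&model_head, memory_order_relaxed);
	atomic_store_explicit(&model_head, n, memory_order_release);
}

/*
 * Reader side: fetch the head with an acquire load, then walk the
 * list.  Everything written before the matching release is visible.
 */
static int model_sum(void)
{
	struct model_node *n;
	int sum = 0;

	n = atomic_load_explicit(&model_head, memory_order_acquire);
	for (; n; n = n->next)
		sum += n->payload;
	return sum;
}

/* Publish two nodes and sum their payloads. */
static int model_publish_demo(void)
{
	static struct model_node a, b;
	int sum;

	atomic_store_explicit(&model_head, (struct model_node *)NULL,
			      memory_order_relaxed);
	model_publish(&a, 3);
	model_publish(&b, 4);
	sum = model_sum();
	assert(sum == 7);
	return sum;
}
```

The sketch covers insertion only. Removal is where the grace period
comes in: the updater must wait for existing readers before the
unlinked entry can be reused, which is the job synchronize_srcu() does
in srcu_notifier_chain_unregister().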
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
>
> ---
>
> Index: usb-2.6/kernel/sys.c
> ===================================================================
> --- usb-2.6.orig/kernel/sys.c
> +++ usb-2.6/kernel/sys.c
> @@ -151,7 +151,7 @@ static int __kprobes notifier_call_chain
>
> /*
> * Atomic notifier chain routines. Registration and unregistration
> - * use a mutex, and call_chain is synchronized by RCU (no locks).
> + * use a spinlock, and call_chain is synchronized by RCU (no locks).
> */
>
> /**
> @@ -399,6 +399,128 @@ int raw_notifier_call_chain(struct raw_n
>
> EXPORT_SYMBOL_GPL(raw_notifier_call_chain);
>
> +/*
> + * SRCU notifier chain routines. Registration and unregistration
> + * use a mutex, and call_chain is synchronized by SRCU (no locks).
> + */
> +
> +/**
> + * srcu_notifier_chain_register - Add notifier to an SRCU notifier chain
> + * @nh: Pointer to head of the SRCU notifier chain
> + * @n: New entry in notifier chain
> + *
> + * Adds a notifier to an SRCU notifier chain.
> + * Must be called in process context.
> + *
> + * Currently always returns zero.
> + */
> +
> +int srcu_notifier_chain_register(struct srcu_notifier_head *nh,
> + struct notifier_block *n)
> +{
> + int ret;
> +
> + /*
> + * This code gets used during boot-up, when task switching is
> + * not yet working and interrupts must remain disabled. At
> + * such times we must not call mutex_lock().
> + */
> + if (unlikely(system_state == SYSTEM_BOOTING))
> + return notifier_chain_register(&nh->head, n);
> +
> + mutex_lock(&nh->mutex);
> + ret = notifier_chain_register(&nh->head, n);
> + mutex_unlock(&nh->mutex);
> + return ret;
> +}
> +
> +EXPORT_SYMBOL_GPL(srcu_notifier_chain_register);
> +
> +/**
> + * srcu_notifier_chain_unregister - Remove notifier from an SRCU notifier chain
> + * @nh: Pointer to head of the SRCU notifier chain
> + * @n: Entry to remove from notifier chain
> + *
> + * Removes a notifier from an SRCU notifier chain.
> + * Must be called from process context.
> + *
> + * Returns zero on success or %-ENOENT on failure.
> + */
> +int srcu_notifier_chain_unregister(struct srcu_notifier_head *nh,
> + struct notifier_block *n)
> +{
> + int ret;
> +
> + /*
> + * This code gets used during boot-up, when task switching is
> + * not yet working and interrupts must remain disabled. At
> + * such times we must not call mutex_lock().
> + */
> + if (unlikely(system_state == SYSTEM_BOOTING))
> + return notifier_chain_unregister(&nh->head, n);
> +
> + mutex_lock(&nh->mutex);
> + ret = notifier_chain_unregister(&nh->head, n);
> + mutex_unlock(&nh->mutex);
> + synchronize_srcu(&nh->srcu);
> + return ret;
> +}
> +
> +EXPORT_SYMBOL_GPL(srcu_notifier_chain_unregister);
> +
> +/**
> + * srcu_notifier_call_chain - Call functions in an SRCU notifier chain
> + * @nh: Pointer to head of the SRCU notifier chain
> + * @val: Value passed unmodified to notifier function
> + * @v: Pointer passed unmodified to notifier function
> + *
> + * Calls each function in a notifier chain in turn. The functions
> + * run in a process context, so they are allowed to block.
> + *
> + * If the return value of the notifier can be and'ed
> + * with %NOTIFY_STOP_MASK then srcu_notifier_call_chain
> + * will return immediately, with the return value of
> + * the notifier function which halted execution.
> + * Otherwise the return value is the return value
> + * of the last notifier function called.
> + */
> +
> +int srcu_notifier_call_chain(struct srcu_notifier_head *nh,
> + unsigned long val, void *v)
> +{
> + int ret;
> + int idx;
> +
> + idx = srcu_read_lock(&nh->srcu);
> + ret = notifier_call_chain(&nh->head, val, v);
> + srcu_read_unlock(&nh->srcu, idx);
> + return ret;
> +}
> +
> +EXPORT_SYMBOL_GPL(srcu_notifier_call_chain);
> +
> +/**
> + * srcu_init_notifier_head - Initialize an SRCU notifier head
> + * @nh: Pointer to head of the srcu notifier chain
> + *
> + * Unlike other sorts of notifier heads, SRCU notifier heads require
> + * dynamic initialization. Be sure to call this routine before
> + * calling any of the other SRCU notifier routines for this head.
> + *
> + * If an SRCU notifier head is deallocated, it must first be cleaned
> + * up by calling srcu_cleanup_notifier_head(). Otherwise the head's
> + * per-cpu data (used by the SRCU mechanism) will leak.
> + */
> +
> +void srcu_init_notifier_head(struct srcu_notifier_head *nh)
> +{
> + mutex_init(&nh->mutex);
> + init_srcu_struct(&nh->srcu);
> + nh->head = NULL;
> +}
> +
> +EXPORT_SYMBOL_GPL(srcu_init_notifier_head);
> +
> /**
> * register_reboot_notifier - Register function to be called at reboot time
> * @nb: Info about notifier function to be called
> Index: usb-2.6/include/linux/notifier.h
> ===================================================================
> --- usb-2.6.orig/include/linux/notifier.h
> +++ usb-2.6/include/linux/notifier.h
> @@ -12,9 +12,10 @@
> #include <linux/errno.h>
> #include <linux/mutex.h>
> #include <linux/rwsem.h>
> +#include <linux/srcu.h>
>
> /*
> - * Notifier chains are of three types:
> + * Notifier chains are of four types:
> *
> * Atomic notifier chains: Chain callbacks run in interrupt/atomic
> * context. Callouts are not allowed to block.
> @@ -23,13 +24,27 @@
> * Raw notifier chains: There are no restrictions on callbacks,
> * registration, or unregistration. All locking and protection
> * must be provided by the caller.
> + * SRCU notifier chains: A variant of blocking notifier chains, with
> + * the same restrictions.
> *
> * atomic_notifier_chain_register() may be called from an atomic context,
> - * but blocking_notifier_chain_register() must be called from a process
> - * context. Ditto for the corresponding _unregister() routines.
> + * but blocking_notifier_chain_register() and srcu_notifier_chain_register()
> + * must be called from a process context. Ditto for the corresponding
> + * _unregister() routines.
> *
> - * atomic_notifier_chain_unregister() and blocking_notifier_chain_unregister()
> - * _must not_ be called from within the call chain.
> + * atomic_notifier_chain_unregister(), blocking_notifier_chain_unregister(),
> + * and srcu_notifier_chain_unregister() _must not_ be called from within
> + * the call chain.
> + *
> + * SRCU notifier chains are an alternative form of blocking notifier chains.
> + * They use SRCU (Sleepable Read-Copy Update) instead of rw-semaphores for
> + * protection of the chain links. This means there is _very_ low overhead
> + * in srcu_notifier_call_chain(): no cache bounces and no memory barriers.
> + * As compensation, srcu_notifier_chain_unregister() is rather expensive.
> + * SRCU notifier chains should be used when the chain will be called very
> + * often but notifier_blocks will seldom be removed. Also, SRCU notifier
> + * chains are slightly more difficult to use because they require special
> + * runtime initialization.
> */
>
> struct notifier_block {
> @@ -52,6 +67,12 @@ struct raw_notifier_head {
> struct notifier_block *head;
> };
>
> +struct srcu_notifier_head {
> +	struct mutex mutex;
> +	struct srcu_struct srcu;
> +	struct notifier_block *head;
> +};
> +
> #define ATOMIC_INIT_NOTIFIER_HEAD(name) do { \
> spin_lock_init(&(name)->lock); \
> (name)->head = NULL; \
> @@ -64,6 +85,11 @@ struct raw_notifier_head {
> (name)->head = NULL; \
> } while (0)
>
> +/* srcu_notifier_heads must be initialized and cleaned up dynamically */
> +extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
> +#define srcu_cleanup_notifier_head(name) \
> + cleanup_srcu_struct(&(name)->srcu);
> +
> #define ATOMIC_NOTIFIER_INIT(name) { \
> .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
> .head = NULL }
> @@ -72,6 +98,7 @@ struct raw_notifier_head {
> .head = NULL }
> #define RAW_NOTIFIER_INIT(name) { \
> .head = NULL }
> +/* srcu_notifier_heads cannot be initialized statically */
>
> #define ATOMIC_NOTIFIER_HEAD(name) \
> struct atomic_notifier_head name = \
> @@ -91,6 +118,8 @@ extern int blocking_notifier_chain_regis
> struct notifier_block *);
> extern int raw_notifier_chain_register(struct raw_notifier_head *,
> struct notifier_block *);
> +extern int srcu_notifier_chain_register(struct srcu_notifier_head *,
> + struct notifier_block *);
>
> extern int atomic_notifier_chain_unregister(struct atomic_notifier_head *,
> struct notifier_block *);
> @@ -98,6 +127,8 @@ extern int blocking_notifier_chain_unreg
> struct notifier_block *);
> extern int raw_notifier_chain_unregister(struct raw_notifier_head *,
> struct notifier_block *);
> +extern int srcu_notifier_chain_unregister(struct srcu_notifier_head *,
> + struct notifier_block *);
>
> extern int atomic_notifier_call_chain(struct atomic_notifier_head *,
> unsigned long val, void *v);
> @@ -105,6 +136,8 @@ extern int blocking_notifier_call_chain(
> unsigned long val, void *v);
> extern int raw_notifier_call_chain(struct raw_notifier_head *,
> unsigned long val, void *v);
> +extern int srcu_notifier_call_chain(struct srcu_notifier_head *,
> + unsigned long val, void *v);
>
> #define NOTIFY_DONE 0x0000 /* Don't care */
> #define NOTIFY_OK 0x0001 /* Suits me */
> Index: usb-2.6/include/linux/srcu.h
> ===================================================================
> --- usb-2.6.orig/include/linux/srcu.h
> +++ usb-2.6/include/linux/srcu.h
> @@ -24,6 +24,9 @@
> *
> */
>
> +#ifndef _LINUX_SRCU_H
> +#define _LINUX_SRCU_H
> +
> struct srcu_struct_array {
> int c[2];
> };
> @@ -47,3 +50,5 @@ void srcu_read_unlock(struct srcu_struct
> void synchronize_srcu(struct srcu_struct *sp);
> long srcu_batches_completed(struct srcu_struct *sp);
> void cleanup_srcu_struct(struct srcu_struct *sp);
> +
> +#endif
>
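The kernel-doc comment for srcu_notifier_call_chain() quoted above describes the stop rule: a callback whose return value has NOTIFY_STOP_MASK set ends the walk, and that value is what the caller gets back. A minimal userspace sketch of that rule follows. It is not the kernel's notifier_call_chain() and does no locking or SRCU protection. NOTIFY_STOP_MASK (0x8000) and NOTIFY_STOP come from include/linux/notifier.h rather than from the hunks quoted here, and model_call_chain(), count_cb(), stop_cb(), demo_stop() and the calls counter are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Return codes as defined in include/linux/notifier.h. */
#define NOTIFY_DONE		0x0000	/* Don't care */
#define NOTIFY_OK		0x0001	/* Suits me */
#define NOTIFY_STOP_MASK	0x8000	/* Don't call further */
#define NOTIFY_STOP		(NOTIFY_OK | NOTIFY_STOP_MASK)

struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			unsigned long val, void *v);
	struct notifier_block *next;
	int priority;
};

/* Userspace model of the chain walk (assumption: no locking, no SRCU). */
static int model_call_chain(struct notifier_block **nl,
		unsigned long val, void *v)
{
	int ret = NOTIFY_DONE;
	struct notifier_block *nb;

	assert(nl != NULL);
	for (nb = *nl; nb; nb = nb->next) {
		ret = nb->notifier_call(nb, val, v);
		/* A callback that sets the stop bit ends the walk. */
		if ((ret & NOTIFY_STOP_MASK) == NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

static int calls;	/* number of callbacks that ran (illustration only) */

static int count_cb(struct notifier_block *nb, unsigned long val, void *v)
{
	(void)nb; (void)val; (void)v;
	calls++;
	return NOTIFY_OK;
}

static int stop_cb(struct notifier_block *nb, unsigned long val, void *v)
{
	(void)nb; (void)val; (void)v;
	calls++;
	return NOTIFY_STOP;
}

/* Build the chain first -> stopper -> last and walk it once. */
static int demo_stop(void)
{
	struct notifier_block last = { .notifier_call = count_cb };
	struct notifier_block stopper = { .notifier_call = stop_cb,
					  .next = &last };
	struct notifier_block first = { .notifier_call = count_cb,
					.next = &stopper };
	struct notifier_block *head = &first;

	calls = 0;
	return model_call_chain(&head, 0, NULL);
}
```

In this sketch demo_stop() returns the stopper's value and the third callback never runs, which is the behaviour the comment describes for a halted chain.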
* Re: [PATCH] Add SRCU-based notifier chains
2006-07-11 18:18 ` [PATCH] Add " Alan Stern
2006-07-11 18:30 ` Paul E. McKenney
@ 2006-07-12 0:56 ` Chandra Seetharaman
1 sibling, 0 replies; 13+ messages in thread
From: Chandra Seetharaman @ 2006-07-12 0:56 UTC (permalink / raw)
To: Alan Stern
Cc: Andrew Morton, Paul E. McKenney, Matt Helsley, Benjamin LaHaise,
Kernel development list
On Tue, 2006-07-11 at 14:18 -0400, Alan Stern wrote:
> This patch (as751) adds a new type of notifier chain, based on the SRCU
> (Sleepable Read-Copy Update) primitives recently added to the kernel. An
> SRCU notifier chain is much like a blocking notifier chain, in that it
> must be called in process context and its callout routines are allowed to
> sleep. The difference is that the chain's links are protected by the SRCU
> mechanism rather than by an rw-semaphore, so calling the chain has
> extremely low overhead: no memory barriers and no cache-line bouncing.
> On the other hand, unregistering from the chain is expensive and the chain
> head requires special runtime initialization (plus cleanup if it is to be
> deallocated).
>
> SRCU notifiers are appropriate for notifiers that will be called very
> frequently and for which unregistration occurs very seldom. The proposed
> "task notifier" scheme qualifies, as may some of the network notifiers.
>
>
Looks good from the notifier mechanism perspective.
Acked-by: Chandra Seetharaman <sekharan@us.ibm.com>
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
>
> ---
>
> Index: usb-2.6/kernel/sys.c
> ===================================================================
> --- usb-2.6.orig/kernel/sys.c
> +++ usb-2.6/kernel/sys.c
> @@ -151,7 +151,7 @@ static int __kprobes notifier_call_chain
>
> /*
> * Atomic notifier chain routines. Registration and unregistration
> - * use a mutex, and call_chain is synchronized by RCU (no locks).
> + * use a spinlock, and call_chain is synchronized by RCU (no locks).
> */
>
> /**
> @@ -399,6 +399,128 @@ int raw_notifier_call_chain(struct raw_n
>
> EXPORT_SYMBOL_GPL(raw_notifier_call_chain);
>
> +/*
> + * SRCU notifier chain routines. Registration and unregistration
> + * use a mutex, and call_chain is synchronized by SRCU (no locks).
> + */
> +
> +/**
> + * srcu_notifier_chain_register - Add notifier to an SRCU notifier chain
> + * @nh: Pointer to head of the SRCU notifier chain
> + * @n: New entry in notifier chain
> + *
> + * Adds a notifier to an SRCU notifier chain.
> + * Must be called in process context.
> + *
> + * Currently always returns zero.
> + */
> +
> +int srcu_notifier_chain_register(struct srcu_notifier_head *nh,
> +		struct notifier_block *n)
> +{
> +	int ret;
> +
> +	/*
> +	 * This code gets used during boot-up, when task switching is
> +	 * not yet working and interrupts must remain disabled. At
> +	 * such times we must not call mutex_lock().
> +	 */
> +	if (unlikely(system_state == SYSTEM_BOOTING))
> +		return notifier_chain_register(&nh->head, n);
> +
> +	mutex_lock(&nh->mutex);
> +	ret = notifier_chain_register(&nh->head, n);
> +	mutex_unlock(&nh->mutex);
> +	return ret;
> +}
> +
> +EXPORT_SYMBOL_GPL(srcu_notifier_chain_register);
> +
> +/**
> + * srcu_notifier_chain_unregister - Remove notifier from an SRCU notifier chain
> + * @nh: Pointer to head of the SRCU notifier chain
> + * @n: Entry to remove from notifier chain
> + *
> + * Removes a notifier from an SRCU notifier chain.
> + * Must be called from process context.
> + *
> + * Returns zero on success or %-ENOENT on failure.
> + */
> +int srcu_notifier_chain_unregister(struct srcu_notifier_head *nh,
> +		struct notifier_block *n)
> +{
> +	int ret;
> +
> +	/*
> +	 * This code gets used during boot-up, when task switching is
> +	 * not yet working and interrupts must remain disabled. At
> +	 * such times we must not call mutex_lock().
> +	 */
> +	if (unlikely(system_state == SYSTEM_BOOTING))
> +		return notifier_chain_unregister(&nh->head, n);
> +
> +	mutex_lock(&nh->mutex);
> +	ret = notifier_chain_unregister(&nh->head, n);
> +	mutex_unlock(&nh->mutex);
> +	synchronize_srcu(&nh->srcu);
> +	return ret;
> +}
> +
> +EXPORT_SYMBOL_GPL(srcu_notifier_chain_unregister);
> +
> +/**
> + * srcu_notifier_call_chain - Call functions in an SRCU notifier chain
> + * @nh: Pointer to head of the SRCU notifier chain
> + * @val: Value passed unmodified to notifier function
> + * @v: Pointer passed unmodified to notifier function
> + *
> + * Calls each function in a notifier chain in turn. The functions
> + * run in a process context, so they are allowed to block.
> + *
> + * If the return value of the notifier can be and'ed
> + * with %NOTIFY_STOP_MASK then srcu_notifier_call_chain
> + * will return immediately, with the return value of
> + * the notifier function which halted execution.
> + * Otherwise the return value is the return value
> + * of the last notifier function called.
> + */
> +
> +int srcu_notifier_call_chain(struct srcu_notifier_head *nh,
> +		unsigned long val, void *v)
> +{
> +	int ret;
> +	int idx;
> +
> +	idx = srcu_read_lock(&nh->srcu);
> +	ret = notifier_call_chain(&nh->head, val, v);
> +	srcu_read_unlock(&nh->srcu, idx);
> +	return ret;
> +}
> +
> +EXPORT_SYMBOL_GPL(srcu_notifier_call_chain);
> +
> +/**
> + * srcu_init_notifier_head - Initialize an SRCU notifier head
> + * @nh: Pointer to head of the srcu notifier chain
> + *
> + * Unlike other sorts of notifier heads, SRCU notifier heads require
> + * dynamic initialization. Be sure to call this routine before
> + * calling any of the other SRCU notifier routines for this head.
> + *
> + * If an SRCU notifier head is deallocated, it must first be cleaned
> + * up by calling srcu_cleanup_notifier_head(). Otherwise the head's
> + * per-cpu data (used by the SRCU mechanism) will leak.
> + */
> +
> +void srcu_init_notifier_head(struct srcu_notifier_head *nh)
> +{
> +	mutex_init(&nh->mutex);
> +	init_srcu_struct(&nh->srcu);
> +	nh->head = NULL;
> +}
> +
> +EXPORT_SYMBOL_GPL(srcu_init_notifier_head);
> +
> /**
> * register_reboot_notifier - Register function to be called at reboot time
> * @nb: Info about notifier function to be called
> Index: usb-2.6/include/linux/notifier.h
> ===================================================================
> --- usb-2.6.orig/include/linux/notifier.h
> +++ usb-2.6/include/linux/notifier.h
> @@ -12,9 +12,10 @@
> #include <linux/errno.h>
> #include <linux/mutex.h>
> #include <linux/rwsem.h>
> +#include <linux/srcu.h>
>
> /*
> - * Notifier chains are of three types:
> + * Notifier chains are of four types:
> *
> * Atomic notifier chains: Chain callbacks run in interrupt/atomic
> * context. Callouts are not allowed to block.
> @@ -23,13 +24,27 @@
> * Raw notifier chains: There are no restrictions on callbacks,
> * registration, or unregistration. All locking and protection
> * must be provided by the caller.
> + * SRCU notifier chains: A variant of blocking notifier chains, with
> + * the same restrictions.
> *
> * atomic_notifier_chain_register() may be called from an atomic context,
> - * but blocking_notifier_chain_register() must be called from a process
> - * context. Ditto for the corresponding _unregister() routines.
> + * but blocking_notifier_chain_register() and srcu_notifier_chain_register()
> + * must be called from a process context. Ditto for the corresponding
> + * _unregister() routines.
> *
> - * atomic_notifier_chain_unregister() and blocking_notifier_chain_unregister()
> - * _must not_ be called from within the call chain.
> + * atomic_notifier_chain_unregister(), blocking_notifier_chain_unregister(),
> + * and srcu_notifier_chain_unregister() _must not_ be called from within
> + * the call chain.
> + *
> + * SRCU notifier chains are an alternative form of blocking notifier chains.
> + * They use SRCU (Sleepable Read-Copy Update) instead of rw-semaphores for
> + * protection of the chain links. This means there is _very_ low overhead
> + * in srcu_notifier_call_chain(): no cache bounces and no memory barriers.
> + * As compensation, srcu_notifier_chain_unregister() is rather expensive.
> + * SRCU notifier chains should be used when the chain will be called very
> + * often but notifier_blocks will seldom be removed. Also, SRCU notifier
> + * chains are slightly more difficult to use because they require special
> + * runtime initialization.
> */
>
> struct notifier_block {
> @@ -52,6 +67,12 @@ struct raw_notifier_head {
> struct notifier_block *head;
> };
>
> +struct srcu_notifier_head {
> +	struct mutex mutex;
> +	struct srcu_struct srcu;
> +	struct notifier_block *head;
> +};
> +
> #define ATOMIC_INIT_NOTIFIER_HEAD(name) do { \
> spin_lock_init(&(name)->lock); \
> (name)->head = NULL; \
> @@ -64,6 +85,11 @@ struct raw_notifier_head {
> (name)->head = NULL; \
> } while (0)
>
> +/* srcu_notifier_heads must be initialized and cleaned up dynamically */
> +extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
> +#define srcu_cleanup_notifier_head(name) \
> + cleanup_srcu_struct(&(name)->srcu);
> +
> #define ATOMIC_NOTIFIER_INIT(name) { \
> .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
> .head = NULL }
> @@ -72,6 +98,7 @@ struct raw_notifier_head {
> .head = NULL }
> #define RAW_NOTIFIER_INIT(name) { \
> .head = NULL }
> +/* srcu_notifier_heads cannot be initialized statically */
>
> #define ATOMIC_NOTIFIER_HEAD(name) \
> struct atomic_notifier_head name = \
> @@ -91,6 +118,8 @@ extern int blocking_notifier_chain_regis
> struct notifier_block *);
> extern int raw_notifier_chain_register(struct raw_notifier_head *,
> struct notifier_block *);
> +extern int srcu_notifier_chain_register(struct srcu_notifier_head *,
> + struct notifier_block *);
>
> extern int atomic_notifier_chain_unregister(struct atomic_notifier_head *,
> struct notifier_block *);
> @@ -98,6 +127,8 @@ extern int blocking_notifier_chain_unreg
> struct notifier_block *);
> extern int raw_notifier_chain_unregister(struct raw_notifier_head *,
> struct notifier_block *);
> +extern int srcu_notifier_chain_unregister(struct srcu_notifier_head *,
> + struct notifier_block *);
>
> extern int atomic_notifier_call_chain(struct atomic_notifier_head *,
> unsigned long val, void *v);
> @@ -105,6 +136,8 @@ extern int blocking_notifier_call_chain(
> unsigned long val, void *v);
> extern int raw_notifier_call_chain(struct raw_notifier_head *,
> unsigned long val, void *v);
> +extern int srcu_notifier_call_chain(struct srcu_notifier_head *,
> + unsigned long val, void *v);
>
> #define NOTIFY_DONE 0x0000 /* Don't care */
> #define NOTIFY_OK 0x0001 /* Suits me */
> Index: usb-2.6/include/linux/srcu.h
> ===================================================================
> --- usb-2.6.orig/include/linux/srcu.h
> +++ usb-2.6/include/linux/srcu.h
> @@ -24,6 +24,9 @@
> *
> */
>
> +#ifndef _LINUX_SRCU_H
> +#define _LINUX_SRCU_H
> +
> struct srcu_struct_array {
> int c[2];
> };
> @@ -47,3 +50,5 @@ void srcu_read_unlock(struct srcu_struct
> void synchronize_srcu(struct srcu_struct *sp);
> long srcu_batches_completed(struct srcu_struct *sp);
> void cleanup_srcu_struct(struct srcu_struct *sp);
> +
> +#endif
>
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
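The patch description says that calling the chain is cheap while unregistering is expensive. One rough way to see why, assuming the two-counter scheme suggested by struct srcu_struct_array { int c[2]; } in the srcu.h hunk: a reader bumps the counter selected by the current index and hands that index back on unlock, while an updater flips the index and then has to wait until the old counter drains. The following is a single-threaded toy model of that bookkeeping only. It has no per-CPU data, no memory barriers and no sleeping, and every model_* name (as well as the completed field) is hypothetical rather than the kernel's SRCU API.

```c
#include <assert.h>

/* Toy stand-in for struct srcu_struct: one counter pair, no per-CPU data. */
struct model_srcu {
	int completed;	/* flips so far; the low bit selects the active counter */
	int c[2];	/* readers currently inside, per index */
};

/* Enter a read-side section; the result must be passed to the unlock. */
static int model_read_lock(struct model_srcu *sp)
{
	int idx = sp->completed & 0x1;

	sp->c[idx]++;
	return idx;
}

/* Leave a read-side section that was entered under index idx. */
static void model_read_unlock(struct model_srcu *sp, int idx)
{
	assert(sp->c[idx] > 0);
	sp->c[idx]--;
}

/* Updater side: send new readers to the other counter, return the old index. */
static int model_flip(struct model_srcu *sp)
{
	int old = sp->completed & 0x1;

	sp->completed++;
	return old;
}

/* Readers still inside under idx; an updater would wait for zero here. */
static int model_readers(const struct model_srcu *sp, int idx)
{
	return sp->c[idx];
}
```

In this model a reader that entered before model_flip() keeps the old counter non-zero until it unlocks, while readers arriving afterwards use the other counter and do not delay the updater. That is the reason srcu_read_lock() returns an index that must be handed to srcu_read_unlock(), and it gives a feel for why synchronize_srcu() in the unregister path can take a while.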
Thread overview: 13+ messages
[not found] <Pine.LNX.4.44L0.0607061603320.5768-100000@iolanthe.rowland.org>
[not found] ` <1152226204.21787.2093.camel@stark>
2006-07-06 23:39 ` [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking Paul E. McKenney
[not found] ` <Pine.LNX.4.44L0.0607071051430.17135-100000@iolanthe.rowland.org>
2006-07-07 16:33 ` Paul E. McKenney
[not found] ` <Pine.LNX.4.44L0.0607071345270.6793-100000@iolanthe.rowland.org>
2006-07-07 18:59 ` Paul E. McKenney
2006-07-07 19:59 ` Alan Stern
2006-07-07 21:11 ` Matt Helsley
2006-07-07 21:47 ` Paul E. McKenney
2006-07-10 19:11 ` SRCU-based notifier chains Alan Stern
2006-07-11 17:39 ` Paul E. McKenney
2006-07-11 18:03 ` Alan Stern
2006-07-11 18:22 ` Paul E. McKenney
2006-07-11 18:18 ` [PATCH] Add " Alan Stern
2006-07-11 18:30 ` Paul E. McKenney
2006-07-12 0:56 ` Chandra Seetharaman