* Re: [PATCH] small fixes in brlock.h
2003-03-10 2:33 ` Linus Torvalds
@ 2003-03-10 21:54 ` Roman Zippel
2003-03-12 0:13 ` [PATCH] (0/8) replace brlock with RCU Stephen Hemminger
` (8 subsequent siblings)
9 siblings, 0 replies; 31+ messages in thread
From: Roman Zippel @ 2003-03-10 21:54 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Robert Love, Zwane Mwaikambo, Linux Kernel
Hi,
On Sun, 9 Mar 2003, Linus Torvalds wrote:
> No, I don't think there are even "moved to RCU" users. It's just never
> been used very much, since the writes are _so_ expensive (in fact, there
> have been various live-locks on the writer side, the whole brlock thing is
> really questionable).
>
> It's entirely possible that the current user could be replaced by RCU
> and/or seqlocks, and we could get rid of brlocks entirely.
I think now it's a good time to throw the patch below into the ring. :)
This new lock is an extension of the brlock and has an asynchronous and a
synchronous mode. The synchronous mode can be used to replaced the brlock
and does something very similiar than the nonatomic version of the brlock,
but has a better worst case behaviour.
The asynchronous mode can be used to replaced RCU. The read lock is very
cheap and never sleeps, but it's enough to make the quiescent stage a lot
cheaper and more flexible to use. This means instead of calling call_rcu()
it's enough to call brs_read_sync() and immediately after this all reader
references are gone.
I'm meditating over this patch already some time, but I can't find a flaw.
OTOH it looks very promising to make locking in several areas a lot
cheaper. I have another patch laying around, which completely replaces the
dcache_lock with this lock. Additionally I only need an asynchronous read
lock during d_lookup and dput and is compatible with the fast walk code,
so it should be even faster than dcache rcu, but requires of course a few
more changes. This patch is far away from ready, as some bits are still
missing, but I think I figured out the basic locking rules. If anyone
wants to look at it, I'll send him a copy.
bye, Roman
diff -Nur linux-2.5.org/include/linux/brslock.h linux-2.5/include/linux/brslock.h
--- linux-2.5.org/include/linux/brslock.h 1970-01-01 01:00:00.000000000 +0100
+++ linux-2.5/include/linux/brslock.h 2003-03-02 02:59:24.000000000 +0100
@@ -0,0 +1,154 @@
+#ifndef __LINUX_BRSLOCK_H
+#define __LINUX_BRSLOCK_H
+
+/*
+ * 'Big Reader / Sync' read-write spinlocks.
+ *
+ * super-fast read/write locks, with write-side penalty and
+ * a synchronous and asynchronous version.
+ *
+ * (c) 2003, Roman Zippel <zippel@linux-m68k.org>
+ * Released under the terms of the GNU GPL v2.0.
+ *
+ * This lock is based on the brlock by Ingo Molnar and David S. Miller
+ * and rcu techniques.
+ *
+ * The asynchronous read lock only marks the critical regions, but never waits
+ * for any writer, so the writer can only detect active readers, this is
+ * sufficient for asynchronous list updates as used by the rcu mechanism, but
+ * has the advantage that after a call to brs_read_sync the update is finished
+ * and no further callback is required.
+ * The synchronous read lock extends the asynchronous lock by adding a write
+ * test, but since it's in the same cache line as the count, it's very cheap.
+ * The writer can now prevent new readers from entering the critical region
+ * and only needs to wait for other readers to leave it.
+ */
+
+#include <linux/config.h>
+
+#ifdef CONFIG_SMP
+
+#include <linux/cache.h>
+#include <linux/spinlock.h>
+#include <linux/smp.h>
+
+enum brslock_indices {
+ __BRS_END
+};
+
+/*
+ * This causes a link-time bug message if an
+ * invalid index is used:
+ */
+extern void __brs_lock_usage_bug(void);
+#define __brs_lock_idx_check(idx) \
+ if (idx >= __BRS_END) \
+ __brs_lock_usage_bug()
+
+struct brs_lock {
+ unsigned int count;
+ unsigned int wlock;
+};
+
+/*
+ * align last allocated index to the next cacheline:
+ */
+#define __BRS_IDX_MAX \
+ (((sizeof(struct brs_lock)*__BRS_END + SMP_CACHE_BYTES-1) & ~(SMP_CACHE_BYTES-1)) / sizeof(struct brs_lock))
+
+extern struct brs_lock __brslock_array[NR_CPUS][__BRS_IDX_MAX];
+extern spinlock_t __brs_write_locks[__BRS_END];
+
+extern void __brs_read_lock_sync(enum brslock_indices idx);
+extern void __brs_read_sync(enum brslock_indices idx);
+extern void __brs_write_lock_sync(enum brslock_indices idx);
+extern void __brs_write_unlock_sync(enum brslock_indices idx);
+
+static inline void __brs_read_lock_async(enum brslock_indices idx, int id)
+{
+ __brslock_array[id][idx].count++;
+ mb();
+}
+
+static inline void brs_read_lock_async(enum brslock_indices idx)
+{
+ __brs_lock_idx_check(idx);
+ __brs_read_lock_async(idx, get_cpu());
+}
+
+static inline void __brs_read_unlock_async(enum brslock_indices idx, int id)
+{
+ mb();
+ __brslock_array[id][idx].count--;
+}
+
+static inline void brs_read_unlock_async(enum brslock_indices idx)
+{
+ __brs_lock_idx_check(idx);
+ __brs_read_unlock_async(idx, smp_processor_id());
+ put_cpu();
+}
+
+static inline void brs_write_lock_async(enum brslock_indices idx)
+{
+ __brs_lock_idx_check(idx);
+ spin_lock(&__brs_write_locks[idx]);
+}
+
+static inline void brs_write_unlock_async(enum brslock_indices idx)
+{
+ __brs_lock_idx_check(idx);
+ spin_unlock(&__brs_write_locks[idx]);
+}
+
+static inline void brs_read_sync(enum brslock_indices idx)
+{
+ __brs_lock_idx_check(idx);
+ __brs_read_sync(idx);
+}
+
+static inline void brs_read_lock_sync(enum brslock_indices idx)
+{
+ int id;
+restart:
+ id = get_cpu();
+ __brs_lock_idx_check(idx);
+ __brs_read_lock_async(idx, id);
+ if (unlikely(__brslock_array[id][idx].wlock)) {
+ __brs_read_lock_sync(idx);
+ goto restart;
+ }
+}
+
+static inline void brs_read_unlock_sync(enum brslock_indices idx)
+{
+ brs_read_unlock_async(idx);
+}
+
+static inline void brs_write_lock_sync(enum brslock_indices idx)
+{
+ __brs_lock_idx_check(idx);
+ __brs_write_lock_sync(idx);
+}
+
+static inline void brs_write_unlock_sync(enum brslock_indices idx)
+{
+ __brs_lock_idx_check(idx);
+ __brs_write_unlock_sync(idx);
+}
+
+#else
+
+#define brs_read_lock_async(idx) ({ (void)(idx); preempt_disable(); })
+#define brs_read_unlock_async(idx) ({ (void)(idx); preempt_enable(); })
+#define brs_write_lock_async(idx) ({ (void)(idx); preempt_disable(); })
+#define brs_write_unlock_async(idx) ({ (void)(idx); preempt_enable(); })
+#define brs_read_sync(idx) ({ (void)(idx); })
+#define brs_read_lock_sync(idx) ({ (void)(idx); preempt_disable(); })
+#define brs_read_unlock_sync(idx) ({ (void)(idx); preempt_enable(); })
+#define brs_write_lock_sync(idx) ({ (void)(idx); preempt_disable(); })
+#define brs_write_unlock_sync(idx) ({ (void)(idx); preempt_enable(); })
+
+#endif /* CONFIG_SMP */
+
+#endif /* __LINUX_BRSLOCK_H */
diff -Nur linux-2.5.org/lib/Makefile linux-2.5/lib/Makefile
--- linux-2.5.org/lib/Makefile 2003-03-10 20:38:48.000000000 +0100
+++ linux-2.5/lib/Makefile 2003-03-10 20:39:39.000000000 +0100
@@ -8,7 +8,7 @@
L_TARGET := lib.a
-obj-y := errno.o ctype.o string.o vsprintf.o brlock.o cmdline.o \
+obj-y := errno.o ctype.o string.o vsprintf.o brlock.o brslock.o cmdline.o \
bust_spinlocks.o rbtree.o radix-tree.o dump_stack.o \
kobject.o idr.o
diff -Nur linux-2.5.org/lib/brslock.c linux-2.5/lib/brslock.c
--- linux-2.5.org/lib/brslock.c 1970-01-01 01:00:00.000000000 +0100
+++ linux-2.5/lib/brslock.c 2003-03-10 19:59:42.000000000 +0100
@@ -0,0 +1,56 @@
+/*
+ * Copyright (C) 2003 Roman Zippel <zippel@linux-m68k.org>
+ * Released under the terms of the GNU GPL v2.0.
+ */
+
+#include <linux/config.h>
+
+#ifdef CONFIG_SMP
+
+#include <linux/brslock.h>
+
+struct brs_lock __brslock_array[NR_CPUS][__BRS_IDX_MAX];
+spinlock_t __brs_write_locks[__BRS_END];
+
+void __brs_read_lock_sync(enum brslock_indices idx)
+{
+ brs_read_unlock(idx);
+ while (__brslock_array[smp_processor_id()][idx].wlock)
+ cpu_relax();
+}
+
+void __brs_read_sync(enum brslock_indices idx)
+{
+ int mask, i, j;
+
+ mask = cpu_online_map & ~(1 << smp_processor_id());
+ while (mask) {
+ for (i = 0, j = 1; i < NR_CPUS; j <<= 1, i++) {
+ if ((mask & j) && !__brslock_array[i][idx].count)
+ mask &= ~j;
+ }
+ cpu_relax();
+ }
+}
+
+void __brs_write_lock_sync(enum brslock_indices idx)
+{
+ int i;
+
+ spin_lock(&__brs_write_locks[idx]);
+ for (i = 0; i < NR_CPUS; i++)
+ __brslock_array[i][idx].wlock = 1;
+ mb();
+ __brs_read_sync(idx);
+}
+
+void __brs_write_unlock_sync(enum brslock_indices idx)
+{
+ int i;
+
+ for (i = 0; i < NR_CPUS; i++)
+ __brslock_array[i][idx].wlock = 0;
+ spin_unlock(&__brs_write_locks[idx]);
+}
+
+#endif /* CONFIG_SMP */
^ permalink raw reply [flat|nested] 31+ messages in thread* [PATCH] (0/8) replace brlock with RCU
2003-03-10 2:33 ` Linus Torvalds
2003-03-10 21:54 ` Roman Zippel
@ 2003-03-12 0:13 ` Stephen Hemminger
2003-03-12 0:23 ` Linus Torvalds
2003-03-12 0:14 ` [PATCH] (1/8) Eliminate brlock in psnap Stephen Hemminger
` (7 subsequent siblings)
9 siblings, 1 reply; 31+ messages in thread
From: Stephen Hemminger @ 2003-03-12 0:13 UTC (permalink / raw)
To: Linus Torvalds, David Miller
Cc: Robert Love, Zwane Mwaikambo, Linux Kernel Mailing List,
linux-net
> estionable).
>
> It's entirely possible that the current user could be replaced by RCU
> and/or seqlocks, and we could get rid of brlocks entirely.
>
> Linus
The following sequence of patches replaces the remaining use of brlock
with RCU. Most of this is fairly straightforward. The unregister functions
use synchronize_kernel(), perhaps there should be a special version to
indicate sychronizing with network BH.
Tested the normal network path and IPV4.
For the others all the pieces compile. But don't have the test setup to
give VLAN, SNAP, IPV6, and all the netfilter code a real test.
Comments?
^ permalink raw reply [flat|nested] 31+ messages in thread* Re: [PATCH] (0/8) replace brlock with RCU
2003-03-12 0:13 ` [PATCH] (0/8) replace brlock with RCU Stephen Hemminger
@ 2003-03-12 0:23 ` Linus Torvalds
2003-03-12 0:28 ` David S. Miller
0 siblings, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2003-03-12 0:23 UTC (permalink / raw)
To: Stephen Hemminger
Cc: David Miller, Robert Love, Zwane Mwaikambo,
Linux Kernel Mailing List, linux-net
On 11 Mar 2003, Stephen Hemminger wrote:
>
> The following sequence of patches replaces the remaining use of brlock
> with RCU. Most of this is fairly straightforward. The unregister functions
> use synchronize_kernel(), perhaps there should be a special version to
> indicate sychronizing with network BH.
>
> Comments?
I'm not going to take this directly, but if it passes muster with David,
I'm happy. The fewer locking primitives we need, the better, and brlocks
have had enough problems that I wouldn't mind getting rid of them.
Linus
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] (0/8) replace brlock with RCU
2003-03-12 0:23 ` Linus Torvalds
@ 2003-03-12 0:28 ` David S. Miller
2003-03-12 1:36 ` Roman Zippel
0 siblings, 1 reply; 31+ messages in thread
From: David S. Miller @ 2003-03-12 0:28 UTC (permalink / raw)
To: torvalds; +Cc: shemminger, rml, zwane, linux-kernel, linux-net
From: Linus Torvalds <torvalds@transmeta.com>
Date: Tue, 11 Mar 2003 16:23:24 -0800 (PST)
On 11 Mar 2003, Stephen Hemminger wrote:
>
> The following sequence of patches replaces the remaining use of brlock
> with RCU. Most of this is fairly straightforward. The unregister functions
> use synchronize_kernel(), perhaps there should be a special version to
> indicate sychronizing with network BH.
>
> Comments?
I'm not going to take this directly, but if it passes muster with David,
I'm happy. The fewer locking primitives we need, the better, and brlocks
have had enough problems that I wouldn't mind getting rid of them.
I'm fine with it, as long as I get shown how to get the equivalent
atomic sequence using the new primitives. Ie. is there still a way
to go:
stop_all_incoming_packets();
do_something();
resume_all_incoming_packets();
with the new stuff?
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] (0/8) replace brlock with RCU
2003-03-12 0:28 ` David S. Miller
@ 2003-03-12 1:36 ` Roman Zippel
0 siblings, 0 replies; 31+ messages in thread
From: Roman Zippel @ 2003-03-12 1:36 UTC (permalink / raw)
To: David S. Miller; +Cc: torvalds, shemminger, rml, zwane, linux-kernel, linux-net
Hi,
On Tue, 11 Mar 2003, David S. Miller wrote:
> I'm fine with it, as long as I get shown how to get the equivalent
> atomic sequence using the new primitives. Ie. is there still a way
> to go:
>
> stop_all_incoming_packets();
> do_something();
> resume_all_incoming_packets();
>
> with the new stuff?
BTW if anyone is interested in a brlock implementation, which can offer
this property, but can also beat rcu, you might want to look at this
patch http://marc.theaimsgroup.com/?l=linux-kernel&m=104733360501112&w=2
bye, Roman
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH] (1/8) Eliminate brlock in psnap
2003-03-10 2:33 ` Linus Torvalds
2003-03-10 21:54 ` Roman Zippel
2003-03-12 0:13 ` [PATCH] (0/8) replace brlock with RCU Stephen Hemminger
@ 2003-03-12 0:14 ` Stephen Hemminger
2003-03-12 1:03 ` Roman Zippel
2003-03-12 1:13 ` Zwane Mwaikambo
2003-03-12 0:14 ` [PATCH] (2/8) Eliminate brlock for packet_type Stephen Hemminger
` (6 subsequent siblings)
9 siblings, 2 replies; 31+ messages in thread
From: Stephen Hemminger @ 2003-03-12 0:14 UTC (permalink / raw)
To: Linus Torvalds, David Miller; +Cc: Linux Kernel Mailing List, linux-net
The following replaces brlock with RCU in the SNAP module.
diff -urN -X dontdiff linux-2.5.64/net/802/psnap.c linux-2.5-nobrlock/net/802/psnap.c
--- linux-2.5.64/net/802/psnap.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/802/psnap.c 2003-03-10 15:48:53.000000000 -0800
@@ -21,10 +21,10 @@
#include <linux/mm.h>
#include <linux/in.h>
#include <linux/init.h>
-#include <linux/brlock.h>
LIST_HEAD(snap_list);
static struct llc_sap *snap_sap;
+static spinlock_t snap_lock = SPIN_LOCK_UNLOCKED;
/*
* Find a snap client by matching the 5 bytes.
@@ -34,17 +34,15 @@
struct list_head *entry;
struct datalink_proto *proto = NULL, *p;
- if (list_empty(&snap_list))
- goto out;
-
- list_for_each(entry, &snap_list) {
+ rcu_read_lock();
+ list_for_each_rcu(entry, &snap_list) {
p = list_entry(entry, struct datalink_proto, node);
if (!memcmp(p->type, desc, 5)) {
proto = p;
break;
}
}
-out:
+ rcu_read_unlock();
return proto;
}
@@ -124,8 +122,7 @@
{
struct datalink_proto *proto = NULL;
- br_write_lock_bh(BR_NETPROTO_LOCK);
-
+ spin_lock_bh(&snap_lock);
if (find_snap_client(desc))
goto out;
@@ -135,10 +132,10 @@
proto->rcvfunc = rcvfunc;
proto->header_length = 5 + 3; /* snap + 802.2 */
proto->request = snap_request;
- list_add(&proto->node, &snap_list);
+ list_add_rcu(&proto->node, &snap_list);
}
out:
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ spin_unlock_bh(&snap_lock);
return proto;
}
@@ -147,12 +144,13 @@
*/
void unregister_snap_client(struct datalink_proto *proto)
{
- br_write_lock_bh(BR_NETPROTO_LOCK);
+ static RCU_HEAD(snap_rcu);
- list_del(&proto->node);
- kfree(proto);
+ spin_lock_bh(&snap_lock);
+ list_del_rcu(&proto->node);
+ spin_unlock_bh(&snap_lock);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ call_rcu(&snap_rcu, (void (*)(void *)) kfree, proto);
}
MODULE_LICENSE("GPL");
^ permalink raw reply [flat|nested] 31+ messages in thread* Re: [PATCH] (1/8) Eliminate brlock in psnap
2003-03-12 0:14 ` [PATCH] (1/8) Eliminate brlock in psnap Stephen Hemminger
@ 2003-03-12 1:03 ` Roman Zippel
2003-03-12 1:13 ` Zwane Mwaikambo
1 sibling, 0 replies; 31+ messages in thread
From: Roman Zippel @ 2003-03-12 1:03 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Linus Torvalds, David Miller, Linux Kernel Mailing List,
linux-net
Hi,
On 11 Mar 2003, Stephen Hemminger wrote:
> void unregister_snap_client(struct datalink_proto *proto)
> {
> - br_write_lock_bh(BR_NETPROTO_LOCK);
> + static RCU_HEAD(snap_rcu);
>
> - list_del(&proto->node);
> - kfree(proto);
> + spin_lock_bh(&snap_lock);
> + list_del_rcu(&proto->node);
> + spin_unlock_bh(&snap_lock);
>
> - br_write_unlock_bh(BR_NETPROTO_LOCK);
> + call_rcu(&snap_rcu, (void (*)(void *)) kfree, proto);
> }
Is this really correct? What happens with snap_rcu, if
unregister_snap_client is called again, before the call_rcu callback
finished?
bye, Roman
^ permalink raw reply [flat|nested] 31+ messages in thread* Re: [PATCH] (1/8) Eliminate brlock in psnap
2003-03-12 0:14 ` [PATCH] (1/8) Eliminate brlock in psnap Stephen Hemminger
2003-03-12 1:03 ` Roman Zippel
@ 2003-03-12 1:13 ` Zwane Mwaikambo
1 sibling, 0 replies; 31+ messages in thread
From: Zwane Mwaikambo @ 2003-03-12 1:13 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Linus Torvalds, David Miller, Linux Kernel Mailing List,
linux-net
On Tue, 11 Mar 2003, Stephen Hemminger wrote:
> void unregister_snap_client(struct datalink_proto *proto)
> {
> - br_write_lock_bh(BR_NETPROTO_LOCK);
> + static RCU_HEAD(snap_rcu);
>
> - list_del(&proto->node);
> - kfree(proto);
> + spin_lock_bh(&snap_lock);
> + list_del_rcu(&proto->node);
> + spin_unlock_bh(&snap_lock);
>
> - br_write_unlock_bh(BR_NETPROTO_LOCK);
> + call_rcu(&snap_rcu, (void (*)(void *)) kfree, proto);
> }
Do we need the spin_lock_bh around the list_del_rcu? But also How
about. This way we don't change the previous characteristic of block till
done unregistering
struct datalink_proto {
...
struct completion registration;
};
void __unregister_snap_client(void *__proto)
{
struct datalink_proto *proto = __proto;
complete(&proto->registration);
}
unregister_snap_client(struct datalink_proto *proto)
{
list_del_rcu(&proto->node);
call_rcu(&snap_rcu, __unregister_snap_client, proto);
wait_for_completion(&proto->registration);
kfree(proto);
}
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH] (2/8) Eliminate brlock for packet_type
2003-03-10 2:33 ` Linus Torvalds
` (2 preceding siblings ...)
2003-03-12 0:14 ` [PATCH] (1/8) Eliminate brlock in psnap Stephen Hemminger
@ 2003-03-12 0:14 ` Stephen Hemminger
2003-03-12 0:20 ` David S. Miller
2003-03-12 0:14 ` [PATCH] (3/8) Eliminate brlock from vlan Stephen Hemminger
` (5 subsequent siblings)
9 siblings, 1 reply; 31+ messages in thread
From: Stephen Hemminger @ 2003-03-12 0:14 UTC (permalink / raw)
To: Linus Torvalds, David Miller; +Cc: linux-net, Linux Kernel Mailing List
Replace linked list for packet_type with brlock with list macros and RCU.
diff -urN -X dontdiff linux-2.5.64/net/8021q/vlan.c linux-2.5-nobrlock/net/8021q/vlan.c
--- linux-2.5.64/net/8021q/vlan.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/8021q/vlan.c 2003-03-11 14:31:28.000000000 -0800
@@ -29,7 +29,6 @@
#include <net/p8022.h>
#include <net/arp.h>
#include <linux/rtnetlink.h>
-#include <linux/brlock.h>
#include <linux/notifier.h>
#include <linux/if_vlan.h>
@@ -68,7 +67,6 @@
.dev =NULL,
.func = vlan_skb_recv, /* VLAN receive method */
.data = (void *)(-1), /* Set here '(void *)1' when this code can SHARE SKBs */
- .next = NULL
};
/* End of global variables definitions. */
@@ -231,9 +229,10 @@
real_dev->vlan_rx_kill_vid(real_dev, vlan_id);
}
- br_write_lock(BR_NETPROTO_LOCK);
grp->vlan_devices[vlan_id] = NULL;
- br_write_unlock(BR_NETPROTO_LOCK);
+
+ /* wait for RCU in network receive */
+ synchronize_kernel();
/* Caller unregisters (and if necessary, puts)
diff -urN -X dontdiff linux-2.5.64/net/8021q/vlan_dev.c linux-2.5-nobrlock/net/8021q/vlan_dev.c
--- linux-2.5.64/net/8021q/vlan_dev.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/8021q/vlan_dev.c 2003-03-10 15:49:52.000000000 -0800
@@ -31,7 +31,6 @@
#include <net/datalink.h>
#include <net/p8022.h>
#include <net/arp.h>
-#include <linux/brlock.h>
#include "vlan.h"
#include "vlanproc.h"
^ permalink raw reply [flat|nested] 31+ messages in thread* Re: [PATCH] (2/8) Eliminate brlock for packet_type
2003-03-12 0:14 ` [PATCH] (2/8) Eliminate brlock for packet_type Stephen Hemminger
@ 2003-03-12 0:20 ` David S. Miller
2003-03-12 0:25 ` Stephen Hemminger
0 siblings, 1 reply; 31+ messages in thread
From: David S. Miller @ 2003-03-12 0:20 UTC (permalink / raw)
To: shemminger; +Cc: torvalds, linux-net, linux-kernel
From: Stephen Hemminger <shemminger@osdl.org>
Date: 11 Mar 2003 16:14:40 -0800
Replace linked list for packet_type with brlock with list macros and RCU.
Then why is a VLAN patch attached?
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] (2/8) Eliminate brlock for packet_type
2003-03-12 0:20 ` David S. Miller
@ 2003-03-12 0:25 ` Stephen Hemminger
0 siblings, 0 replies; 31+ messages in thread
From: Stephen Hemminger @ 2003-03-12 0:25 UTC (permalink / raw)
To: David S. Miller; +Cc: Linus Torvalds, linux-net, Linux Kernel Mailing List
On Tue, 2003-03-11 at 16:20, David S. Miller wrote:
> From: Stephen Hemminger <shemminger@osdl.org>
> Date: 11 Mar 2003 16:14:40 -0800
>
> Replace linked list for packet_type with brlock with list macros and RCU.
>
> Then why is a VLAN patch attached?
Little pieces breed confusion...this is the right piece.
diff -urN -X dontdiff linux-2.5.64/include/linux/netdevice.h linux-2.5-nobrlock/include/linux/netdevice.h
--- linux-2.5.64/include/linux/netdevice.h 2003-03-11 09:08:00.000000000 -0800
+++ linux-2.5-nobrlock/include/linux/netdevice.h 2003-03-11 14:15:17.000000000 -0800
@@ -452,7 +452,7 @@
int (*func) (struct sk_buff *, struct net_device *,
struct packet_type *);
void *data; /* Private to the packet type */
- struct packet_type *next;
+ struct list_head packet_list;
};
diff -urN -X dontdiff linux-2.5.64/net/core/dev.c linux-2.5-nobrlock/net/core/dev.c
--- linux-2.5.64/net/core/dev.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/core/dev.c 2003-03-11 14:37:36.000000000 -0800
@@ -90,7 +90,6 @@
#include <linux/etherdevice.h>
#include <linux/notifier.h>
#include <linux/skbuff.h>
-#include <linux/brlock.h>
#include <net/sock.h>
#include <linux/rtnetlink.h>
#include <linux/proc_fs.h>
@@ -170,8 +169,11 @@
* 86DD IPv6
*/
-static struct packet_type *ptype_base[16]; /* 16 way hashed list */
-static struct packet_type *ptype_all; /* Taps */
+static spinlock_t ptype_lock = SPIN_LOCK_UNLOCKED;
+static struct list_head ptype_base[16]; /* 16 way hashed list */
+static struct list_head ptype_all; /* Taps */
+
+static spinlock_t master_lock = SPIN_LOCK_UNLOCKED;
#ifdef OFFLINE_SAMPLE
static void sample_queue(unsigned long dummy);
@@ -203,7 +205,6 @@
static struct subsystem net_subsys;
-
/*******************************************************************************
Protocol management and registration routines
@@ -245,8 +246,7 @@
{
int hash;
- br_write_lock_bh(BR_NETPROTO_LOCK);
-
+ spin_lock_bh(&ptype_lock);
#ifdef CONFIG_NET_FASTROUTE
/* Hack to detect packet socket */
if (pt->data && (long)(pt->data) != 1) {
@@ -256,14 +256,12 @@
#endif
if (pt->type == htons(ETH_P_ALL)) {
netdev_nit++;
- pt->next = ptype_all;
- ptype_all = pt;
+ list_add_rcu(&pt->packet_list, &ptype_all);
} else {
hash = ntohs(pt->type) & 15;
- pt->next = ptype_base[hash];
- ptype_base[hash] = pt;
+ list_add_rcu(&pt->packet_list, &ptype_base[hash]);
}
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ spin_unlock_bh(&ptype_lock);
}
extern void linkwatch_run_queue(void);
@@ -279,29 +277,30 @@
*/
void dev_remove_pack(struct packet_type *pt)
{
- struct packet_type **pt1;
-
- br_write_lock_bh(BR_NETPROTO_LOCK);
+ struct list_head *head, *pelem;
- if (pt->type == htons(ETH_P_ALL)) {
- netdev_nit--;
- pt1 = &ptype_all;
- } else
- pt1 = &ptype_base[ntohs(pt->type) & 15];
-
- for (; *pt1; pt1 = &((*pt1)->next)) {
- if (pt == *pt1) {
- *pt1 = pt->next;
+ if (pt->type == htons(ETH_P_ALL))
+ head = &ptype_all;
+ else
+ head = &ptype_base[ntohs(pt->type) & 15];
+
+ spin_lock_bh(&ptype_lock);
+ list_for_each(pelem, head) {
+ if (list_entry(pelem, struct packet_type, packet_list) == pt) {
+ list_del(pelem);
#ifdef CONFIG_NET_FASTROUTE
if (pt->data)
netdev_fastroute_obstacles--;
#endif
+ if (pt->type == htons(ETH_P_ALL))
+ netdev_nit--;
goto out;
}
}
printk(KERN_WARNING "dev_remove_pack: %p not found.\n", pt);
-out:
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+
+ out:
+ spin_unlock_bh(&ptype_lock);
}
/******************************************************************************
@@ -896,11 +895,13 @@
void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
{
- struct packet_type *ptype;
+ struct list_head *plist;
do_gettimeofday(&skb->stamp);
- br_read_lock(BR_NETPROTO_LOCK);
- for (ptype = ptype_all; ptype; ptype = ptype->next) {
+ rcu_read_lock();
+ list_for_each_rcu(plist, &ptype_all) {
+ struct packet_type *ptype
+ = list_entry(plist, struct packet_type, packet_list);
/* Never send packets back to the socket
* they originated from - MvS (miquels@drinkel.ow.org)
*/
@@ -930,7 +931,7 @@
ptype->func(skb2, skb->dev, ptype);
}
}
- br_read_unlock(BR_NETPROTO_LOCK);
+ rcu_read_unlock();
}
/* Calculate csum in the case, when packet is misrouted.
@@ -1423,6 +1424,7 @@
int netif_receive_skb(struct sk_buff *skb)
{
+ struct list_head *pcur;
struct packet_type *ptype, *pt_prev;
int ret = NET_RX_DROP;
unsigned short type = skb->protocol;
@@ -1443,8 +1445,10 @@
skb->h.raw = skb->nh.raw = skb->data;
+ rcu_read_lock();
pt_prev = NULL;
- for (ptype = ptype_all; ptype; ptype = ptype->next) {
+ list_for_each_rcu(pcur, &ptype_all) {
+ ptype = list_entry(pcur, struct packet_type, packet_list);
if (!ptype->dev || ptype->dev == skb->dev) {
if (pt_prev) {
if (!pt_prev->data) {
@@ -1476,7 +1480,9 @@
}
#endif
- for (ptype = ptype_base[ntohs(type) & 15]; ptype; ptype = ptype->next) {
+ list_for_each_rcu(pcur, &ptype_base[ntohs(type) & 15]) {
+ ptype = list_entry(pcur, struct packet_type, packet_list);
+
if (ptype->type == type &&
(!ptype->dev || ptype->dev == skb->dev)) {
if (pt_prev) {
@@ -1506,6 +1512,7 @@
*/
ret = NET_RX_DROP;
}
+ rcu_read_unlock();
return ret;
}
@@ -1580,7 +1587,7 @@
unsigned long start_time = jiffies;
int budget = netdev_max_backlog;
- br_read_lock(BR_NETPROTO_LOCK);
+ preempt_disable();
local_irq_disable();
while (!list_empty(&queue->poll_list)) {
@@ -1609,7 +1616,7 @@
}
out:
local_irq_enable();
- br_read_unlock(BR_NETPROTO_LOCK);
+ preempt_enable();
return;
softnet_break:
@@ -1925,6 +1932,10 @@
#define dev_proc_init() 0
#endif /* CONFIG_PROC_FS */
+static RCU_HEAD(netdev_master_rcu);
+static void unset_old_master(void *arg) {
+ dev_put((struct net_device *) arg);
+}
/**
* netdev_set_master - set up master/slave pair
@@ -1943,18 +1954,20 @@
ASSERT_RTNL();
+ spin_lock_bh(&master_lock);
if (master) {
- if (old)
+ if (old) {
+ spin_unlock_bh(&master_lock);
return -EBUSY;
+ }
dev_hold(master);
}
- br_write_lock_bh(BR_NETPROTO_LOCK);
slave->master = master;
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ spin_unlock_bh(&master_lock);
- if (old)
- dev_put(old);
+ if (old)
+ call_rcu(&netdev_master_rcu, unset_old_master, old);
if (master)
slave->flags |= IFF_SLAVE;
@@ -1962,6 +1975,7 @@
slave->flags &= ~IFF_SLAVE;
rtmsg_ifinfo(RTM_NEWLINK, slave, IFF_SLAVE);
+
return 0;
}
@@ -2646,10 +2660,7 @@
return -ENODEV;
}
- /* Synchronize to net_rx_action. */
- br_write_lock_bh(BR_NETPROTO_LOCK);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
-
+ synchronize_kernel();
#ifdef CONFIG_NET_FASTROUTE
dev_clear_fastroute(dev);
@@ -2755,7 +2766,6 @@
return 0;
}
-
/*
* Initialize the DEV module. At boot time this walks the device list and
* unhooks any devices that fail to initialise (normally hardware not
@@ -2786,6 +2796,11 @@
if (dev_proc_init())
goto out;
+ /* Initialize packet type chains */
+ INIT_LIST_HEAD(&ptype_all);
+ for (i = 0; i < ARRAY_SIZE(ptype_base); i++)
+ INIT_LIST_HEAD(&ptype_base[i]);
+
subsystem_register(&net_subsys);
#ifdef CONFIG_NET_DIVERT
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH] (3/8) Eliminate brlock from vlan
2003-03-10 2:33 ` Linus Torvalds
` (3 preceding siblings ...)
2003-03-12 0:14 ` [PATCH] (2/8) Eliminate brlock for packet_type Stephen Hemminger
@ 2003-03-12 0:14 ` Stephen Hemminger
2003-03-12 0:14 ` [PATCH] (4/8) Eliminate brlock in net/bridge Stephen Hemminger
` (4 subsequent siblings)
9 siblings, 0 replies; 31+ messages in thread
From: Stephen Hemminger @ 2003-03-12 0:14 UTC (permalink / raw)
To: Linus Torvalds, David Miller; +Cc: Linux Kernel Mailing List, linux-net
Eliminate brlock from vlan
diff -urN -X dontdiff linux-2.5.64/net/8021q/vlan.c linux-2.5-nobrlock/net/8021q/vlan.c
--- linux-2.5.64/net/8021q/vlan.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/8021q/vlan.c 2003-03-11 14:31:28.000000000 -0800
@@ -29,7 +29,6 @@
#include <net/p8022.h>
#include <net/arp.h>
#include <linux/rtnetlink.h>
-#include <linux/brlock.h>
#include <linux/notifier.h>
#include <linux/if_vlan.h>
@@ -68,7 +67,6 @@
.dev =NULL,
.func = vlan_skb_recv, /* VLAN receive method */
.data = (void *)(-1), /* Set here '(void *)1' when this code can SHARE SKBs */
- .next = NULL
};
/* End of global variables definitions. */
@@ -231,9 +229,10 @@
real_dev->vlan_rx_kill_vid(real_dev, vlan_id);
}
- br_write_lock(BR_NETPROTO_LOCK);
grp->vlan_devices[vlan_id] = NULL;
- br_write_unlock(BR_NETPROTO_LOCK);
+
+ /* wait for RCU in network receive */
+ synchronize_kernel();
/* Caller unregisters (and if necessary, puts)
diff -urN -X dontdiff linux-2.5.64/net/8021q/vlan_dev.c linux-2.5-nobrlock/net/8021q/vlan_dev.c
--- linux-2.5.64/net/8021q/vlan_dev.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/8021q/vlan_dev.c 2003-03-10 15:49:52.000000000 -0800
@@ -31,7 +31,6 @@
#include <net/datalink.h>
#include <net/p8022.h>
#include <net/arp.h>
-#include <linux/brlock.h>
#include "vlan.h"
#include "vlanproc.h"
^ permalink raw reply [flat|nested] 31+ messages in thread* [PATCH] (4/8) Eliminate brlock in net/bridge
2003-03-10 2:33 ` Linus Torvalds
` (4 preceding siblings ...)
2003-03-12 0:14 ` [PATCH] (3/8) Eliminate brlock from vlan Stephen Hemminger
@ 2003-03-12 0:14 ` Stephen Hemminger
2003-03-12 0:14 ` [PATCH] (5/8) Eliminate brlock from netfilter Stephen Hemminger
` (3 subsequent siblings)
9 siblings, 0 replies; 31+ messages in thread
From: Stephen Hemminger @ 2003-03-12 0:14 UTC (permalink / raw)
To: Linus Torvalds, David Miller; +Cc: Linux Kernel Mailing List, linux-net
Eliminate brlock in bridge code.
diff -urN -X dontdiff linux-2.5.64/net/bridge/br.c linux-2.5-nobrlock/net/bridge/br.c
--- linux-2.5.64/net/bridge/br.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/bridge/br.c 2003-03-10 15:49:15.000000000 -0800
@@ -21,7 +21,6 @@
#include <linux/etherdevice.h>
#include <linux/init.h>
#include <linux/if_bridge.h>
-#include <linux/brlock.h>
#include <asm/uaccess.h>
#include "br_private.h"
@@ -73,14 +72,15 @@
unregister_netdevice_notifier(&br_device_notifier);
br_call_ioctl_atomic(__br_clear_ioctl_hook);
- br_write_lock_bh(BR_NETPROTO_LOCK);
br_handle_frame_hook = NULL;
- br_write_unlock_bh(BR_NETPROTO_LOCK);
#if defined(CONFIG_ATM_LANE) || defined(CONFIG_ATM_LANE_MODULE)
br_fdb_get_hook = NULL;
br_fdb_put_hook = NULL;
#endif
+
+ /* netif_receive_skb uses RCU */
+ synchronize_kernel();
}
EXPORT_SYMBOL(br_should_route_hook);
diff -urN -X dontdiff linux-2.5.64/net/bridge/br_if.c linux-2.5-nobrlock/net/bridge/br_if.c
--- linux-2.5.64/net/bridge/br_if.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/bridge/br_if.c 2003-03-10 15:50:54.000000000 -0800
@@ -18,7 +18,6 @@
#include <linux/if_bridge.h>
#include <linux/inetdevice.h>
#include <linux/rtnetlink.h>
-#include <linux/brlock.h>
#include <asm/uaccess.h>
#include "br_private.h"
@@ -87,12 +86,11 @@
static void del_ifs(struct net_bridge *br)
{
- br_write_lock_bh(BR_NETPROTO_LOCK);
write_lock(&br->lock);
while (br->port_list != NULL)
__br_del_if(br, br->port_list->dev);
write_unlock(&br->lock);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ synchronize_kernel();
}
static struct net_bridge *new_nb(char *name)
@@ -257,13 +255,12 @@
{
int retval;
- br_write_lock_bh(BR_NETPROTO_LOCK);
write_lock(&br->lock);
retval = __br_del_if(br, dev);
br_stp_recalculate_bridge_id(br);
write_unlock(&br->lock);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ synchronize_kernel();
return retval;
}
diff -urN -X dontdiff linux-2.5.64/net/bridge/netfilter/ebtable_broute.c linux-2.5-nobrlock/net/bridge/netfilter/ebtable_broute.c
--- linux-2.5.64/net/bridge/netfilter/ebtable_broute.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/bridge/netfilter/ebtable_broute.c 2003-03-11 15:04:04.000000000 -0800
@@ -14,7 +14,6 @@
#include <linux/netfilter_bridge/ebtables.h>
#include <linux/module.h>
#include <linux/if_bridge.h>
-#include <linux/brlock.h>
// EBT_ACCEPT means the frame will be bridged
// EBT_DROP means the frame will be routed
@@ -68,18 +67,17 @@
ret = ebt_register_table(&broute_table);
if (ret < 0)
return ret;
- br_write_lock_bh(BR_NETPROTO_LOCK);
// see br_input.c
br_should_route_hook = ebt_broute;
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ synchronize_kernel();
+
return ret;
}
static void __exit fini(void)
{
- br_write_lock_bh(BR_NETPROTO_LOCK);
br_should_route_hook = NULL;
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ synchronize_kernel();
ebt_unregister_table(&broute_table);
}
^ permalink raw reply [flat|nested] 31+ messages in thread* [PATCH] (5/8) Eliminate brlock from netfilter
2003-03-10 2:33 ` Linus Torvalds
` (5 preceding siblings ...)
2003-03-12 0:14 ` [PATCH] (4/8) Eliminate brlock in net/bridge Stephen Hemminger
@ 2003-03-12 0:14 ` Stephen Hemminger
2003-03-12 1:25 ` Zwane Mwaikambo
2003-03-12 0:15 ` [PATCH] (6/8) Eliminate brlock from ipv4 Stephen Hemminger
` (2 subsequent siblings)
9 siblings, 1 reply; 31+ messages in thread
From: Stephen Hemminger @ 2003-03-12 0:14 UTC (permalink / raw)
To: Linus Torvalds, David Miller; +Cc: Linux Kernel Mailing List, linux-net
Replace brlock in netfilter code with RCU.
diff -urN -X dontdiff linux-2.5.64/net/core/netfilter.c linux-2.5-nobrlock/net/core/netfilter.c
--- linux-2.5.64/net/core/netfilter.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/core/netfilter.c 2003-03-11 14:39:38.000000000 -0800
@@ -19,7 +19,6 @@
#include <linux/interrupt.h>
#include <linux/if.h>
#include <linux/netdevice.h>
-#include <linux/brlock.h>
#include <linux/inetdevice.h>
#include <net/sock.h>
#include <net/route.h>
@@ -46,6 +45,7 @@
struct list_head nf_hooks[NPROTO][NF_MAX_HOOKS];
static LIST_HEAD(nf_sockopts);
+static spinlock_t nf_lock = SPIN_LOCK_UNLOCKED;
/*
* A queue handler may be registered for each protocol. Each is protected by
@@ -56,28 +56,31 @@
nf_queue_outfn_t outfn;
void *data;
} queue_handler[NPROTO];
+static spinlock_t queue_lock = SPIN_LOCK_UNLOCKED;
int nf_register_hook(struct nf_hook_ops *reg)
{
struct list_head *i;
- br_write_lock_bh(BR_NETPROTO_LOCK);
+ spin_lock_bh(&nf_lock);
for (i = nf_hooks[reg->pf][reg->hooknum].next;
i != &nf_hooks[reg->pf][reg->hooknum];
i = i->next) {
if (reg->priority < ((struct nf_hook_ops *)i)->priority)
break;
}
- list_add(®->list, i->prev);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ list_add_rcu(®->list, i->prev);
+ spin_unlock_bh(&nf_lock);
return 0;
}
void nf_unregister_hook(struct nf_hook_ops *reg)
{
- br_write_lock_bh(BR_NETPROTO_LOCK);
+ spin_lock_bh(&nf_lock);
list_del(®->list);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ spin_unlock_bh(&nf_lock);
+
+ synchronize_kernel();
}
/* Do exclusive ranges overlap? */
@@ -344,7 +347,9 @@
int (*okfn)(struct sk_buff *),
int hook_thresh)
{
- for (*i = (*i)->next; *i != head; *i = (*i)->next) {
+
+ for (*i = (*i)->next; *i != head; *i = (*i)->next,
+ ({ read_barrier_depends(); 0;})) {
struct nf_hook_ops *elem = (struct nf_hook_ops *)*i;
if (hook_thresh > elem->priority)
@@ -381,15 +386,16 @@
{
int ret;
- br_write_lock_bh(BR_NETPROTO_LOCK);
+ spin_lock_bh(&queue_lock);
if (queue_handler[pf].outfn)
ret = -EBUSY;
else {
- queue_handler[pf].outfn = outfn;
queue_handler[pf].data = data;
+ wmb();
+ queue_handler[pf].outfn = outfn;
ret = 0;
}
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ spin_unlock_bh(&queue_lock);
return ret;
}
@@ -397,10 +403,14 @@
/* The caller must flush their queue before this */
int nf_unregister_queue_handler(int pf)
{
- br_write_lock_bh(BR_NETPROTO_LOCK);
+ spin_lock_bh(&queue_lock);
queue_handler[pf].outfn = NULL;
+ wmb();
queue_handler[pf].data = NULL;
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ spin_unlock_bh(&queue_lock);
+
+ synchronize_kernel();
+
return 0;
}
@@ -422,9 +432,10 @@
struct net_device *physoutdev = NULL;
#endif
+ rcu_read_lock();
if (!queue_handler[pf].outfn) {
kfree_skb(skb);
- return;
+ goto out;
}
info = kmalloc(sizeof(*info), GFP_ATOMIC);
@@ -433,7 +444,7 @@
printk(KERN_ERR "OOM queueing packet %p\n",
skb);
kfree_skb(skb);
- return;
+ goto out;
}
*info = (struct nf_info) {
@@ -463,8 +474,9 @@
#endif
kfree(info);
kfree_skb(skb);
- return;
}
+ out:
+ rcu_read_unlock();
}
int nf_hook_slow(int pf, unsigned int hook, struct sk_buff *skb,
@@ -491,7 +503,7 @@
}
/* We may already have this, but read-locks nest anyway */
- br_read_lock_bh(BR_NETPROTO_LOCK);
+ rcu_read_lock();
#ifdef CONFIG_NETFILTER_DEBUG
if (skb->nf_debug & (1 << hook)) {
@@ -520,7 +532,7 @@
break;
}
- br_read_unlock_bh(BR_NETPROTO_LOCK);
+ rcu_read_unlock();
return ret;
}
@@ -530,8 +542,7 @@
struct list_head *elem = &info->elem->list;
struct list_head *i;
- /* We don't have BR_NETPROTO_LOCK here */
- br_read_lock_bh(BR_NETPROTO_LOCK);
+ rcu_read_lock();
for (i = nf_hooks[info->pf][info->hook].next; i != elem; i = i->next) {
if (i == &nf_hooks[info->pf][info->hook]) {
/* The module which sent it to userspace is gone. */
@@ -569,7 +580,7 @@
kfree_skb(skb);
break;
}
- br_read_unlock_bh(BR_NETPROTO_LOCK);
+ rcu_read_unlock();
/* Release those devices we held, or Alexey will kill me. */
if (info->indev) dev_put(info->indev);
^ permalink raw reply [flat|nested] 31+ messages in thread* [PATCH] (6/8) Eliminate brlock from ipv4
2003-03-10 2:33 ` Linus Torvalds
` (6 preceding siblings ...)
2003-03-12 0:14 ` [PATCH] (5/8) Eliminate brlock from netfilter Stephen Hemminger
@ 2003-03-12 0:15 ` Stephen Hemminger
2003-03-12 0:15 ` [PATCH] (7/8) Eliminate brlock from IPV6 Stephen Hemminger
2003-03-12 0:15 ` [PATCH] (8/8) Kill brlock Stephen Hemminger
9 siblings, 0 replies; 31+ messages in thread
From: Stephen Hemminger @ 2003-03-12 0:15 UTC (permalink / raw)
To: Linus Torvalds, David Miller; +Cc: Linux Kernel Mailing List, linux-net
Replace brlock in IPV4 with RCU
diff -urN -X dontdiff linux-2.5.64/net/ipv4/af_inet.c linux-2.5-nobrlock/net/ipv4/af_inet.c
--- linux-2.5.64/net/ipv4/af_inet.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv4/af_inet.c 2003-03-10 16:03:13.000000000 -0800
@@ -94,7 +94,6 @@
#include <linux/inet.h>
#include <linux/igmp.h>
#include <linux/netdevice.h>
-#include <linux/brlock.h>
#include <net/ip.h>
#include <net/protocol.h>
#include <net/arp.h>
@@ -130,6 +129,7 @@
* build a new socket.
*/
struct list_head inetsw[SOCK_MAX];
+static spinlock_t inetsw_lock = SPIN_LOCK_UNLOCKED;
/* New destruction routine */
@@ -337,8 +337,8 @@
/* Look for the requested type/protocol pair. */
answer = NULL;
- br_read_lock_bh(BR_NETPROTO_LOCK);
- list_for_each(p, &inetsw[sock->type]) {
+ rcu_read_lock();
+ list_for_each_rcu(p, &inetsw[sock->type]) {
answer = list_entry(p, struct inet_protosw, list);
/* Check the non-wild match. */
@@ -356,7 +356,7 @@
}
answer = NULL;
}
- br_read_unlock_bh(BR_NETPROTO_LOCK);
+ rcu_read_unlock();
err = -ESOCKTNOSUPPORT;
if (!answer)
@@ -978,7 +978,7 @@
int protocol = p->protocol;
struct list_head *last_perm;
- br_write_lock_bh(BR_NETPROTO_LOCK);
+ spin_lock_bh(&inetsw_lock);
if (p->type > SOCK_MAX)
goto out_illegal;
@@ -1007,9 +1007,9 @@
* non-permanent entry. This means that when we remove this entry, the
* system automatically returns to the old behavior.
*/
- list_add(&p->list, last_perm);
+ list_add_rcu(&p->list, last_perm);
out:
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ spin_unlock_bh(&inetsw_lock);
return;
out_permanent:
@@ -1031,9 +1031,11 @@
"Attempt to unregister permanent protocol %d.\n",
p->protocol);
} else {
- br_write_lock_bh(BR_NETPROTO_LOCK);
+ spin_lock_bh(&inetsw_lock);
list_del(&p->list);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ spin_unlock_bh(&inetsw_lock);
+
+ synchronize_kernel();
}
}
diff -urN -X dontdiff linux-2.5.64/net/ipv4/ip_output.c linux-2.5-nobrlock/net/ipv4/ip_output.c
--- linux-2.5.64/net/ipv4/ip_output.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv4/ip_output.c 2003-03-10 13:33:07.000000000 -0800
@@ -1261,11 +1261,10 @@
static struct packet_type ip_packet_type =
{
- __constant_htons(ETH_P_IP),
- NULL, /* All devices */
- ip_rcv,
- (void*)1,
- NULL,
+ .type = __constant_htons(ETH_P_IP),
+ .dev = NULL, /* All devices */
+ .func = ip_rcv,
+ .data = (void*)1,
};
/*
diff -urN -X dontdiff linux-2.5.64/net/ipv4/netfilter/ip_conntrack_core.c linux-2.5-nobrlock/net/ipv4/netfilter/ip_conntrack_core.c
--- linux-2.5.64/net/ipv4/netfilter/ip_conntrack_core.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv4/netfilter/ip_conntrack_core.c 2003-03-11 14:36:31.000000000 -0800
@@ -24,7 +24,6 @@
#include <linux/skbuff.h>
#include <linux/proc_fs.h>
#include <linux/vmalloc.h>
-#include <linux/brlock.h>
#include <net/checksum.h>
#include <linux/stddef.h>
#include <linux/sysctl.h>
@@ -1160,8 +1159,7 @@
WRITE_UNLOCK(&ip_conntrack_lock);
/* Someone could be still looking at the helper in a bh. */
- br_write_lock_bh(BR_NETPROTO_LOCK);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ synchronize_kernel();
}
/* Refresh conntrack for this many jiffies. */
@@ -1402,8 +1400,7 @@
/* This makes sure all current packets have passed through
netfilter framework. Roll on, two-stage module
delete... */
- br_write_lock_bh(BR_NETPROTO_LOCK);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ synchronize_kernel();
i_see_dead_people:
ip_ct_selective_cleanup(kill_all, NULL);
diff -urN -X dontdiff linux-2.5.64/net/ipv4/netfilter/ip_conntrack_standalone.c linux-2.5-nobrlock/net/ipv4/netfilter/ip_conntrack_standalone.c
--- linux-2.5.64/net/ipv4/netfilter/ip_conntrack_standalone.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv4/netfilter/ip_conntrack_standalone.c 2003-03-11 14:41:17.000000000 -0800
@@ -15,7 +15,6 @@
#include <linux/skbuff.h>
#include <linux/proc_fs.h>
#include <linux/version.h>
-#include <linux/brlock.h>
#include <net/checksum.h>
#define ASSERT_READ_LOCK(x) MUST_BE_READ_LOCKED(&ip_conntrack_lock)
@@ -325,8 +324,7 @@
WRITE_UNLOCK(&ip_conntrack_lock);
/* Somebody could be still looking at the proto in bh. */
- br_write_lock_bh(BR_NETPROTO_LOCK);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ synchronize_kernel();
/* Remove all contrack entries for this protocol */
ip_ct_selective_cleanup(kill_proto, &proto->proto);
diff -urN -X dontdiff linux-2.5.64/net/ipv4/netfilter/ip_nat_core.c linux-2.5-nobrlock/net/ipv4/netfilter/ip_nat_core.c
--- linux-2.5.64/net/ipv4/netfilter/ip_nat_core.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv4/netfilter/ip_nat_core.c 2003-03-11 14:37:51.000000000 -0800
@@ -8,7 +8,6 @@
#include <linux/timer.h>
#include <linux/skbuff.h>
#include <linux/netfilter_ipv4.h>
-#include <linux/brlock.h>
#include <linux/vmalloc.h>
#include <net/checksum.h>
#include <net/icmp.h>
diff -urN -X dontdiff linux-2.5.64/net/ipv4/netfilter/ip_nat_helper.c linux-2.5-nobrlock/net/ipv4/netfilter/ip_nat_helper.c
--- linux-2.5.64/net/ipv4/netfilter/ip_nat_helper.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv4/netfilter/ip_nat_helper.c 2003-03-11 14:38:23.000000000 -0800
@@ -20,7 +20,6 @@
#include <linux/timer.h>
#include <linux/skbuff.h>
#include <linux/netfilter_ipv4.h>
-#include <linux/brlock.h>
#include <net/checksum.h>
#include <net/icmp.h>
#include <net/ip.h>
@@ -545,8 +544,7 @@
WRITE_UNLOCK(&ip_nat_lock);
/* Someone could be still looking at the helper in a bh. */
- br_write_lock_bh(BR_NETPROTO_LOCK);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ synchronize_kernel();
/* Find anything using it, and umm, kill them. We can't turn
them into normal connections: if we've adjusted SYNs, then
diff -urN -X dontdiff linux-2.5.64/net/ipv4/netfilter/ip_nat_snmp_basic.c linux-2.5-nobrlock/net/ipv4/netfilter/ip_nat_snmp_basic.c
--- linux-2.5.64/net/ipv4/netfilter/ip_nat_snmp_basic.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv4/netfilter/ip_nat_snmp_basic.c 2003-03-11 15:03:15.000000000 -0800
@@ -50,7 +50,6 @@
#include <linux/netfilter_ipv4.h>
#include <linux/netfilter_ipv4/ip_nat.h>
#include <linux/netfilter_ipv4/ip_nat_helper.h>
-#include <linux/brlock.h>
#include <linux/types.h>
#include <linux/ip.h>
#include <net/udp.h>
@@ -1351,8 +1350,6 @@
{
ip_nat_helper_unregister(&snmp);
ip_nat_helper_unregister(&snmp_trap);
- br_write_lock_bh(BR_NETPROTO_LOCK);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
}
module_init(init);
diff -urN -X dontdiff linux-2.5.64/net/ipv4/netfilter/ip_nat_standalone.c linux-2.5-nobrlock/net/ipv4/netfilter/ip_nat_standalone.c
--- linux-2.5.64/net/ipv4/netfilter/ip_nat_standalone.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv4/netfilter/ip_nat_standalone.c 2003-03-11 15:05:05.000000000 -0800
@@ -24,7 +24,6 @@
#include <net/checksum.h>
#include <linux/spinlock.h>
#include <linux/version.h>
-#include <linux/brlock.h>
#define ASSERT_READ_LOCK(x) MUST_BE_READ_LOCKED(&ip_nat_lock)
#define ASSERT_WRITE_LOCK(x) MUST_BE_WRITE_LOCKED(&ip_nat_lock)
@@ -268,8 +267,7 @@
WRITE_UNLOCK(&ip_nat_lock);
/* Someone could be still looking at the proto in a bh. */
- br_write_lock_bh(BR_NETPROTO_LOCK);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ synchronize_kernel();
}
static int init_or_cleanup(int init)
diff -urN -X dontdiff linux-2.5.64/net/ipv4/netfilter/ip_queue.c linux-2.5-nobrlock/net/ipv4/netfilter/ip_queue.c
--- linux-2.5.64/net/ipv4/netfilter/ip_queue.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv4/netfilter/ip_queue.c 2003-03-11 14:40:45.000000000 -0800
@@ -23,7 +23,6 @@
#include <linux/netfilter_ipv4/ip_tables.h>
#include <linux/netlink.h>
#include <linux/spinlock.h>
-#include <linux/brlock.h>
#include <linux/sysctl.h>
#include <linux/proc_fs.h>
#include <linux/security.h>
@@ -685,8 +684,7 @@
cleanup:
nf_unregister_queue_handler(PF_INET);
- br_write_lock_bh(BR_NETPROTO_LOCK);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+
ipq_flush(NF_DROP);
cleanup_sysctl:
diff -urN -X dontdiff linux-2.5.64/net/ipv4/protocol.c linux-2.5-nobrlock/net/ipv4/protocol.c
--- linux-2.5.64/net/ipv4/protocol.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv4/protocol.c 2003-03-11 14:42:32.000000000 -0800
@@ -37,7 +37,6 @@
#include <linux/inet.h>
#include <linux/netdevice.h>
#include <linux/timer.h>
-#include <linux/brlock.h>
#include <net/ip.h>
#include <net/protocol.h>
#include <net/tcp.h>
@@ -49,6 +48,7 @@
#include <linux/igmp.h>
struct inet_protocol *inet_protos[MAX_INET_PROTOS];
+static spinlock_t inet_proto_lock = SPIN_LOCK_UNLOCKED;
/*
* Add a protocol handler to the hash tables
@@ -60,16 +60,14 @@
hash = protocol & (MAX_INET_PROTOS - 1);
- br_write_lock_bh(BR_NETPROTO_LOCK);
-
+ spin_lock_bh(&inet_proto_lock);
if (inet_protos[hash]) {
ret = -1;
} else {
inet_protos[hash] = prot;
ret = 0;
}
-
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ spin_unlock_bh(&inet_proto_lock);
return ret;
}
@@ -84,16 +82,15 @@
hash = protocol & (MAX_INET_PROTOS - 1);
- br_write_lock_bh(BR_NETPROTO_LOCK);
-
+ spin_lock_bh(&inet_proto_lock);
if (inet_protos[hash] == prot) {
inet_protos[hash] = NULL;
ret = 0;
} else {
ret = -1;
}
-
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ spin_unlock_bh(&inet_proto_lock);
+ synchronize_kernel();
return ret;
}
^ permalink raw reply [flat|nested] 31+ messages in thread* [PATCH] (7/8) Eliminate brlock from IPV6
2003-03-10 2:33 ` Linus Torvalds
` (7 preceding siblings ...)
2003-03-12 0:15 ` [PATCH] (6/8) Eliminate brlock from ipv4 Stephen Hemminger
@ 2003-03-12 0:15 ` Stephen Hemminger
2003-03-12 0:15 ` [PATCH] (8/8) Kill brlock Stephen Hemminger
9 siblings, 0 replies; 31+ messages in thread
From: Stephen Hemminger @ 2003-03-12 0:15 UTC (permalink / raw)
To: Linus Torvalds, David Miller; +Cc: Linux Kernel Mailing List, linux-net
Replace brlock in IPV6 with RCU.
diff -urN -X dontdiff linux-2.5.64/net/ipv6/af_inet6.c linux-2.5-nobrlock/net/ipv6/af_inet6.c
--- linux-2.5.64/net/ipv6/af_inet6.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv6/af_inet6.c 2003-03-10 16:05:21.000000000 -0800
@@ -45,7 +45,6 @@
#include <linux/inet.h>
#include <linux/netdevice.h>
#include <linux/icmpv6.h>
-#include <linux/brlock.h>
#include <linux/smp_lock.h>
#include <net/ip.h>
@@ -101,6 +100,7 @@
* build a new socket.
*/
struct list_head inetsw6[SOCK_MAX];
+static spinlock_t inetsw6_lock = SPIN_LOCK_UNLOCKED;
static void inet6_sock_destruct(struct sock *sk)
{
@@ -161,8 +161,8 @@
/* Look for the requested type/protocol pair. */
answer = NULL;
- br_read_lock_bh(BR_NETPROTO_LOCK);
- list_for_each(p, &inetsw6[sock->type]) {
+ rcu_read_lock();
+ list_for_each_rcu(p, &inetsw6[sock->type]) {
answer = list_entry(p, struct inet_protosw, list);
/* Check the non-wild match. */
@@ -180,7 +180,7 @@
}
answer = NULL;
}
- br_read_unlock_bh(BR_NETPROTO_LOCK);
+ rcu_read_unlock();
if (!answer)
goto free_and_badtype;
@@ -571,8 +571,7 @@
int protocol = p->protocol;
struct list_head *last_perm;
- br_write_lock_bh(BR_NETPROTO_LOCK);
-
+ spin_lock_bh(&inetsw6_lock);
if (p->type > SOCK_MAX)
goto out_illegal;
@@ -602,7 +601,7 @@
*/
list_add(&p->list, last_perm);
out:
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ spin_unlock_bh(&inetsw6_lock);
return;
out_permanent:
diff -urN -X dontdiff linux-2.5.64/net/ipv6/ipv6_sockglue.c linux-2.5-nobrlock/net/ipv6/ipv6_sockglue.c
--- linux-2.5.64/net/ipv6/ipv6_sockglue.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv6/ipv6_sockglue.c 2003-03-10 13:33:07.000000000 -0800
@@ -54,11 +54,10 @@
static struct packet_type ipv6_packet_type =
{
- __constant_htons(ETH_P_IPV6),
- NULL, /* All devices */
- ipv6_rcv,
- (void*)1,
- NULL
+ .type = __constant_htons(ETH_P_IPV6),
+ .dev = NULL, /* All devices */
+ .func = ipv6_rcv,
+ .data = (void*)1,
};
/*
diff -urN -X dontdiff linux-2.5.64/net/ipv6/netfilter/ip6_queue.c linux-2.5-nobrlock/net/ipv6/netfilter/ip6_queue.c
--- linux-2.5.64/net/ipv6/netfilter/ip6_queue.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv6/netfilter/ip6_queue.c 2003-03-11 15:05:18.000000000 -0800
@@ -26,7 +26,6 @@
#include <linux/netfilter.h>
#include <linux/netlink.h>
#include <linux/spinlock.h>
-#include <linux/brlock.h>
#include <linux/sysctl.h>
#include <linux/proc_fs.h>
#include <net/sock.h>
@@ -688,8 +687,6 @@
cleanup:
nf_unregister_queue_handler(PF_INET6);
- br_write_lock_bh(BR_NETPROTO_LOCK);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
ipq_flush(NF_DROP);
cleanup_sysctl:
diff -urN -X dontdiff linux-2.5.64/net/ipv6/protocol.c linux-2.5-nobrlock/net/ipv6/protocol.c
--- linux-2.5.64/net/ipv6/protocol.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/net/ipv6/protocol.c 2003-03-11 14:37:06.000000000 -0800
@@ -32,7 +32,6 @@
#include <linux/in6.h>
#include <linux/netdevice.h>
#include <linux/if_arp.h>
-#include <linux/brlock.h>
#include <net/sock.h>
#include <net/snmp.h>
@@ -41,21 +40,21 @@
#include <net/protocol.h>
struct inet6_protocol *inet6_protos[MAX_INET_PROTOS];
+static spinlock_t inet6_proto_lock = SPIN_LOCK_UNLOCKED;
int inet6_add_protocol(struct inet6_protocol *prot, unsigned char protocol)
{
int ret, hash = protocol & (MAX_INET_PROTOS - 1);
- br_write_lock_bh(BR_NETPROTO_LOCK);
+ spin_lock_bh(&inet6_proto_lock);
if (inet6_protos[hash]) {
ret = -1;
} else {
inet6_protos[hash] = prot;
ret = 0;
}
-
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ spin_unlock_bh(&inet6_proto_lock);
return ret;
}
@@ -68,7 +67,7 @@
{
int ret, hash = protocol & (MAX_INET_PROTOS - 1);
- br_write_lock_bh(BR_NETPROTO_LOCK);
+ spin_lock_bh(&inet6_proto_lock);
if (inet6_protos[hash] != prot) {
ret = -1;
@@ -76,8 +75,9 @@
inet6_protos[hash] = NULL;
ret = 0;
}
+ spin_unlock_bh(&inet6_proto_lock);
- br_write_unlock_bh(BR_NETPROTO_LOCK);
+ synchronize_kernel();
return ret;
}
^ permalink raw reply [flat|nested] 31+ messages in thread* [PATCH] (8/8) Kill brlock
2003-03-10 2:33 ` Linus Torvalds
` (8 preceding siblings ...)
2003-03-12 0:15 ` [PATCH] (7/8) Eliminate brlock from IPV6 Stephen Hemminger
@ 2003-03-12 0:15 ` Stephen Hemminger
2003-03-12 0:23 ` David S. Miller
9 siblings, 1 reply; 31+ messages in thread
From: Stephen Hemminger @ 2003-03-12 0:15 UTC (permalink / raw)
To: Linus Torvalds, David Miller; +Cc: Linux Kernel Mailing List, linux-net
Previous patches killed all remaining uses of brlock so bye.
diff -urN -X dontdiff linux-2.5.64/include/linux/brlock.h linux-2.5-nobrlock/include/linux/brlock.h
--- linux-2.5.64/include/linux/brlock.h 2003-03-11 09:08:00.000000000 -0800
+++ linux-2.5-nobrlock/include/linux/brlock.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,222 +0,0 @@
-#ifndef __LINUX_BRLOCK_H
-#define __LINUX_BRLOCK_H
-
-/*
- * 'Big Reader' read-write spinlocks.
- *
- * super-fast read/write locks, with write-side penalty. The point
- * is to have a per-CPU read/write lock. Readers lock their CPU-local
- * readlock, writers must lock all locks to get write access. These
- * CPU-read-write locks are semantically identical to normal rwlocks.
- * Memory usage is higher as well. (NR_CPUS*L1_CACHE_BYTES bytes)
- *
- * The most important feature is that these spinlocks do not cause
- * cacheline ping-pong in the 'most readonly data' case.
- *
- * Copyright 2000, Ingo Molnar <mingo@redhat.com>
- *
- * Registry idea and naming [ crutial! :-) ] by:
- *
- * David S. Miller <davem@redhat.com>
- *
- * David has an implementation that doesn't use atomic operations in
- * the read branch via memory ordering tricks - i guess we need to
- * split this up into a per-arch thing? The atomicity issue is a
- * secondary item in profiles, at least on x86 platforms.
- *
- * The atomic op version overhead is indeed a big deal on
- * load-locked/store-conditional cpus (ALPHA/MIPS/PPC) and
- * compare-and-swap cpus (Sparc64). So we control which
- * implementation to use with a __BRLOCK_USE_ATOMICS define. -DaveM
- *
- */
-
-/* Register bigreader lock indices here. */
-enum brlock_indices {
- BR_NETPROTO_LOCK,
- __BR_END
-};
-
-#include <linux/config.h>
-
-#ifdef CONFIG_SMP
-
-#include <linux/cache.h>
-#include <linux/spinlock.h>
-
-#if defined(__i386__) || defined(__ia64__) || defined(__x86_64__)
-#define __BRLOCK_USE_ATOMICS
-#else
-#undef __BRLOCK_USE_ATOMICS
-#endif
-
-#ifdef __BRLOCK_USE_ATOMICS
-typedef rwlock_t brlock_read_lock_t;
-#else
-typedef unsigned int brlock_read_lock_t;
-#endif
-
-/*
- * align last allocated index to the next cacheline:
- */
-#define __BR_IDX_MAX \
- (((sizeof(brlock_read_lock_t)*__BR_END + SMP_CACHE_BYTES-1) & ~(SMP_CACHE_BYTES-1)) / sizeof(brlock_read_lock_t))
-
-extern brlock_read_lock_t __brlock_array[NR_CPUS][__BR_IDX_MAX];
-
-#ifndef __BRLOCK_USE_ATOMICS
-struct br_wrlock {
- spinlock_t lock;
-} __attribute__ ((__aligned__(SMP_CACHE_BYTES)));
-
-extern struct br_wrlock __br_write_locks[__BR_IDX_MAX];
-#endif
-
-extern void __br_lock_usage_bug (void);
-
-#ifdef __BRLOCK_USE_ATOMICS
-
-static inline void br_read_lock (enum brlock_indices idx)
-{
- /*
- * This causes a link-time bug message if an
- * invalid index is used:
- */
- if (idx >= __BR_END)
- __br_lock_usage_bug();
-
- preempt_disable();
- _raw_read_lock(&__brlock_array[smp_processor_id()][idx]);
-}
-
-static inline void br_read_unlock (enum brlock_indices idx)
-{
- if (idx >= __BR_END)
- __br_lock_usage_bug();
-
- read_unlock(&__brlock_array[smp_processor_id()][idx]);
-}
-
-#else /* ! __BRLOCK_USE_ATOMICS */
-static inline void br_read_lock (enum brlock_indices idx)
-{
- unsigned int *ctr;
- spinlock_t *lock;
-
- /*
- * This causes a link-time bug message if an
- * invalid index is used:
- */
- if (idx >= __BR_END)
- __br_lock_usage_bug();
-
- preempt_disable();
- ctr = &__brlock_array[smp_processor_id()][idx];
- lock = &__br_write_locks[idx].lock;
-again:
- (*ctr)++;
- mb();
- if (spin_is_locked(lock)) {
- (*ctr)--;
- wmb(); /*
- * The release of the ctr must become visible
- * to the other cpus eventually thus wmb(),
- * we don't care if spin_is_locked is reordered
- * before the releasing of the ctr.
- * However IMHO this wmb() is superflous even in theory.
- * It would not be superflous only if on the
- * other CPUs doing a ldl_l instead of an ldl
- * would make a difference and I don't think this is
- * the case.
- * I'd like to clarify this issue further
- * but for now this is a slow path so adding the
- * wmb() will keep us on the safe side.
- */
- while (spin_is_locked(lock))
- barrier();
- goto again;
- }
-}
-
-static inline void br_read_unlock (enum brlock_indices idx)
-{
- unsigned int *ctr;
-
- if (idx >= __BR_END)
- __br_lock_usage_bug();
-
- ctr = &__brlock_array[smp_processor_id()][idx];
-
- wmb();
- (*ctr)--;
- preempt_enable();
-}
-#endif /* __BRLOCK_USE_ATOMICS */
-
-/* write path not inlined - it's rare and larger */
-
-extern void FASTCALL(__br_write_lock (enum brlock_indices idx));
-extern void FASTCALL(__br_write_unlock (enum brlock_indices idx));
-
-static inline void br_write_lock (enum brlock_indices idx)
-{
- if (idx >= __BR_END)
- __br_lock_usage_bug();
- __br_write_lock(idx);
-}
-
-static inline void br_write_unlock (enum brlock_indices idx)
-{
- if (idx >= __BR_END)
- __br_lock_usage_bug();
- __br_write_unlock(idx);
-}
-
-#else
-# define br_read_lock(idx) ({ (void)(idx); preempt_disable(); })
-# define br_read_unlock(idx) ({ (void)(idx); preempt_enable(); })
-# define br_write_lock(idx) ({ (void)(idx); preempt_disable(); })
-# define br_write_unlock(idx) ({ (void)(idx); preempt_enable(); })
-#endif /* CONFIG_SMP */
-
-/*
- * Now enumerate all of the possible sw/hw IRQ protected
- * versions of the interfaces.
- */
-#define br_read_lock_irqsave(idx, flags) \
- do { local_irq_save(flags); br_read_lock(idx); } while (0)
-
-#define br_read_lock_irq(idx) \
- do { local_irq_disable(); br_read_lock(idx); } while (0)
-
-#define br_read_lock_bh(idx) \
- do { local_bh_disable(); br_read_lock(idx); } while (0)
-
-#define br_write_lock_irqsave(idx, flags) \
- do { local_irq_save(flags); br_write_lock(idx); } while (0)
-
-#define br_write_lock_irq(idx) \
- do { local_irq_disable(); br_write_lock(idx); } while (0)
-
-#define br_write_lock_bh(idx) \
- do { local_bh_disable(); br_write_lock(idx); } while (0)
-
-#define br_read_unlock_irqrestore(idx, flags) \
- do { br_read_unlock(irx); local_irq_restore(flags); } while (0)
-
-#define br_read_unlock_irq(idx) \
- do { br_read_unlock(idx); local_irq_enable(); } while (0)
-
-#define br_read_unlock_bh(idx) \
- do { br_read_unlock(idx); local_bh_enable(); } while (0)
-
-#define br_write_unlock_irqrestore(idx, flags) \
- do { br_write_unlock(irx); local_irq_restore(flags); } while (0)
-
-#define br_write_unlock_irq(idx) \
- do { br_write_unlock(idx); local_irq_enable(); } while (0)
-
-#define br_write_unlock_bh(idx) \
- do { br_write_unlock(idx); local_bh_enable(); } while (0)
-
-#endif /* __LINUX_BRLOCK_H */
diff -urN -X dontdiff linux-2.5.64/kernel/ksyms.c linux-2.5-nobrlock/kernel/ksyms.c
--- linux-2.5.64/kernel/ksyms.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/kernel/ksyms.c 2003-03-10 15:50:15.000000000 -0800
@@ -40,7 +40,6 @@
#include <linux/mm.h>
#include <linux/capability.h>
#include <linux/highuid.h>
-#include <linux/brlock.h>
#include <linux/fs.h>
#include <linux/uio.h>
#include <linux/tty.h>
@@ -433,17 +432,6 @@
#endif
EXPORT_SYMBOL(mod_timer);
-#ifdef CONFIG_SMP
-
-/* Big-Reader lock implementation */
-EXPORT_SYMBOL(__brlock_array);
-#ifndef __BRLOCK_USE_ATOMICS
-EXPORT_SYMBOL(__br_write_locks);
-#endif
-EXPORT_SYMBOL(__br_write_lock);
-EXPORT_SYMBOL(__br_write_unlock);
-#endif
-
#ifdef HAVE_DISABLE_HLT
EXPORT_SYMBOL(disable_hlt);
EXPORT_SYMBOL(enable_hlt);
diff -urN -X dontdiff linux-2.5.64/lib/brlock.c linux-2.5-nobrlock/lib/brlock.c
--- linux-2.5.64/lib/brlock.c 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/lib/brlock.c 1969-12-31 16:00:00.000000000 -0800
@@ -1,72 +0,0 @@
-/*
- *
- * linux/lib/brlock.c
- *
- * 'Big Reader' read-write spinlocks. See linux/brlock.h for details.
- *
- * Copyright 2000, Ingo Molnar <mingo@redhat.com>
- * Copyright 2000, David S. Miller <davem@redhat.com>
- */
-
-#include <linux/config.h>
-
-#ifdef CONFIG_SMP
-
-#include <linux/sched.h>
-#include <linux/brlock.h>
-
-#ifdef __BRLOCK_USE_ATOMICS
-
-brlock_read_lock_t __brlock_array[NR_CPUS][__BR_IDX_MAX] =
- { [0 ... NR_CPUS-1] = { [0 ... __BR_IDX_MAX-1] = RW_LOCK_UNLOCKED } };
-
-void __br_write_lock (enum brlock_indices idx)
-{
- int i;
-
- preempt_disable();
- for (i = 0; i < NR_CPUS; i++)
- _raw_write_lock(&__brlock_array[i][idx]);
-}
-
-void __br_write_unlock (enum brlock_indices idx)
-{
- int i;
-
- for (i = 0; i < NR_CPUS; i++)
- _raw_write_unlock(&__brlock_array[i][idx]);
- preempt_enable();
-}
-
-#else /* ! __BRLOCK_USE_ATOMICS */
-
-brlock_read_lock_t __brlock_array[NR_CPUS][__BR_IDX_MAX] =
- { [0 ... NR_CPUS-1] = { [0 ... __BR_IDX_MAX-1] = 0 } };
-
-struct br_wrlock __br_write_locks[__BR_IDX_MAX] =
- { [0 ... __BR_IDX_MAX-1] = { SPIN_LOCK_UNLOCKED } };
-
-void __br_write_lock (enum brlock_indices idx)
-{
- int i;
-
- preempt_disable();
-again:
- _raw_spin_lock(&__br_write_locks[idx].lock);
- for (i = 0; i < NR_CPUS; i++)
- if (__brlock_array[i][idx] != 0) {
- _raw_spin_unlock(&__br_write_locks[idx].lock);
- barrier();
- cpu_relax();
- goto again;
- }
-}
-
-void __br_write_unlock (enum brlock_indices idx)
-{
- spin_unlock(&__br_write_locks[idx].lock);
-}
-
-#endif /* __BRLOCK_USE_ATOMICS */
-
-#endif /* CONFIG_SMP */
diff -urN -X dontdiff linux-2.5.64/lib/Makefile linux-2.5-nobrlock/lib/Makefile
--- linux-2.5.64/lib/Makefile 2003-03-11 09:08:01.000000000 -0800
+++ linux-2.5-nobrlock/lib/Makefile 2003-03-10 15:38:10.000000000 -0800
@@ -8,7 +8,7 @@
L_TARGET := lib.a
-obj-y := errno.o ctype.o string.o vsprintf.o brlock.o cmdline.o \
+obj-y := errno.o ctype.o string.o vsprintf.o cmdline.o \
bust_spinlocks.o rbtree.o radix-tree.o dump_stack.o \
kobject.o idr.o
^ permalink raw reply [flat|nested] 31+ messages in thread* Re: [PATCH] (8/8) Kill brlock
2003-03-12 0:15 ` [PATCH] (8/8) Kill brlock Stephen Hemminger
@ 2003-03-12 0:23 ` David S. Miller
2003-03-12 0:31 ` Stephen Hemminger
0 siblings, 1 reply; 31+ messages in thread
From: David S. Miller @ 2003-03-12 0:23 UTC (permalink / raw)
To: shemminger; +Cc: torvalds, linux-kernel, linux-net
From: Stephen Hemminger <shemminger@osdl.org>
Date: 11 Mar 2003 16:15:23 -0800
Previous patches killed all remaining uses of brlock so bye.
I'm all for this once patch 2/8 gets fixed up :-)
So what is the new way to say "stop all incoming packet
processing while I update data structure X"?
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] (8/8) Kill brlock
2003-03-12 0:23 ` David S. Miller
@ 2003-03-12 0:31 ` Stephen Hemminger
2003-03-12 0:34 ` David S. Miller
0 siblings, 1 reply; 31+ messages in thread
From: Stephen Hemminger @ 2003-03-12 0:31 UTC (permalink / raw)
To: David S. Miller; +Cc: Linus Torvalds, Linux Kernel Mailing List, linux-net
On Tue, 2003-03-11 at 16:23, David S. Miller wrote:
> From: Stephen Hemminger <shemminger@osdl.org>
> Date: 11 Mar 2003 16:15:23 -0800
>
> Previous patches killed all remaining uses of brlock so bye.
>
> I'm all for this once patch 2/8 gets fixed up :-)
>
> So what is the new way to say "stop all incoming packet
> processing while I update data structure X"?
The caller's didn't need to stop packet processing, just remove their type
from a list or some other call back hook. A simple pointer update takes
care of removing a simple call back. The list delete rcu code takes care
of the memory barriers and doing the updates in the right order.
This ensures no future packet processing will grab that token
The next problem is how to ensure that packets in flight are not using
the hook. This is handled by call_rcu() for the more general case where
processing can be deferred (only used a couple places). Other places
use synchronize_kernel() which just waits. What call_rcu() does is make sure
that every processor has seen the change.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] (8/8) Kill brlock
2003-03-12 0:31 ` Stephen Hemminger
@ 2003-03-12 0:34 ` David S. Miller
2003-03-12 0:44 ` Linus Torvalds
0 siblings, 1 reply; 31+ messages in thread
From: David S. Miller @ 2003-03-12 0:34 UTC (permalink / raw)
To: shemminger; +Cc: torvalds, linux-kernel, linux-net
From: Stephen Hemminger <shemminger@osdl.org>
Date: 11 Mar 2003 16:31:46 -0800
The caller's didn't need to stop packet processing, just remove their type
from a list or some other call back hook. A simple pointer update takes
care of removing a simple call back. The list delete rcu code takes care
of the memory barriers and doing the updates in the right order.
This ensures no future packet processing will grab that token
Ok, I'm fine with this then. Linus you can apply all of his patches.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] (8/8) Kill brlock
2003-03-12 0:34 ` David S. Miller
@ 2003-03-12 0:44 ` Linus Torvalds
2003-03-12 0:48 ` Stephen Hemminger
2003-03-12 2:31 ` Alan Cox
0 siblings, 2 replies; 31+ messages in thread
From: Linus Torvalds @ 2003-03-12 0:44 UTC (permalink / raw)
To: David S. Miller; +Cc: shemminger, linux-kernel, linux-net
On Tue, 11 Mar 2003, David S. Miller wrote:
>
> Ok, I'm fine with this then. Linus you can apply all of his patches.
I'm a lazy bum, and I would _really_ want this tested more before it hits
my tree. I think it makes sense, but still..
Linus
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] (8/8) Kill brlock
2003-03-12 0:44 ` Linus Torvalds
@ 2003-03-12 0:48 ` Stephen Hemminger
2003-03-12 2:31 ` Alan Cox
1 sibling, 0 replies; 31+ messages in thread
From: Stephen Hemminger @ 2003-03-12 0:48 UTC (permalink / raw)
To: Linus Torvalds; +Cc: David S. Miller, Linux Kernel Mailing List, linux-net
On Tue, 2003-03-11 at 16:44, Linus Torvalds wrote:
> On Tue, 11 Mar 2003, David S. Miller wrote:
> >
> > Ok, I'm fine with this then. Linus you can apply all of his patches.
>
> I'm a lazy bum, and I would _really_ want this tested more before it hits
> my tree. I think it makes sense, but still..
I agree, given the number of sub-pieces it deserves some more time...
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] (8/8) Kill brlock
2003-03-12 0:44 ` Linus Torvalds
2003-03-12 0:48 ` Stephen Hemminger
@ 2003-03-12 2:31 ` Alan Cox
1 sibling, 0 replies; 31+ messages in thread
From: Alan Cox @ 2003-03-12 2:31 UTC (permalink / raw)
To: Linus Torvalds
Cc: David S. Miller, shemminger, Linux Kernel Mailing List, linux-net
On Wed, 2003-03-12 at 00:44, Linus Torvalds wrote:
> On Tue, 11 Mar 2003, David S. Miller wrote:
> >
> > Ok, I'm fine with this then. Linus you can apply all of his patches.
>
> I'm a lazy bum, and I would _really_ want this tested more before it hits
> my tree. I think it makes sense, but still..
If Linus is scared ;) then throw them at me for -ac by all means. Anyone
running -ac IDE test sets is brave enough to run rcu network code 8)
^ permalink raw reply [flat|nested] 31+ messages in thread