* [patch 00/19] mutex subsystem, -V11
@ 2006-01-03 10:06 Ingo Molnar
2006-01-03 15:07 ` David Howells
2006-01-03 15:55 ` David Howells
0 siblings, 2 replies; 8+ messages in thread
From: Ingo Molnar @ 2006-01-03 10:06 UTC (permalink / raw)
To: lkml
Cc: Linus Torvalds, Andrew Morton, Arjan van de Ven, Nicolas Pitre,
Jes Sorensen, Al Viro, Oleg Nesterov, David Howells, Alan Cox,
Christoph Hellwig, Andi Kleen, Russell King
this is version -V11 of the generic mutex subsystem, against v2.6.15.
It consists of the following 19 patches:
add-atomic-xchg.patch
add-function-typecheck.patch
mutex-generic-asm-implementations.patch
mutex-asm-mutex.h-i386.patch
mutex-asm-mutex.h-x86_64.patch
mutex-asm-mutex.h-arm.patch
mutex-arch-mutex-h.patch
mutex-core.patch
mutex-docs.patch
mutex-debug.patch
mutex-debug-more.patch
sem2mutex-xfs.patch
sem2mutex-vfs-i-sem.patch
sem2mutex-vfs-i-sem-more.patch
sem2mutex-simple-ones.patch
sem2completion-sx8.patch
sem2completion-cpu5wdt.patch
sem2completion-ide-gendev.patch
sem2completion-loop.patch
the patches should work fine on every Linux architecture. They can also
be downloaded from:
http://redhat.com/~mingo/generic-mutex-subsystem/
Changes since -V10:
38 files changed, 99 insertions(+), 94 deletions(-)
- export mutex_trylock() too (reported by Antti Salmela)
- fixed attribution From: lines
- finished the i_sem -> i_mutex conversion: fixed up all affected
documentation as well, and rarely used code. It builds fine under
allyesconfig.
- DECLARE_MUTEX -> DEFINE_MUTEX in cpufreq
- small cleanups
Ingo
* Re: [patch 00/19] mutex subsystem, -V11
2006-01-03 10:06 [patch 00/19] mutex subsystem, -V11 Ingo Molnar
@ 2006-01-03 15:07 ` David Howells
2006-01-03 15:14 ` Arjan van de Ven
2006-01-03 15:55 ` David Howells
1 sibling, 1 reply; 8+ messages in thread
From: David Howells @ 2006-01-03 15:07 UTC (permalink / raw)
To: Ingo Molnar
Cc: lkml, Linus Torvalds, Andrew Morton, Arjan van de Ven,
Nicolas Pitre, Jes Sorensen, Al Viro, Oleg Nesterov,
David Howells, Alan Cox, Christoph Hellwig, Andi Kleen,
Russell King
Ingo Molnar <mingo@elte.hu> wrote:
> this is version -V11 of the generic mutex subsystem, against v2.6.15.
When compiling for x86 with no mutex debugging, I see:
(gdb) disas mutex_lock
Dump of assembler code for function mutex_lock:
0xc02950d0 <mutex_lock+0>: lock decl (%eax)
0xc02950d3 <mutex_lock+3>: js 0xc02950ef <.text.lock.mutex>
0xc02950d5 <mutex_lock+5>: ret
End of assembler dump.
(gdb) disas 0xc02950ef
Dump of assembler code for function .text.lock.mutex:
0xc02950ef <.text.lock.mutex+0>: call 0xc0294ffb <__mutex_lock_noinline>
0xc02950f4 <.text.lock.mutex+5>: jmp 0xc02950d5 <mutex_lock+5>
0xc02950f6 <.text.lock.mutex+7>: call 0xc029509f <__mutex_unlock_noinline>
0xc02950fb <.text.lock.mutex+12>: jmp 0xc02950db <mutex_unlock+5>
End of assembler dump.
Can you arrange .text.lock.mutex+0 here to just jump directly to
__mutex_lock_noinline? Otherwise we have an unnecessarily extended return
path.
You may not want to make the JS go directly there, but you could have that go
to a JMP to __mutex_lock_noinline rather than having a CALL followed by a JMP
back to a return instruction.
Admittedly, this may not be possible, as you're mixing up C and ASM, but it
would speed things up a little.
David
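For readers less used to the idiom, the fast path in the disassembly above
("lock decl" followed by "js") can be sketched in portable C11 roughly as
follows. This is a userspace illustration, not the code from the patch
series: the struct, the function names and the stub slow path are all
hypothetical, and it assumes the counter convention of 1 = unlocked,
0 = locked, negative = locked with possible waiters.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct mutex. */
struct sketch_mutex {
	atomic_int count;	/* assumed: 1 unlocked, 0 locked, <0 contended */
	int slowpath_calls;	/* instrumentation for this sketch only */
};

static void sketch_mutex_init(struct sketch_mutex *m)
{
	atomic_init(&m->count, 1);
	m->slowpath_calls = 0;
}

/* Stub: a real slow path would queue the caller and sleep. */
static void sketch_lock_slowpath(struct sketch_mutex *m)
{
	m->slowpath_calls++;
}

/* Returns true if the lock was taken without entering the slow path. */
static bool sketch_mutex_lock_fastpath(struct sketch_mutex *m)
{
	/* counterpart of "lock decl (%eax)": atomic decrement */
	int newval = atomic_fetch_sub(&m->count, 1) - 1;

	/* counterpart of "js": a negative result means contention */
	if (newval < 0) {
		sketch_lock_slowpath(m);
		return false;
	}
	return true;
}
```

In the uncontended case only the decrement and the not-taken branch are
executed, which is why the length of the out-of-line path only matters
under contention.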
* Re: [patch 00/19] mutex subsystem, -V11
2006-01-03 15:07 ` David Howells
@ 2006-01-03 15:14 ` Arjan van de Ven
2006-01-05 13:44 ` David Howells
0 siblings, 1 reply; 8+ messages in thread
From: Arjan van de Ven @ 2006-01-03 15:14 UTC (permalink / raw)
To: David Howells
Cc: Ingo Molnar, lkml, Linus Torvalds, Andrew Morton, Nicolas Pitre,
Jes Sorensen, Al Viro, Oleg Nesterov, Alan Cox, Christoph Hellwig,
Andi Kleen, Russell King
On Tue, 2006-01-03 at 15:07 +0000, David Howells wrote:
> Ingo Molnar <mingo@elte.hu> wrote:
>
> > this is version -V11 of the generic mutex subsystem, against v2.6.15.
>
> When compiling for x86 with no mutex debugging, I see:
>
> (gdb) disas mutex_lock
> Dump of assembler code for function mutex_lock:
> 0xc02950d0 <mutex_lock+0>: lock decl (%eax)
> 0xc02950d3 <mutex_lock+3>: js 0xc02950ef <.text.lock.mutex>
> 0xc02950d5 <mutex_lock+5>: ret
> End of assembler dump.
> (gdb) disas 0xc02950ef
> Dump of assembler code for function .text.lock.mutex:
> 0xc02950ef <.text.lock.mutex+0>: call 0xc0294ffb <__mutex_lock_noinline>
> 0xc02950f4 <.text.lock.mutex+5>: jmp 0xc02950d5 <mutex_lock+5>
> 0xc02950f6 <.text.lock.mutex+7>: call 0xc029509f <__mutex_unlock_noinline>
> 0xc02950fb <.text.lock.mutex+12>: jmp 0xc02950db <mutex_unlock+5>
> End of assembler dump.
>
> Can you arrange .text.lock.mutex+0 here to just jump directly to
> __mutex_lock_noinline? Otherwise we have an unnecessarily extended return
> path.
jmp is free on x86, i.e. zero cycles. Any trickery is more likely to cost
cycles because it does unexpected things.
>
> You may not want to make the JS go directly there, but you could have that go
> to a JMP to __mutex_lock_noinline rather than having a CALL followed by a JMP
> back to a return instruction.
unbalanced call/ret pairs are REALLY expensive on x86. The current x86
processors all do branch prediction on the ret based on a special
internal call stack; breaking the symmetry thus causes a branch
prediction miss, i.e. 40+ cycles.
* Re: [patch 00/19] mutex subsystem, -V11
2006-01-03 10:06 [patch 00/19] mutex subsystem, -V11 Ingo Molnar
2006-01-03 15:07 ` David Howells
@ 2006-01-03 15:55 ` David Howells
2006-01-03 15:58 ` Ingo Molnar
1 sibling, 1 reply; 8+ messages in thread
From: David Howells @ 2006-01-03 15:55 UTC (permalink / raw)
To: Ingo Molnar
Cc: lkml, Linus Torvalds, Andrew Morton, Arjan van de Ven,
Nicolas Pitre, Jes Sorensen, Al Viro, Oleg Nesterov,
David Howells, Alan Cox, Christoph Hellwig, Andi Kleen,
Russell King
The attached patch adds a module for testing and benchmarking mutexes,
semaphores and R/W semaphores.
Using it is simple:
insmod synchro-test.ko <args>
It will exit with error ENOANO after running the tests and printing the
results to the kernel console log.
The available arguments are:
(*) mx=N
Start up to N mutex thrashing threads, where N is at most 20. All will
try and thrash the same mutex.
(*) sm=N
Start up to N counting semaphore thrashing threads, where N is at most
20. All will try and thrash the same semaphore.
(*) ism=M
Initialise the counting semaphore with M, where M is any positive
integer. The default is 4.
(*) rd=N
(*) wr=O
(*) dg=P
Start up to N reader thrashing threads, O writer thrashing threads and
P downgrader thrashing threads, where N, O and P are at most 20
apiece. All will try and thrash the same read/write semaphore.
(*) elapse=N
Run the tests for N seconds. The default is 5.
(*) load=N
Each thread delays for N uS whilst holding the lock. The default is 0.
(*) do_sched=1
Each thread will call schedule if required after each iteration.
(*) v=1
Print more verbose information, including a thread iteration
distribution list.
The module is built by setting CONFIG_DEBUG_SYNCHRO_TEST to "m".
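The limits listed above (at most 20 threads of each type, a positive
initial semaphore count, a run time of at least one second) lend
themselves to an up-front check. A minimal userspace sketch of such a
check follows; the struct and function names are hypothetical and chosen
for illustration, while the -ERANGE and -EINVAL return values follow what
the module itself returns for the same conditions.

```c
#include <errno.h>

#define SKETCH_MAX_THREADS 20	/* per-type limit stated in the argument list */

/* Hypothetical container for the insmod arguments. */
struct sketch_params {
	int mx, sm, ism;	/* mutex threads, semaphore threads, initial count */
	int rd, wr, dg;		/* reader, writer and downgrader threads */
	int elapse;		/* run time in seconds */
};

static int sketch_count_ok(int n)
{
	return n >= 0 && n <= SKETCH_MAX_THREADS;
}

/*
 * Returns 0 if the parameters are usable, -ERANGE if any value is out of
 * range, or -EINVAL if no thread of any kind was requested.
 */
static int sketch_check_params(const struct sketch_params *p)
{
	if (!sketch_count_ok(p->mx) || !sketch_count_ok(p->sm) ||
	    !sketch_count_ok(p->rd) || !sketch_count_ok(p->wr) ||
	    !sketch_count_ok(p->dg) ||
	    p->ism < 1 || p->elapse < 1)
		return -ERANGE;

	if (p->mx == 0 && p->sm == 0 && p->rd == 0 &&
	    p->wr == 0 && p->dg == 0)
		return -EINVAL;

	return 0;
}
```

For example, mx=1 with the defaults passes, mx=21 is rejected as out of
range, and leaving every thread count at zero is rejected as having
nothing to do.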
Signed-Off-By: David Howells <dhowells@redhat.com>
---
warthog>diffstat -p1 mutex-debug-module-2615rc7.diff
Documentation/synchro-test.txt | 51 ++++
kernel/Makefile | 1
kernel/synchro-test.c | 486 +++++++++++++++++++++++++++++++++++++++++
lib/Kconfig.debug | 14 +
4 files changed, 552 insertions(+)
diff -uNrp linux-2.6.15-rc7-mutex/Documentation/synchro-test.txt linux-2.6.15-rc7-mutex-slowopt/Documentation/synchro-test.txt
--- linux-2.6.15-rc7-mutex/Documentation/synchro-test.txt 1970-01-01 01:00:00.000000000 +0100
+++ linux-2.6.15-rc7-mutex-slowopt/Documentation/synchro-test.txt 2006-01-03 15:45:26.000000000 +0000
@@ -0,0 +1,51 @@
+The synchro-test.ko module can be used for testing and benchmarking mutexes,
+semaphores and R/W semaphores.
+
+Using it is simple:
+
+ insmod synchro-test.ko <args>
+
+It will exit with error ENOANO after running the tests and printing the
+results to the kernel console log.
+
+The available arguments are:
+
+ (*) mx=N
+
+ Start up to N mutex thrashing threads, where N is at most 20. All will
+ try and thrash the same mutex.
+
+ (*) sm=N
+
+ Start up to N counting semaphore thrashing threads, where N is at most
+ 20. All will try and thrash the same semaphore.
+
+ (*) ism=M
+
+ Initialise the counting semaphore with M, where M is any positive
+ integer. The default is 4.
+
+ (*) rd=N
+ (*) wr=O
+ (*) dg=P
+
+ Start up to N reader thrashing threads, O writer thrashing threads and
+ P downgrader thrashing threads, where N, O and P are at most 20
+ apiece. All will try and thrash the same read/write semaphore.
+
+ (*) elapse=N
+
+ Run the tests for N seconds. The default is 5.
+
+ (*) load=N
+
+ Each thread delays for N uS whilst holding the lock. The default is 0.
+
+ (*) do_sched=1
+
+ Each thread will call schedule if required after each iteration.
+
+ (*) v=1
+
+ Print more verbose information, including a thread iteration
+ distribution list.
diff -uNrp linux-2.6.15-rc7-mutex/kernel/Makefile linux-2.6.15-rc7-mutex-slowopt/kernel/Makefile
--- linux-2.6.15-rc7-mutex/kernel/Makefile 2006-01-03 14:40:06.000000000 +0000
+++ linux-2.6.15-rc7-mutex-slowopt/kernel/Makefile 2006-01-03 15:43:51.000000000 +0000
@@ -33,6 +33,7 @@ obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
obj-$(CONFIG_SECCOMP) += seccomp.o
obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o
+obj-$(CONFIG_DEBUG_SYNCHRO_TEST) += synchro-test.o
ifneq ($(CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER),y)
# According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
diff -uNrp linux-2.6.15-rc7-mutex/kernel/synchro-test.c linux-2.6.15-rc7-mutex-slowopt/kernel/synchro-test.c
--- linux-2.6.15-rc7-mutex/kernel/synchro-test.c 1970-01-01 01:00:00.000000000 +0100
+++ linux-2.6.15-rc7-mutex-slowopt/kernel/synchro-test.c 2006-01-03 15:37:16.000000000 +0000
@@ -0,0 +1,486 @@
+/* synchro-test.c: run some threads to test the synchronisation primitives
+ *
+ * Copyright (C) 2005 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * run as something like:
+ *
+ * insmod synchro-test.ko rd=2 wr=2
+ * insmod synchro-test.ko mx=1
+ * insmod synchro-test.ko sm=2 ism=1
+ * insmod synchro-test.ko sm=2 ism=2
+ */
+
+#include <linux/config.h>
+#include <linux/module.h>
+#include <linux/poll.h>
+#include <linux/moduleparam.h>
+#include <linux/stat.h>
+#include <linux/init.h>
+#include <asm/atomic.h>
+#include <linux/personality.h>
+#include <linux/smp_lock.h>
+#include <linux/delay.h>
+#include <linux/timer.h>
+#include <linux/completion.h>
+#include <linux/mutex.h>
+
+#define VALIDATE_OPERATORS 0
+
+static int nummx = 0;
+static int numsm = 0, seminit = 4;
+static int numrd = 0, numwr = 0, numdg = 0;
+static int elapse = 5, load = 0, do_sched = 0;
+static int verbose = 0;
+
+MODULE_AUTHOR("David Howells");
+MODULE_DESCRIPTION("Synchronisation primitive test demo");
+MODULE_LICENSE("GPL");
+
+module_param_named(v, verbose, int, 0);
+MODULE_PARM_DESC(v, "Verbosity");
+
+module_param_named(mx, nummx, int, 0);
+MODULE_PARM_DESC(mx, "Number of mutex threads");
+
+module_param_named(sm, numsm, int, 0);
+MODULE_PARM_DESC(sm, "Number of semaphore threads");
+
+module_param_named(ism, seminit, int, 0);
+MODULE_PARM_DESC(ism, "Initial semaphore value");
+
+module_param_named(rd, numrd, int, 0);
+MODULE_PARM_DESC(rd, "Number of reader threads");
+
+module_param_named(wr, numwr, int, 0);
+MODULE_PARM_DESC(wr, "Number of writer threads");
+
+module_param_named(dg, numdg, int, 0);
+MODULE_PARM_DESC(dg, "Number of downgrader threads");
+
+module_param(elapse, int, 0);
+MODULE_PARM_DESC(elapse, "Number of seconds to run for");
+
+module_param(load, int, 0);
+MODULE_PARM_DESC(load, "Length of load in uS");
+
+module_param(do_sched, int, 0);
+MODULE_PARM_DESC(do_sched, "True if each thread should schedule regularly");
+
+/* the semaphores under test */
+static struct mutex ____cacheline_aligned mutex;
+static struct semaphore ____cacheline_aligned sem;
+static struct rw_semaphore ____cacheline_aligned rwsem;
+
+static atomic_t ____cacheline_aligned do_stuff = ATOMIC_INIT(0);
+
+#if VALIDATE_OPERATORS
+static atomic_t ____cacheline_aligned mutexes = ATOMIC_INIT(0);
+static atomic_t ____cacheline_aligned semaphores = ATOMIC_INIT(0);
+static atomic_t ____cacheline_aligned readers = ATOMIC_INIT(0);
+static atomic_t ____cacheline_aligned writers = ATOMIC_INIT(0);
+#endif
+
+static unsigned int ____cacheline_aligned mutexes_taken[20];
+static unsigned int ____cacheline_aligned semaphores_taken[20];
+static unsigned int ____cacheline_aligned reads_taken[20];
+static unsigned int ____cacheline_aligned writes_taken[20];
+static unsigned int ____cacheline_aligned downgrades_taken[20];
+
+static struct completion ____cacheline_aligned mx_comp[20];
+static struct completion ____cacheline_aligned sm_comp[20];
+static struct completion ____cacheline_aligned rd_comp[20];
+static struct completion ____cacheline_aligned wr_comp[20];
+static struct completion ____cacheline_aligned dg_comp[20];
+
+static struct timer_list ____cacheline_aligned timer;
+
+#define ACCOUNT(var, N) var##_taken[N]++;
+
+#if VALIDATE_OPERATORS
+#define TRACK(var, dir) atomic_##dir(&(var))
+
+#define CHECK(var, cond, val) \
+do { \
+ int x = atomic_read(&(var)); \
+ if (unlikely(!(x cond (val)))) \
+ printk("check [%s %s %d, == %d] failed in %s\n", \
+ #var, #cond, (val), x, __func__); \
+} while (0)
+
+#else
+#define TRACK(var, dir) do {} while(0)
+#define CHECK(var, cond, val) do {} while(0)
+#endif
+
+static inline void do_mutex_lock(unsigned int N)
+{
+ mutex_lock(&mutex);
+
+ ACCOUNT(mutexes, N);
+ TRACK(mutexes, inc);
+ CHECK(mutexes, ==, 1);
+}
+
+static inline void do_mutex_unlock(unsigned int N)
+{
+ CHECK(mutexes, ==, 1);
+ TRACK(mutexes, dec);
+
+ mutex_unlock(&mutex);
+}
+
+static inline void do_down(unsigned int N)
+{
+ down(&sem);
+
+ ACCOUNT(semaphores, N);
+ TRACK(semaphores, inc);
+ /* at most seminit holders may be inside the semaphore at once */
+ CHECK(semaphores, <=, seminit);
+}
+
+static inline void do_up(unsigned int N)
+{
+ CHECK(semaphores, >, 0);
+ TRACK(semaphores, dec);
+
+ up(&sem);
+}
+
+static inline void do_down_read(unsigned int N)
+{
+ down_read(&rwsem);
+
+ ACCOUNT(reads, N);
+ TRACK(readers, inc);
+ CHECK(readers, >, 0);
+ CHECK(writers, ==, 0);
+}
+
+static inline void do_up_read(unsigned int N)
+{
+ CHECK(readers, >, 0);
+ CHECK(writers, ==, 0);
+ TRACK(readers, dec);
+
+ up_read(&rwsem);
+}
+
+static inline void do_down_write(unsigned int N)
+{
+ down_write(&rwsem);
+
+ ACCOUNT(writes, N);
+ TRACK(writers, inc);
+ CHECK(writers, ==, 1);
+ CHECK(readers, ==, 0);
+}
+
+static inline void do_up_write(unsigned int N)
+{
+ CHECK(writers, ==, 1);
+ CHECK(readers, ==, 0);
+ TRACK(writers, dec);
+
+ up_write(&rwsem);
+}
+
+static inline void do_downgrade_write(unsigned int N)
+{
+ CHECK(writers, ==, 1);
+ CHECK(readers, ==, 0);
+ TRACK(writers, dec);
+ TRACK(readers, inc);
+
+ downgrade_write(&rwsem);
+
+ ACCOUNT(downgrades, N);
+}
+
+static inline void sched(void)
+{
+ if (do_sched)
+ schedule();
+}
+
+int mutexer(void *arg)
+{
+ unsigned int N = (unsigned long) arg;
+
+ daemonize("Mutex%u", N);
+
+ while (atomic_read(&do_stuff)) {
+ do_mutex_lock(N);
+ if (load)
+ udelay(load);
+ do_mutex_unlock(N);
+ sched();
+ }
+
+ if (verbose >= 2)
+ printk("%s: done\n", current->comm);
+ complete_and_exit(&mx_comp[N], 0);
+}
+
+int semaphorer(void *arg)
+{
+ unsigned int N = (unsigned long) arg;
+
+ daemonize("Sem%u", N);
+
+ while (atomic_read(&do_stuff)) {
+ do_down(N);
+ if (load)
+ udelay(load);
+ do_up(N);
+ sched();
+ }
+
+ if (verbose >= 2)
+ printk("%s: done\n", current->comm);
+ complete_and_exit(&sm_comp[N], 0);
+}
+
+int reader(void *arg)
+{
+ unsigned int N = (unsigned long) arg;
+
+ daemonize("Read%u", N);
+
+ while (atomic_read(&do_stuff)) {
+ do_down_read(N);
+#ifdef LOAD_TEST
+ if (load)
+ udelay(load);
+#endif
+ do_up_read(N);
+ sched();
+ }
+
+ if (verbose >= 2)
+ printk("%s: done\n", current->comm);
+ complete_and_exit(&rd_comp[N], 0);
+}
+
+int writer(void *arg)
+{
+ unsigned int N = (unsigned long) arg;
+
+ daemonize("Write%u", N);
+
+ while (atomic_read(&do_stuff)) {
+ do_down_write(N);
+#ifdef LOAD_TEST
+ if (load)
+ udelay(load);
+#endif
+ do_up_write(N);
+ sched();
+ }
+
+ if (verbose >= 2)
+ printk("%s: done\n", current->comm);
+ complete_and_exit(&wr_comp[N], 0);
+}
+
+int downgrader(void *arg)
+{
+ unsigned int N = (unsigned long) arg;
+
+ daemonize("Down%u", N);
+
+ while (atomic_read(&do_stuff)) {
+ do_down_write(N);
+#ifdef LOAD_TEST
+ if (load)
+ udelay(load);
+#endif
+ do_downgrade_write(N);
+#ifdef LOAD_TEST
+ if (load)
+ udelay(load);
+#endif
+ do_up_read(N);
+ sched();
+ }
+
+ if (verbose >= 2)
+ printk("%s: done\n", current->comm);
+ complete_and_exit(&dg_comp[N], 0);
+}
+
+static void stop_test(unsigned long dummy)
+{
+ atomic_set(&do_stuff, 0);
+}
+
+static unsigned int total(const char *what, unsigned int counts[], int num)
+{
+ unsigned int tot = 0, max = 0, min = UINT_MAX, zeros = 0, cnt;
+ int loop;
+
+ for (loop = 0; loop < num; loop++) {
+ cnt = counts[loop];
+
+ if (cnt == 0) {
+ zeros++;
+ min = 0;
+ continue;
+ }
+
+ tot += cnt;
+ if (cnt > max)
+ max = cnt;
+ if (cnt < min)
+ min = cnt;
+ }
+
+ if (verbose && tot > 0) {
+ printk("%s:", what);
+
+ for (loop = 0; loop < num; loop++) {
+ cnt = counts[loop];
+
+ if (cnt == 0)
+ printk(" zzz");
+ else
+ printk(" %u%%", cnt * 100 / tot);
+ }
+
+ printk("\n");
+ }
+
+ return tot;
+}
+
+/*****************************************************************************/
+/*
+ *
+ */
+static int __init do_tests(void)
+{
+ unsigned long loop;
+ unsigned int mutex_total, sem_total, rd_total, wr_total, dg_total;
+
+ if (nummx < 0 || nummx > 20 ||
+ numsm < 0 || numsm > 20 ||
+ numrd < 0 || numrd > 20 ||
+ numwr < 0 || numwr > 20 ||
+ numdg < 0 || numdg > 20 ||
+ seminit < 1 ||
+ elapse < 1
+ ) {
+ printk("Parameter out of range\n");
+ return -ERANGE;
+ }
+
+ if ((nummx | numsm | numrd | numwr | numdg) == 0) {
+ printk("Nothing to do\n");
+ return -EINVAL;
+ }
+
+ if (verbose)
+ printk("\nStarting synchronisation primitive tests...\n");
+
+ mutex_init(&mutex);
+ sema_init(&sem, seminit);
+ init_rwsem(&rwsem);
+ atomic_set(&do_stuff, 1);
+
+ /* kick off all the children */
+ for (loop = 0; loop < 20; loop++) {
+ if (loop < nummx) {
+ init_completion(&mx_comp[loop]);
+ kernel_thread(mutexer, (void *) loop, 0);
+ }
+
+ if (loop < numsm) {
+ init_completion(&sm_comp[loop]);
+ kernel_thread(semaphorer, (void *) loop, 0);
+ }
+
+ if (loop < numrd) {
+ init_completion(&rd_comp[loop]);
+ kernel_thread(reader, (void *) loop, 0);
+ }
+
+ if (loop < numwr) {
+ init_completion(&wr_comp[loop]);
+ kernel_thread(writer, (void *) loop, 0);
+ }
+
+ if (loop < numdg) {
+ init_completion(&dg_comp[loop]);
+ kernel_thread(downgrader, (void *) loop, 0);
+ }
+ }
+
+ /* set a stop timer */
+ init_timer(&timer);
+ timer.function = stop_test;
+ timer.expires = jiffies + elapse * HZ;
+ add_timer(&timer);
+
+ /* now wait until it's all done */
+ for (loop = 0; loop < nummx; loop++)
+ wait_for_completion(&mx_comp[loop]);
+
+ for (loop = 0; loop < numsm; loop++)
+ wait_for_completion(&sm_comp[loop]);
+
+ for (loop = 0; loop < numrd; loop++)
+ wait_for_completion(&rd_comp[loop]);
+
+ for (loop = 0; loop < numwr; loop++)
+ wait_for_completion(&wr_comp[loop]);
+
+ for (loop = 0; loop < numdg; loop++)
+ wait_for_completion(&dg_comp[loop]);
+
+ atomic_set(&do_stuff, 0);
+ del_timer(&timer);
+
+ if (mutex_is_locked(&mutex))
+ printk(KERN_ERR "Mutex is still locked!\n");
+
+ /* count up */
+ mutex_total = total("MTX", mutexes_taken, nummx);
+ sem_total = total("SEM", semaphores_taken, numsm);
+ rd_total = total("RD ", reads_taken, numrd);
+ wr_total = total("WR ", writes_taken, numwr);
+ dg_total = total("DG ", downgrades_taken, numdg);
+
+ /* print the results */
+ if (verbose) {
+ printk("mutexes taken: %u\n", mutex_total);
+ printk("semaphores taken: %u\n", sem_total);
+ printk("reads taken: %u\n", rd_total);
+ printk("writes taken: %u\n", wr_total);
+ printk("downgrades taken: %u\n", dg_total);
+ }
+ else {
+ printk("%3d %3d %3d %3d %3d %c %3d %9u %9u %9u %9u %9u\n",
+ nummx, numsm, numrd, numwr, numdg,
+ do_sched ? 's' : '-',
+ load,
+ mutex_total,
+ sem_total,
+ rd_total,
+ wr_total,
+ dg_total);
+ }
+
+ /* tell insmod to discard the module */
+ if (verbose)
+ printk("Tests complete\n");
+ return -ENOANO;
+
+} /* end do_tests() */
+
+module_init(do_tests);
diff -uNrp linux-2.6.15-rc7-mutex/lib/Kconfig.debug linux-2.6.15-rc7-mutex-slowopt/lib/Kconfig.debug
--- linux-2.6.15-rc7-mutex/lib/Kconfig.debug 2006-01-03 14:40:06.000000000 +0000
+++ linux-2.6.15-rc7-mutex-slowopt/lib/Kconfig.debug 2006-01-03 15:48:23.000000000 +0000
@@ -207,3 +207,17 @@ config RCU_TORTURE_TEST
at boot time (you probably don't).
Say M if you want the RCU torture tests to build as a module.
Say N if you are unsure.
+
+config DEBUG_SYNCHRO_TEST
+ tristate "Synchronisation primitive testing module"
+ depends on DEBUG_KERNEL
+ default n
+ help
+ This option provides a kernel module that can thrash the sleepable
+ synchronisation primitives (mutexes and semaphores).
+
+ You should say N or M here. Whilst the module can be built in, it's
+ not recommended as module parameters must be supplied to get it
+ to do anything.
+
+ See Documentation/synchro-test.txt.
* Re: [patch 00/19] mutex subsystem, -V11
2006-01-03 15:55 ` David Howells
@ 2006-01-03 15:58 ` Ingo Molnar
2006-01-03 16:16 ` [PATCH] Add synchronisation primitive testing module David Howells
0 siblings, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2006-01-03 15:58 UTC (permalink / raw)
To: David Howells
Cc: lkml, Linus Torvalds, Andrew Morton, Arjan van de Ven,
Nicolas Pitre, Jes Sorensen, Al Viro, Oleg Nesterov, Alan Cox,
Christoph Hellwig, Andi Kleen, Russell King
* David Howells <dhowells@redhat.com> wrote:
> The attached patch adds a module for testing and benchmarking mutexes,
> semaphores and R/W semaphores.
thanks!
> (*) load=N
>
> Each thread delays for N uS whilst holding the lock. The default is 0.
could you possibly also add an option that delays a thread while _not_
holding the lock? The time spent not holding the lock is an important
parameter of mutex workloads too.
Ingo
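To see why the time spent outside the lock matters, a back-of-envelope
calculation helps. The sketch below is an illustration only: the helper
names are hypothetical, and it deliberately ignores the cost of the lock
operations themselves, scheduling and contention, so it gives an upper
bound for a single thread rather than a prediction.

```c
#include <stdint.h>

/*
 * Upper bound on the lock/unlock iterations a single thread can complete
 * in elapse_s seconds when it holds the lock for load_us microseconds and
 * then stays off the lock for interval_us microseconds per iteration.
 */
static uint64_t sketch_max_iterations(unsigned int elapse_s,
				      unsigned int load_us,
				      unsigned int interval_us)
{
	uint64_t per_iter_us = (uint64_t)load_us + interval_us;

	if (per_iter_us == 0)
		return UINT64_MAX;	/* limited only by lock overhead */

	return (uint64_t)elapse_s * 1000000u / per_iter_us;
}

/* Percentage of each iteration during which the thread holds the lock. */
static unsigned int sketch_hold_percent(unsigned int load_us,
					unsigned int interval_us)
{
	uint64_t per_iter_us = (uint64_t)load_us + interval_us;

	if (per_iter_us == 0)
		return 0;

	return (unsigned int)((uint64_t)load_us * 100 / per_iter_us);
}
```

With a 5 second run, a 2 uS hold time and an 8 uS gap, one thread can
complete at most 500000 iterations and holds the lock for 20% of the
time, so several such threads can interleave without waiting on each
other much. With no gap at all, every additional thread contends on
every iteration.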
* [PATCH] Add synchronisation primitive testing module
2006-01-03 15:58 ` Ingo Molnar
@ 2006-01-03 16:16 ` David Howells
0 siblings, 0 replies; 8+ messages in thread
From: David Howells @ 2006-01-03 16:16 UTC (permalink / raw)
To: Ingo Molnar
Cc: David Howells, lkml, Linus Torvalds, Andrew Morton,
Arjan van de Ven, Nicolas Pitre, Jes Sorensen, Al Viro,
Oleg Nesterov, Alan Cox, Christoph Hellwig, Andi Kleen,
Russell King
The attached patch adds a module for testing and benchmarking mutexes,
semaphores and R/W semaphores.
Using it is simple:
insmod synchro-test.ko <args>
It will exit with error ENOANO after running the tests and printing the
results to the kernel console log.
The available arguments are:
(*) mx=N
Start up to N mutex thrashing threads, where N is at most 20. All will
try and thrash the same mutex.
(*) sm=N
Start up to N counting semaphore thrashing threads, where N is at most
20. All will try and thrash the same semaphore.
(*) ism=M
Initialise the counting semaphore with M, where M is any positive
integer. The default is 4.
(*) rd=N
(*) wr=O
(*) dg=P
Start up to N reader thrashing threads, O writer thrashing threads and
P downgrader thrashing threads, where N, O and P are at most 20
apiece. All will try and thrash the same read/write semaphore.
(*) elapse=N
Run the tests for N seconds. The default is 5.
(*) load=N
Each thread delays for N uS whilst holding the lock. The default is 0.
(*) interval=N
Each thread delays for N uS whilst not holding the lock. The default
is 0.
(*) do_sched=1
Each thread will call schedule if required after each iteration.
(*) v=1
Print more verbose information, including a thread iteration
distribution list.
The module is built by setting CONFIG_DEBUG_SYNCHRO_TEST to "m".
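The thread iteration distribution list printed with v=1 boils down to
summing the per-thread counters and reporting each thread's share. A
standalone userspace sketch of that kind of summary is given below. It is
not the module's code: the struct and function names are hypothetical,
and the share is computed in a wider type on the assumption that
per-thread counts may be large enough for "count * 100" to wrap a 32-bit
unsigned int.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical summary of one set of per-thread iteration counters. */
struct sketch_dist {
	unsigned int total;	/* sum over all threads */
	unsigned int min;	/* smallest per-thread count */
	unsigned int max;	/* largest per-thread count */
	unsigned int zeros;	/* threads that never got the lock */
};

static struct sketch_dist sketch_summarise(const unsigned int counts[],
					   size_t num)
{
	struct sketch_dist d = { 0, UINT_MAX, 0, 0 };
	size_t i;

	for (i = 0; i < num; i++) {
		unsigned int cnt = counts[i];

		if (cnt == 0)
			d.zeros++;
		d.total += cnt;
		if (cnt > d.max)
			d.max = cnt;
		if (cnt < d.min)
			d.min = cnt;
	}

	if (num == 0)
		d.min = 0;	/* nothing was counted */
	return d;
}

/* One thread's share of the total, as a whole percentage. */
static unsigned int sketch_share_percent(unsigned int cnt, unsigned int total)
{
	if (total == 0)
		return 0;
	return (unsigned int)((unsigned long long)cnt * 100 / total);
}
```

A large spread between min and max, or a non-zero count of starved
threads, points at unfairness in the primitive under test.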
Signed-Off-By: David Howells <dhowells@redhat.com>
---
warthog>diffstat -p1 mutex-debug-module-2615rc7-2.diff
Documentation/synchro-test.txt | 59 ++++
kernel/Makefile | 1
kernel/synchro-test.c | 503 +++++++++++++++++++++++++++++++++++++++++
lib/Kconfig.debug | 14 +
4 files changed, 577 insertions(+)
diff -uNrp linux-2.6.15-rc7-mutex/Documentation/synchro-test.txt linux-2.6.15-rc7-mutex-slowopt/Documentation/synchro-test.txt
--- linux-2.6.15-rc7-mutex/Documentation/synchro-test.txt 1970-01-01 01:00:00.000000000 +0100
+++ linux-2.6.15-rc7-mutex-slowopt/Documentation/synchro-test.txt 2006-01-03 16:13:30.000000000 +0000
@@ -0,0 +1,59 @@
+The synchro-test.ko module can be used for testing and benchmarking mutexes,
+semaphores and R/W semaphores.
+
+The module is compiled by setting CONFIG_DEBUG_SYNCHRO_TEST to "m" when
+configuring the kernel.
+
+Using it is simple:
+
+ insmod synchro-test.ko <args>
+
+It will exit with error ENOANO after running the tests and printing the
+results to the kernel console log.
+
+The available arguments are:
+
+ (*) mx=N
+
+ Start up to N mutex thrashing threads, where N is at most 20. All will
+ try and thrash the same mutex.
+
+ (*) sm=N
+
+ Start up to N counting semaphore thrashing threads, where N is at most
+ 20. All will try and thrash the same semaphore.
+
+ (*) ism=M
+
+ Initialise the counting semaphore with M, where M is any positive
+ integer. The default is 4.
+
+ (*) rd=N
+ (*) wr=O
+ (*) dg=P
+
+ Start up to N reader thrashing threads, O writer thrashing threads and
+ P downgrader thrashing threads, where N, O and P are at most 20
+ apiece. All will try and thrash the same read/write semaphore.
+
+ (*) elapse=N
+
+ Run the tests for N seconds. The default is 5.
+
+ (*) load=N
+
+ Each thread delays for N uS whilst holding the lock. The default is 0.
+
+ (*) interval=N
+
+ Each thread delays for N uS whilst not holding the lock. The default
+ is 0.
+
+ (*) do_sched=1
+
+ Each thread will call schedule if required after each iteration.
+
+ (*) v=1
+
+ Print more verbose information, including a thread iteration
+ distribution list.
diff -uNrp linux-2.6.15-rc7-mutex/kernel/Makefile linux-2.6.15-rc7-mutex-slowopt/kernel/Makefile
--- linux-2.6.15-rc7-mutex/kernel/Makefile 2006-01-03 14:40:06.000000000 +0000
+++ linux-2.6.15-rc7-mutex-slowopt/kernel/Makefile 2006-01-03 15:43:51.000000000 +0000
@@ -33,6 +33,7 @@ obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
obj-$(CONFIG_SECCOMP) += seccomp.o
obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o
+obj-$(CONFIG_DEBUG_SYNCHRO_TEST) += synchro-test.o
ifneq ($(CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER),y)
# According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
diff -uNrp linux-2.6.15-rc7-mutex/kernel/synchro-test.c linux-2.6.15-rc7-mutex-slowopt/kernel/synchro-test.c
--- linux-2.6.15-rc7-mutex/kernel/synchro-test.c 1970-01-01 01:00:00.000000000 +0100
+++ linux-2.6.15-rc7-mutex-slowopt/kernel/synchro-test.c 2006-01-03 16:12:46.000000000 +0000
@@ -0,0 +1,503 @@
+/* synchro-test.c: run some threads to test the synchronisation primitives
+ *
+ * Copyright (C) 2005 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * The module should be run as something like:
+ *
+ * insmod synchro-test.ko rd=2 wr=2
+ * insmod synchro-test.ko mx=1
+ * insmod synchro-test.ko sm=2 ism=1
+ * insmod synchro-test.ko sm=2 ism=2
+ *
+ * See Documentation/synchro-test.txt for more information.
+ */
+
+#include <linux/config.h>
+#include <linux/module.h>
+#include <linux/poll.h>
+#include <linux/moduleparam.h>
+#include <linux/stat.h>
+#include <linux/init.h>
+#include <asm/atomic.h>
+#include <linux/personality.h>
+#include <linux/smp_lock.h>
+#include <linux/delay.h>
+#include <linux/timer.h>
+#include <linux/completion.h>
+#include <linux/mutex.h>
+
+#define VALIDATE_OPERATORS 0
+
+static int nummx = 0;
+static int numsm = 0, seminit = 4;
+static int numrd = 0, numwr = 0, numdg = 0;
+static int elapse = 5, load = 0, do_sched = 0, interval = 0;
+static int verbose = 0;
+
+MODULE_AUTHOR("David Howells");
+MODULE_DESCRIPTION("Synchronisation primitive test demo");
+MODULE_LICENSE("GPL");
+
+module_param_named(v, verbose, int, 0);
+MODULE_PARM_DESC(v, "Verbosity");
+
+module_param_named(mx, nummx, int, 0);
+MODULE_PARM_DESC(mx, "Number of mutex threads");
+
+module_param_named(sm, numsm, int, 0);
+MODULE_PARM_DESC(sm, "Number of semaphore threads");
+
+module_param_named(ism, seminit, int, 0);
+MODULE_PARM_DESC(ism, "Initial semaphore value");
+
+module_param_named(rd, numrd, int, 0);
+MODULE_PARM_DESC(rd, "Number of reader threads");
+
+module_param_named(wr, numwr, int, 0);
+MODULE_PARM_DESC(wr, "Number of writer threads");
+
+module_param_named(dg, numdg, int, 0);
+MODULE_PARM_DESC(dg, "Number of downgrader threads");
+
+module_param(elapse, int, 0);
+MODULE_PARM_DESC(elapse, "Number of seconds to run for");
+
+module_param(load, int, 0);
+MODULE_PARM_DESC(load, "Length of load in uS");
+
+module_param(interval, int, 0);
+MODULE_PARM_DESC(interval, "Length of interval in uS before re-getting lock");
+
+module_param(do_sched, int, 0);
+MODULE_PARM_DESC(do_sched, "True if each thread should schedule regularly");
+
+/* the semaphores under test */
+static struct mutex ____cacheline_aligned mutex;
+static struct semaphore ____cacheline_aligned sem;
+static struct rw_semaphore ____cacheline_aligned rwsem;
+
+static atomic_t ____cacheline_aligned do_stuff = ATOMIC_INIT(0);
+
+#if VALIDATE_OPERATORS
+static atomic_t ____cacheline_aligned mutexes = ATOMIC_INIT(0);
+static atomic_t ____cacheline_aligned semaphores = ATOMIC_INIT(0);
+static atomic_t ____cacheline_aligned readers = ATOMIC_INIT(0);
+static atomic_t ____cacheline_aligned writers = ATOMIC_INIT(0);
+#endif
+
+static unsigned int ____cacheline_aligned mutexes_taken[20];
+static unsigned int ____cacheline_aligned semaphores_taken[20];
+static unsigned int ____cacheline_aligned reads_taken[20];
+static unsigned int ____cacheline_aligned writes_taken[20];
+static unsigned int ____cacheline_aligned downgrades_taken[20];
+
+static struct completion ____cacheline_aligned mx_comp[20];
+static struct completion ____cacheline_aligned sm_comp[20];
+static struct completion ____cacheline_aligned rd_comp[20];
+static struct completion ____cacheline_aligned wr_comp[20];
+static struct completion ____cacheline_aligned dg_comp[20];
+
+static struct timer_list ____cacheline_aligned timer;
+
+#define ACCOUNT(var, N) var##_taken[N]++;
+
+#if VALIDATE_OPERATORS
+#define TRACK(var, dir) atomic_##dir(&(var))
+
+#define CHECK(var, cond, val) \
+do { \
+ int x = atomic_read(&(var)); \
+ if (unlikely(!(x cond (val)))) \
+ printk("check [%s %s %d, == %d] failed in %s\n", \
+ #var, #cond, (val), x, __func__); \
+} while (0)
+
+#else
+#define TRACK(var, dir) do {} while(0)
+#define CHECK(var, cond, val) do {} while(0)
+#endif
+
+static inline void do_mutex_lock(unsigned int N)
+{
+ mutex_lock(&mutex);
+
+ ACCOUNT(mutexes, N);
+ TRACK(mutexes, inc);
+ CHECK(mutexes, ==, 1);
+}
+
+static inline void do_mutex_unlock(unsigned int N)
+{
+ CHECK(mutexes, ==, 1);
+ TRACK(mutexes, dec);
+
+ mutex_unlock(&mutex);
+}
+
+static inline void do_down(unsigned int N)
+{
+ CHECK(semaphores, <=, seminit);
+
+ down(&sem);
+
+ ACCOUNT(semaphores, N);
+ TRACK(semaphores, inc);
+}
+
+static inline void do_up(unsigned int N)
+{
+ CHECK(semaphores, >, 0);
+ TRACK(semaphores, dec);
+
+ up(&sem);
+}
+
+static inline void do_down_read(unsigned int N)
+{
+ down_read(&rwsem);
+
+ ACCOUNT(reads, N);
+ TRACK(readers, inc);
+ CHECK(readers, >, 0);
+ CHECK(writers, ==, 0);
+}
+
+static inline void do_up_read(unsigned int N)
+{
+ CHECK(readers, >, 0);
+ CHECK(writers, ==, 0);
+ TRACK(readers, dec);
+
+ up_read(&rwsem);
+}
+
+static inline void do_down_write(unsigned int N)
+{
+ down_write(&rwsem);
+
+ ACCOUNT(writes, N);
+ TRACK(writers, inc);
+ CHECK(writers, ==, 1);
+ CHECK(readers, ==, 0);
+}
+
+static inline void do_up_write(unsigned int N)
+{
+ CHECK(writers, ==, 1);
+ CHECK(readers, ==, 0);
+ TRACK(writers, dec);
+
+ up_write(&rwsem);
+}
+
+static inline void do_downgrade_write(unsigned int N)
+{
+ CHECK(writers, ==, 1);
+ CHECK(readers, ==, 0);
+ TRACK(writers, dec);
+ TRACK(readers, inc);
+
+ downgrade_write(&rwsem);
+
+ ACCOUNT(downgrades, N);
+}
+
+static inline void sched(void)
+{
+ if (do_sched)
+ schedule();
+}
+
+int mutexer(void *arg)
+{
+ unsigned int N = (unsigned long) arg;
+
+ daemonize("Mutex%u", N);
+
+ while (atomic_read(&do_stuff)) {
+ do_mutex_lock(N);
+ if (load)
+ udelay(load);
+ do_mutex_unlock(N);
+ sched();
+ if (interval)
+ udelay(interval);
+ }
+
+ if (verbose >= 2)
+ printk("%s: done\n", current->comm);
+ complete_and_exit(&mx_comp[N], 0);
+}
+
+int semaphorer(void *arg)
+{
+ unsigned int N = (unsigned long) arg;
+
+ daemonize("Sem%u", N);
+
+ while (atomic_read(&do_stuff)) {
+ do_down(N);
+ if (load)
+ udelay(load);
+ do_up(N);
+ sched();
+ if (interval)
+ udelay(interval);
+ }
+
+ if (verbose >= 2)
+ printk("%s: done\n", current->comm);
+ complete_and_exit(&sm_comp[N], 0);
+}
+
+int reader(void *arg)
+{
+ unsigned int N = (unsigned long) arg;
+
+ daemonize("Read%u", N);
+
+ while (atomic_read(&do_stuff)) {
+ do_down_read(N);
+#ifdef LOAD_TEST
+ if (load)
+ udelay(load);
+#endif
+ do_up_read(N);
+ sched();
+ if (interval)
+ udelay(interval);
+ }
+
+ if (verbose >= 2)
+ printk("%s: done\n", current->comm);
+ complete_and_exit(&rd_comp[N], 0);
+}
+
+int writer(void *arg)
+{
+ unsigned int N = (unsigned long) arg;
+
+ daemonize("Write%u", N);
+
+ while (atomic_read(&do_stuff)) {
+ do_down_write(N);
+#ifdef LOAD_TEST
+ if (load)
+ udelay(load);
+#endif
+ do_up_write(N);
+ sched();
+ if (interval)
+ udelay(interval);
+ }
+
+ if (verbose >= 2)
+ printk("%s: done\n", current->comm);
+ complete_and_exit(&wr_comp[N], 0);
+}
+
+int downgrader(void *arg)
+{
+ unsigned int N = (unsigned long) arg;
+
+ daemonize("Down%u", N);
+
+ while (atomic_read(&do_stuff)) {
+ do_down_write(N);
+#ifdef LOAD_TEST
+ if (load)
+ udelay(load);
+#endif
+ do_downgrade_write(N);
+#ifdef LOAD_TEST
+ if (load)
+ udelay(load);
+#endif
+ do_up_read(N);
+ sched();
+ if (interval)
+ udelay(interval);
+ }
+
+ if (verbose >= 2)
+ printk("%s: done\n", current->comm);
+ complete_and_exit(&dg_comp[N], 0);
+}
+
+static void stop_test(unsigned long dummy)
+{
+ atomic_set(&do_stuff, 0);
+}
+
+static unsigned int total(const char *what, unsigned int counts[], int num)
+{
+ unsigned int tot = 0, max = 0, min = UINT_MAX, zeros = 0, cnt;
+ int loop;
+
+ for (loop = 0; loop < num; loop++) {
+ cnt = counts[loop];
+
+ if (cnt == 0) {
+ zeros++;
+ min = 0;
+ continue;
+ }
+
+ tot += cnt;
+ if (cnt > max)
+ max = cnt;
+ if (cnt < min)
+ min = cnt;
+ }
+
+ if (verbose && tot > 0) {
+ printk("%s:", what);
+
+ for (loop = 0; loop < num; loop++) {
+ cnt = counts[loop];
+
+ if (cnt == 0)
+ printk(" zzz");
+ else
+ printk(" %u%%", cnt * 100 / tot);
+ }
+
+ printk("\n");
+ }
+
+ return tot;
+}
+
+/*****************************************************************************/
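+/*
+ * run the tests: check the parameters, start the threads, wait for them to
+ * finish and report how often each primitive was taken
+ */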
+static int __init do_tests(void)
+{
+ unsigned long loop;
+ unsigned int mutex_total, sem_total, rd_total, wr_total, dg_total;
+
+ if (nummx < 0 || nummx > 20 ||
+ numsm < 0 || numsm > 20 ||
+ numrd < 0 || numrd > 20 ||
+ numwr < 0 || numwr > 20 ||
+ numdg < 0 || numdg > 20 ||
+ seminit < 1 ||
+ elapse < 1 ||
+ load < 0 || load > 999 ||
+ interval < 0 || interval > 999
+ ) {
+ printk("Parameter out of range\n");
+ return -ERANGE;
+ }
+
+ if ((nummx | numsm | numrd | numwr | numdg) == 0) {
+ printk("Nothing to do\n");
+ return -EINVAL;
+ }
+
+ if (verbose)
+ printk("\nStarting synchronisation primitive tests...\n");
+
+ mutex_init(&mutex);
+ sema_init(&sem, seminit);
+ init_rwsem(&rwsem);
+ atomic_set(&do_stuff, 1);
+
+ /* kick off all the children */
+ for (loop = 0; loop < 20; loop++) {
+ if (loop < nummx) {
+ init_completion(&mx_comp[loop]);
+ kernel_thread(mutexer, (void *) loop, 0);
+ }
+
+ if (loop < numsm) {
+ init_completion(&sm_comp[loop]);
+ kernel_thread(semaphorer, (void *) loop, 0);
+ }
+
+ if (loop < numrd) {
+ init_completion(&rd_comp[loop]);
+ kernel_thread(reader, (void *) loop, 0);
+ }
+
+ if (loop < numwr) {
+ init_completion(&wr_comp[loop]);
+ kernel_thread(writer, (void *) loop, 0);
+ }
+
+ if (loop < numdg) {
+ init_completion(&dg_comp[loop]);
+ kernel_thread(downgrader, (void *) loop, 0);
+ }
+ }
+
+ /* set a stop timer */
+ init_timer(&timer);
+ timer.function = stop_test;
+ timer.expires = jiffies + elapse * HZ;
+ add_timer(&timer);
+
+ /* now wait until it's all done */
+ for (loop = 0; loop < nummx; loop++)
+ wait_for_completion(&mx_comp[loop]);
+
+ for (loop = 0; loop < numsm; loop++)
+ wait_for_completion(&sm_comp[loop]);
+
+ for (loop = 0; loop < numrd; loop++)
+ wait_for_completion(&rd_comp[loop]);
+
+ for (loop = 0; loop < numwr; loop++)
+ wait_for_completion(&wr_comp[loop]);
+
+ for (loop = 0; loop < numdg; loop++)
+ wait_for_completion(&dg_comp[loop]);
+
+ atomic_set(&do_stuff, 0);
+ del_timer(&timer);
+
+ if (mutex_is_locked(&mutex))
+ printk(KERN_ERR "Mutex is still locked!\n");
+
+ /* count up */
+ mutex_total = total("MTX", mutexes_taken, nummx);
+ sem_total = total("SEM", semaphores_taken, numsm);
+ rd_total = total("RD ", reads_taken, numrd);
+ wr_total = total("WR ", writes_taken, numwr);
+ dg_total = total("DG ", downgrades_taken, numdg);
+
+ /* print the results */
+ if (verbose) {
+ printk("mutexes taken: %u\n", mutex_total);
+ printk("semaphores taken: %u\n", sem_total);
+ printk("reads taken: %u\n", rd_total);
+ printk("writes taken: %u\n", wr_total);
+ printk("downgrades taken: %u\n", dg_total);
+ }
+ else {
+ printk("%3d %3d %3d %3d %3d %c %3d %9u %9u %9u %9u %9u\n",
+ nummx, numsm, numrd, numwr, numdg,
+ do_sched ? 's' : '-',
+ load,
+ mutex_total,
+ sem_total,
+ rd_total,
+ wr_total,
+ dg_total);
+ }
+
+ /* tell insmod to discard the module */
+ if (verbose)
+ printk("Tests complete\n");
+ return -ENOANO;
+
+} /* end do_tests() */
+
+module_init(do_tests);
diff -uNrp linux-2.6.15-rc7-mutex/lib/Kconfig.debug linux-2.6.15-rc7-mutex-slowopt/lib/Kconfig.debug
--- linux-2.6.15-rc7-mutex/lib/Kconfig.debug 2006-01-03 14:40:06.000000000 +0000
+++ linux-2.6.15-rc7-mutex-slowopt/lib/Kconfig.debug 2006-01-03 15:48:23.000000000 +0000
@@ -207,3 +207,17 @@ config RCU_TORTURE_TEST
at boot time (you probably don't).
Say M if you want the RCU torture tests to build as a module.
Say N if you are unsure.
+
+config DEBUG_SYNCHRO_TEST
+ tristate "Synchronisation primitive testing module"
+ depends on DEBUG_KERNEL
+ default n
+ help
+ This option provides a kernel module that can thrash the sleepable
+ synchronisation primitives (mutexes, semaphores and rw-semaphores).
+
+ You should say N or M here. Whilst the module can be built in, this
+ is not recommended, as it does nothing unless it is given module
+ parameters.
+
+ See Documentation/synchro-test.txt.
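
For anyone wanting to play with the reporting outside the kernel, the
per-thread accounting done by total() in the patch above can be sketched in
userspace roughly as follows. This is only an illustration: struct tally,
summarise() and share_percent() are made-up names that do not appear in the
patch, and min/max are taken over the individual per-thread counts.

```c
#include <limits.h>
#include <stddef.h>

/* Summary of per-thread acquisition counts (hypothetical helper type). */
struct tally {
	unsigned int tot;	/* sum of all per-thread counts */
	unsigned int min;	/* smallest per-thread count (0 if a thread starved) */
	unsigned int max;	/* largest per-thread count */
	unsigned int zeros;	/* number of threads that never got the lock */
};

/* Walk the per-thread counters once and collect the summary. */
static struct tally summarise(const unsigned int counts[], size_t num)
{
	struct tally t = { 0, UINT_MAX, 0, 0 };
	size_t i;

	for (i = 0; i < num; i++) {
		unsigned int cnt = counts[i];

		if (cnt == 0)
			t.zeros++;
		t.tot += cnt;
		if (cnt > t.max)
			t.max = cnt;
		if (cnt < t.min)
			t.min = cnt;
	}

	if (num == 0)
		t.min = 0;	/* no threads at all: report 0 rather than UINT_MAX */
	return t;
}

/* Share of the total taken by one thread, as a whole percentage. */
static unsigned int share_percent(unsigned int cnt, unsigned int tot)
{
	/* widen before multiplying so large counts cannot overflow */
	return tot ? (unsigned int)((unsigned long long)cnt * 100 / tot) : 0;
}
```

A large spread between min and max, or a non-zero zeros field, would suggest
that some threads are being starved of the primitive under test.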
* Re: [patch 00/19] mutex subsystem, -V11
2006-01-03 15:14 ` Arjan van de Ven
@ 2006-01-05 13:44 ` David Howells
2006-01-05 16:48 ` Linus Torvalds
0 siblings, 1 reply; 8+ messages in thread
From: David Howells @ 2006-01-05 13:44 UTC (permalink / raw)
To: Arjan van de Ven
Cc: David Howells, Ingo Molnar, lkml, Linus Torvalds, Andrew Morton,
Nicolas Pitre, Jes Sorensen, Al Viro, Oleg Nesterov, Alan Cox,
Christoph Hellwig, Andi Kleen, Russell King
Arjan van de Ven <arjan@infradead.org> wrote:
> > Can you arrange .text.lock.mutex+0 here to just jump directly to
> > __mutex_lock_noinline? Otherwise we have an unnecessarily extended return
> > path.
>
> jmp is free on x86. eg zero cycles. Any trickery is more likely to cost
> because of doing unexpected things.
I'm talking about replacing:

	<caller>:           call
	mutex_lock:         lock decl
	mutex_lock:         js
	.text.lock.mutex:   call
	__mutex_lock:       ret
	.text.lock.mutex:   jmp
	mutex_lock:         ret
	<caller>:           ...

With:

	<caller>:           call
	mutex_lock:         lock decl
	mutex_lock:         js
	.text.lock.mutex:   jmp
	__mutex_lock:       ret
	<caller>:           ...

Or:

	<caller>:           call
	mutex_lock:         lock decl
	mutex_lock:         js
	__mutex_lock:       ret
	<caller>:           ...

This sort of thing is done by the compiler when it does tail-calling.
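
As a rough C-level illustration of that shape (not the actual mutex code), the
fast path below ends by calling the slow path as the very last thing it does,
which is what lets a compiler turn the CALL/RET pair into a plain JMP.
toy_lock(), toy_lock_slowpath() and slowpath_calls are made-up names, the
noinline attribute is a GCC/Clang extension, and whether a JMP is really
emitted depends on the compiler, the optimisation level and the target.

```c
#include <stdatomic.h>

/* Counts entries into the contended path, for demonstration only. */
static int slowpath_calls;

/* Hypothetical stand-in for the contended path; a real one would sleep. */
__attribute__((noinline))
static void toy_lock_slowpath(atomic_int *count)
{
	(void)count;
	slowpath_calls++;
}

/* Counter convention: 1 = unlocked, 0 = locked, negative = contended. */
static void toy_lock(atomic_int *count)
{
	/* atomic_fetch_sub() returns the old value: 1 means we got the lock */
	if (atomic_fetch_sub(count, 1) > 0)
		return;

	/* nothing happens after this call, so it is in tail position */
	toy_lock_slowpath(count);
}
```

Since the slow path returns straight to toy_lock()'s caller in the tail-called
form, the caller still sees exactly one CALL matched by exactly one RET.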
> > You may not want to make the JS go directly there, but you could have that
> > go to a JMP to __mutex_lock_noinline rather than having a CALL followed by
> > a JMP back to a return instruction.
>
> unbalanced call/ret pairs are REALLY expensive on x86. The current x86
> processors all do branch prediction on the ret based on a special
> internal call stack, breaking the symmetry is thus a branch prediction
> miss, eg 40+ cycles
In what way would this be unbalanced? You end up with one fewer CALL and one
fewer RET instruction. And why would the CPU think this is any different from
a function with a conditional branch in it? Eg:

	<caller>:           call
	mutex_lock:         lock decl
	mutex_lock:         js
	mutex_lock:         ret
	<caller>:           ...

The only route to __mutex_lock would be through mutex_lock...
Are there docs on this feature of the x86 anywhere?
David
* Re: [patch 00/19] mutex subsystem, -V11
2006-01-05 13:44 ` David Howells
@ 2006-01-05 16:48 ` Linus Torvalds
0 siblings, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2006-01-05 16:48 UTC (permalink / raw)
To: David Howells
Cc: Arjan van de Ven, Ingo Molnar, lkml, Andrew Morton, Nicolas Pitre,
Jes Sorensen, Al Viro, Oleg Nesterov, Alan Cox, Christoph Hellwig,
Andi Kleen, Russell King
On Thu, 5 Jan 2006, David Howells wrote:
>
> This sort of thing is done by the compiler when it does tail-calling.
Yes. And it's nice even when unconditional branches are effectively free,
because it can avoid an unnecessary cache miss just to fetch the
unnecessary branch (which very much _can_ happen, since the failure
function will sleep).
Of course, the thing to look out for is to never get the call-return stack
messed up, but this kind of regular tail-call doesn't have that issue.
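
To make the call-return stack point concrete, here is a toy model, assuming
only that a predictor pushes a return address on every CALL, pops one on every
RET and ignores jumps. It is a sketch for counting mismatches, not a
description of any real CPU; the enum, struct, function and trace names are
all invented for this example, and the "addresses" are arbitrary small
integers.

```c
#include <stddef.h>

enum op { OP_CALL, OP_JMP, OP_RET };

struct insn {
	enum op op;
	int ret_to;	/* CALL: address pushed; RET: address actually returned to */
};

/* Count RETs whose real target differs from the top of the modelled stack. */
static int ret_mispredicts(const struct insn *trace, size_t n)
{
	int stack[16];
	size_t depth = 0, i;
	int misses = 0;

	for (i = 0; i < n; i++) {
		switch (trace[i].op) {
		case OP_CALL:
			if (depth < 16)
				stack[depth++] = trace[i].ret_to;
			break;
		case OP_JMP:
			break;	/* jumps never touch the return stack */
		case OP_RET:
			if (depth == 0 || stack[--depth] != trace[i].ret_to)
				misses++;
			break;
		}
	}
	return misses;
}

/* call; js; call slowpath; ret; jmp back; ret */
static const struct insn call_then_jmp_back[] = {
	{ OP_CALL, 1 }, { OP_JMP, 0 }, { OP_CALL, 2 },
	{ OP_RET, 2 }, { OP_JMP, 0 }, { OP_RET, 1 },
};

/* call; js; jmp slowpath; ret straight to the caller */
static const struct insn tail_call[] = {
	{ OP_CALL, 1 }, { OP_JMP, 0 }, { OP_JMP, 0 }, { OP_RET, 1 },
};

/* two calls but one ret that skips a frame: the case to avoid */
static const struct insn skipped_frame[] = {
	{ OP_CALL, 1 }, { OP_CALL, 2 }, { OP_RET, 1 },
};
```

In this model both the CALL-then-JMP-back sequence and the tail-call sequence
give zero mismatches, while the trace that returns past a frame gives one.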
Linus