public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] lock contention tracking -v2
@ 2007-05-23  9:57 Peter Zijlstra
  2007-05-23  9:57 ` [PATCH 1/7] fix raw_spinlock_t vs lockdep Peter Zijlstra
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: Peter Zijlstra @ 2007-05-23  9:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Bill Huey, Jason Baron, Steven Rostedt,
	Christoph Hellwig, Peter Zijlstra

Use the lockdep infrastructure to track lock contention and other lock
statistics.

It tracks lock contention events and records the first four unique call-sites
that encountered contention.

It also measures lock wait-time and hold-time in nanoseconds. The minimum and
maximum times are tracked, as well as a total (which, together with the number
of events, yields the average).

All statistics are done per lock class, per write (exclusive state) and per read
(shared state). 

The statistics are collected per-cpu, so that collection overhead is
minimized by avoiding global cache misses.

This new lock statistics feature is independent of the lock dependency checking
traditionally done by lockdep; it only shares the lock tracking code. It is
also possible to enable both and disable either component at runtime - thereby
avoiding, for instance, the O(n^2) lock chain walks.

TODO:
 - create new output format

-- 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/7] fix raw_spinlock_t vs lockdep
  2007-05-23  9:57 [PATCH 0/7] lock contention tracking -v2 Peter Zijlstra
@ 2007-05-23  9:57 ` Peter Zijlstra
  2007-05-23  9:57 ` [PATCH 2/7] lockdep: isolate lock graph walking Peter Zijlstra
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2007-05-23  9:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Bill Huey, Jason Baron, Steven Rostedt,
	Christoph Hellwig, Peter Zijlstra

[-- Attachment #1: raw_spinlock_fix.patch --]
[-- Type: text/plain, Size: 1826 bytes --]

raw_spinlock_t should not use lockdep (and after this patch doesn't), since
lockdep itself relies on it.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/spinlock_types.h    |    4 ++--
 include/linux/spinlock_types_up.h |    9 +--------
 2 files changed, 3 insertions(+), 10 deletions(-)

Index: linux-2.6/include/linux/spinlock_types_up.h
===================================================================
--- linux-2.6.orig/include/linux/spinlock_types_up.h	2007-02-20 13:22:26.000000000 +0100
+++ linux-2.6/include/linux/spinlock_types_up.h	2007-05-21 14:42:55.000000000 +0200
@@ -12,14 +12,10 @@
  * Released under the General Public License (GPL).
  */
 
-#if defined(CONFIG_DEBUG_SPINLOCK) || \
-	defined(CONFIG_DEBUG_LOCK_ALLOC)
+#ifdef CONFIG_DEBUG_SPINLOCK
 
 typedef struct {
 	volatile unsigned int slock;
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	struct lockdep_map dep_map;
-#endif
 } raw_spinlock_t;
 
 #define __RAW_SPIN_LOCK_UNLOCKED { 1 }
@@ -34,9 +30,6 @@ typedef struct { } raw_spinlock_t;
 
 typedef struct {
 	/* no debug version on UP */
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	struct lockdep_map dep_map;
-#endif
 } raw_rwlock_t;
 
 #define __RAW_RW_LOCK_UNLOCKED { }
Index: linux-2.6/include/linux/spinlock_types.h
===================================================================
--- linux-2.6.orig/include/linux/spinlock_types.h	2007-05-21 14:43:19.000000000 +0200
+++ linux-2.6/include/linux/spinlock_types.h	2007-05-21 14:43:31.000000000 +0200
@@ -9,14 +9,14 @@
  * Released under the General Public License (GPL).
  */
 
-#include <linux/lockdep.h>
-
 #if defined(CONFIG_SMP)
 # include <asm/spinlock_types.h>
 #else
 # include <linux/spinlock_types_up.h>
 #endif
 
+#include <linux/lockdep.h>
+
 typedef struct {
 	raw_spinlock_t raw_lock;
 #if defined(CONFIG_PREEMPT) && defined(CONFIG_SMP)

-- 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 2/7] lockdep: isolate lock graph walking
  2007-05-23  9:57 [PATCH 0/7] lock contention tracking -v2 Peter Zijlstra
  2007-05-23  9:57 ` [PATCH 1/7] fix raw_spinlock_t vs lockdep Peter Zijlstra
@ 2007-05-23  9:57 ` Peter Zijlstra
  2007-05-23  9:57 ` [PATCH 3/7] lockdep: lock contention tracking Peter Zijlstra
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2007-05-23  9:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Bill Huey, Jason Baron, Steven Rostedt,
	Christoph Hellwig, Peter Zijlstra

[-- Attachment #1: lockstat-prove-locking.patch --]
[-- Type: text/plain, Size: 3088 bytes --]

Put the lock dependency tracking bits under CONFIG_PROVE_LOCKING.
This way we can reuse the held_lock part of lockdep for other purposes.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/lockdep.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

Index: linux-2.6/kernel/lockdep.c
===================================================================
--- linux-2.6.orig/kernel/lockdep.c	2007-05-23 00:22:59.000000000 +0200
+++ linux-2.6/kernel/lockdep.c	2007-05-23 00:42:43.000000000 +0200
@@ -95,6 +95,7 @@ static int lockdep_initialized;
 unsigned long nr_list_entries;
 static struct lock_list list_entries[MAX_LOCKDEP_ENTRIES];
 
+#ifdef CONFIG_PROVE_LOCKING
 /*
  * Allocate a lockdep entry. (assumes the graph_lock held, returns
  * with NULL on failure)
@@ -111,6 +112,7 @@ static struct lock_list *alloc_list_entr
 	}
 	return list_entries + nr_list_entries++;
 }
+#endif
 
 /*
  * All data structures here are protected by the global debug_lock.
@@ -140,7 +142,9 @@ LIST_HEAD(all_lock_classes);
 static struct list_head classhash_table[CLASSHASH_SIZE];
 
 unsigned long nr_lock_chains;
+#ifdef CONFIG_PROVE_LOCKING
 static struct lock_chain lock_chains[MAX_LOCKDEP_CHAINS];
+#endif
 
 /*
  * We put the lock dependency chains into a hash-table as well, to cache
@@ -482,6 +486,7 @@ static void print_lock_dependencies(stru
 	}
 }
 
+#ifdef CONFIG_PROVE_LOCKING
 /*
  * Add a new dependency to the head of the list:
  */
@@ -541,6 +546,7 @@ print_circular_bug_entry(struct lock_lis
 
 	return 0;
 }
+#endif
 
 static void print_kernel_version(void)
 {
@@ -549,6 +555,7 @@ static void print_kernel_version(void)
 		init_utsname()->version);
 }
 
+#ifdef CONFIG_PROVE_LOCKING
 /*
  * When a circular dependency is detected, print the
  * header first:
@@ -639,6 +646,7 @@ check_noncircular(struct lock_class *sou
 	}
 	return 1;
 }
+#endif
 
 static int very_verbose(struct lock_class *class)
 {
@@ -823,6 +831,7 @@ check_usage(struct task_struct *curr, st
 
 #endif
 
+#ifdef CONFIG_PROVE_LOCKING
 static int
 print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
 		   struct held_lock *next)
@@ -1087,7 +1096,7 @@ out_bug:
 
 	return 0;
 }
-
+#endif
 
 /*
  * Is this the address of a static object:
@@ -1307,6 +1316,7 @@ out_unlock_set:
 	return class;
 }
 
+#ifdef CONFIG_PROVE_LOCKING
 /*
  * Look up a dependency chain. If the key is not present yet then
  * add it and return 1 - in this case the new dependency chain is
@@ -1381,6 +1391,7 @@ cache_hit:
 
 	return 1;
 }
+#endif
 
 /*
  * We are building curr_chain_key incrementally, so double-check
@@ -2168,6 +2179,7 @@ out_calc_hash:
 	chain_key = iterate_chain_key(chain_key, id);
 	curr->curr_chain_key = chain_key;
 
+#ifdef CONFIG_PROVE_LOCKING
 	/*
 	 * Trylock needs to maintain the stack of held locks, but it
 	 * does not add new dependencies, because trylock can be done
@@ -2214,6 +2226,7 @@ out_calc_hash:
 		/* after lookup_chain_cache(): */
 		if (unlikely(!debug_locks))
 			return 0;
+#endif
 
 	curr->lockdep_depth++;
 	check_chain_key(curr);

-- 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 3/7] lockdep: lock contention tracking
  2007-05-23  9:57 [PATCH 0/7] lock contention tracking -v2 Peter Zijlstra
  2007-05-23  9:57 ` [PATCH 1/7] fix raw_spinlock_t vs lockdep Peter Zijlstra
  2007-05-23  9:57 ` [PATCH 2/7] lockdep: isolate lock graph walking Peter Zijlstra
@ 2007-05-23  9:57 ` Peter Zijlstra
  2007-05-23 14:40   ` Jason Baron
  2007-05-23  9:57 ` [PATCH 4/7] lockdep: add waittime to the lock statistics Peter Zijlstra
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2007-05-23  9:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Bill Huey, Jason Baron, Steven Rostedt,
	Christoph Hellwig, Peter Zijlstra

[-- Attachment #1: lockstat.patch --]
[-- Type: text/plain, Size: 17262 bytes --]

Count lock contention events per lock class. Additionally, track the first four
call-sites that resulted in the contention.

/proc/lock_contentions - gives a single line per lock class:

  <class name> <(write) contentions> <read contentions>

This is followed by another three columns per contention point:

  <contentions from this point> <IP> <symbol name>

Writing a '0' to the file clears the stats.

example output:

[root@taijtu ~]# cat /proc/lock_contentions | sort -rnk 2 | head
dcache_lock: 3000 0 [618] [<ffffffff8033badd>] _atomic_dec_and_lock+0x39/0x58 [17] [<ffffffff802d6dce>] sysfs_open_file+0x28/0x25a [160] [<ffffffff802a5b61>] d_instantiate+0x2a/0xad [52] [<ffffffff8029d926>] __link_path_walk+0x270/0xde9
&inode->i_mutex: 235 0 [15] [<ffffffff8029b053>] pipe_read+0x74/0x39d [23] [<ffffffff8029a9f8>] pipe_wait+0x86/0x8d [105] [<ffffffff8029bc9a>] do_lookup+0x81/0x1ae [85] [<ffffffff802a0be7>] vfs_readdir+0x5c/0xa9
&zone->lru_lock: 141 0 [11] [<ffffffff8027366b>] activate_page+0x37/0xd6 [75] [<ffffffff80273327>] release_pages+0xb2/0x16c [42] [<ffffffff80273443>] __pagevec_lru_add_active+0x62/0xe5 [12] [<ffffffff80273529>] __pagevec_lru_add+0x63/0xe2
&rq->rq_lock_key: 93 0 [66] [<ffffffff8022b202>] task_rq_lock+0x42/0x74 [24] [<ffffffff80461f2a>] __sched_text_start+0x17a/0x869 [2] [<ffffffff8022a58e>] double_rq_lock+0x38/0x3d [1] [<ffffffff8022b1af>] double_lock_balance+0x41/0x52
&rq->rq_lock_key: 59 0 [15] [<ffffffff80461f2a>] __sched_text_start+0x17a/0x869 [27] [<ffffffff8022b202>] task_rq_lock+0x42/0x74 [17] [<ffffffff80462090>] __sched_text_start+0x2e0/0x869
files_lock: 57 0 [28] [<ffffffff8029581d>] file_move+0x1d/0x51 [29] [<ffffffff802957dc>] file_kill+0x15/0x39
&inode->i_data.i_mmap_lock: 52 0 [16] [<ffffffff8023149f>] copy_process+0xe10/0x1684 [5] [<ffffffff8027e38d>] vma_link+0x40/0x107 [20] [<ffffffff8027d866>] unlink_file_vma+0x2c/0x53 [11] [<ffffffff8027da53>] vma_adjust+0x154/0x452
&q->__queue_lock: 47 0 [16] [<ffffffff8033348a>] __make_request+0x5e/0x663 [4] [<ffffffff803339e8>] __make_request+0x5bc/0x663 [2] [<ffffffff803320c3>] generic_unplug_device+0x18/0x25 [1] [<ffffffff803320bb>] generic_unplug_device+0x10/0x25
&dentry->d_lock: 26 0 [22] [<ffffffff802a58be>] __d_lookup+0x7d/0x136 [4] [<ffffffff802ccd61>] dnotify_parent+0x1f/0x6d
vfsmount_lock: 17 0 [16] [<ffffffff802ab31d>] lookup_mnt+0x19/0x4c [1] [<ffffffff802a51df>] __d_path+0xae/0x14c

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/lockdep.h |   33 ++++++++++++++++
 kernel/lockdep.c        |   95 ++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/lockdep_proc.c   |   79 +++++++++++++++++++++++++++++++++++++++
 kernel/mutex.c          |    7 +++
 kernel/rwsem.c          |    8 ++--
 kernel/spinlock.c       |   28 +++++++-------
 lib/Kconfig.debug       |   13 ++++++
 7 files changed, 245 insertions(+), 18 deletions(-)

Index: linux-2.6/include/linux/lockdep.h
===================================================================
--- linux-2.6.orig/include/linux/lockdep.h	2007-05-22 23:29:08.000000000 +0200
+++ linux-2.6/include/linux/lockdep.h	2007-05-22 23:46:47.000000000 +0200
@@ -16,6 +16,7 @@ struct task_struct;
 #include <linux/list.h>
 #include <linux/debug_locks.h>
 #include <linux/stacktrace.h>
+#include <asm/atomic.h>
 
 /*
  * Lock-class usage-state bits:
@@ -71,6 +72,11 @@ struct lock_class_key {
 	struct lockdep_subclass_key	subkeys[MAX_LOCKDEP_SUBCLASSES];
 };
 
+struct lock_contention_point {
+	unsigned long ip;
+	atomic_t count;
+};
+
 /*
  * The lock-class itself:
  */
@@ -114,6 +120,12 @@ struct lock_class {
 
 	const char			*name;
 	int				name_version;
+
+#ifdef CONFIG_LOCK_STAT
+	atomic_t			read_contentions;
+	atomic_t			write_contentions;
+	struct lock_contention_point	contention_point[4];
+#endif
 };
 
 /*
@@ -281,6 +293,27 @@ struct lock_class_key { };
 
 #endif /* !LOCKDEP */
 
+#ifdef CONFIG_LOCK_STAT
+
+extern void lock_contended(struct lockdep_map *lock, unsigned long ip);
+
+#define LOCK_CONTENDED(_lock, try, lock)			\
+do {								\
+	if (!try(_lock)) {					\
+		lock_contended(&(_lock)->dep_map, _RET_IP_);	\
+		lock(_lock);					\
+	}							\
+} while (0)
+
+#else /* CONFIG_LOCK_STAT */
+
+#define lock_contended(l, i)			do { } while (0)
+
+#define LOCK_CONTENDED(_lock, try, lock) \
+	lock(_lock)
+
+#endif /* CONFIG_LOCK_STAT */
+
 #if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_GENERIC_HARDIRQS)
 extern void early_init_irq_lock_class(void);
 #else
Index: linux-2.6/kernel/lockdep.c
===================================================================
--- linux-2.6.orig/kernel/lockdep.c	2007-05-22 23:46:23.000000000 +0200
+++ linux-2.6/kernel/lockdep.c	2007-05-22 23:46:47.000000000 +0200
@@ -2274,6 +2274,35 @@ print_unlock_inbalance_bug(struct task_s
 	return 0;
 }
 
+#ifdef CONFIG_LOCK_STAT
+static int
+print_lock_contention_bug(struct task_struct *curr, struct lockdep_map *lock,
+			   unsigned long ip)
+{
+	if (!debug_locks_off())
+		return 0;
+	if (debug_locks_silent)
+		return 0;
+
+	printk("\n=================================\n");
+	printk(  "[ BUG: bad contention detected! ]\n");
+	printk(  "---------------------------------\n");
+	printk("%s/%d is trying to contend lock (",
+		curr->comm, curr->pid);
+	print_lockdep_cache(lock);
+	printk(") at:\n");
+	print_ip_sym(ip);
+	printk("but there are no locks held!\n");
+	printk("\nother info that might help us debug this:\n");
+	lockdep_print_held_locks(curr);
+
+	printk("\nstack backtrace:\n");
+	dump_stack();
+
+	return 0;
+}
+#endif
+
 /*
  * Common debugging checks for both nested and non-nested unlock:
  */
@@ -2414,6 +2443,54 @@ __lock_release(struct lockdep_map *lock,
 	check_chain_key(curr);
 }
 
+#ifdef CONFIG_LOCK_STAT
+static void
+__lock_contended(struct lockdep_map *lock, unsigned long ip)
+{
+	struct task_struct *curr = current;
+	struct held_lock *hlock, *prev_hlock;
+	unsigned int depth;
+	int i;
+
+	depth = curr->lockdep_depth;
+	if (DEBUG_LOCKS_WARN_ON(!depth))
+		return;
+
+	prev_hlock = NULL;
+	for (i = depth-1; i >= 0; i--) {
+		hlock = curr->held_locks + i;
+		/*
+		 * We must not cross into another context:
+		 */
+		if (prev_hlock && prev_hlock->irq_context != hlock->irq_context)
+			break;
+		if (hlock->instance == lock)
+			goto found_it;
+		prev_hlock = hlock;
+	}
+	print_lock_contention_bug(curr, lock, ip);
+	return;
+
+found_it:
+	if (hlock->read)
+		atomic_inc(&hlock->class->read_contentions);
+	else
+		atomic_inc(&hlock->class->write_contentions);
+
+	for (i = 0; i < ARRAY_SIZE(hlock->class->contention_point); i++) {
+		if (hlock->class->contention_point[i].ip == 0) {
+			hlock->class->contention_point[i].ip = ip;
+			atomic_set(&hlock->class->contention_point[i].count, 1);
+			break;
+		}
+		if (hlock->class->contention_point[i].ip == ip) {
+			atomic_inc(&hlock->class->contention_point[i].count);
+			break;
+		}
+	}
+}
+#endif
+
 /*
  * Check whether we follow the irq-flags state precisely:
  */
@@ -2486,6 +2563,24 @@ void lock_release(struct lockdep_map *lo
 
 EXPORT_SYMBOL_GPL(lock_release);
 
+#ifdef CONFIG_LOCK_STAT
+void lock_contended(struct lockdep_map *lock, unsigned long ip)
+{
+	unsigned long flags;
+
+	if (unlikely(current->lockdep_recursion))
+		return;
+
+	raw_local_irq_save(flags);
+	check_flags(flags);
+	current->lockdep_recursion = 1;
+	__lock_contended(lock, ip);
+	current->lockdep_recursion = 0;
+	raw_local_irq_restore(flags);
+}
+EXPORT_SYMBOL_GPL(lock_contended);
+#endif
+
 /*
  * Used by the testsuite, sanitize the validator state
  * after a simulated failure:
Index: linux-2.6/kernel/mutex.c
===================================================================
--- linux-2.6.orig/kernel/mutex.c	2007-05-22 23:29:08.000000000 +0200
+++ linux-2.6/kernel/mutex.c	2007-05-22 23:46:47.000000000 +0200
@@ -139,6 +139,12 @@ __mutex_lock_common(struct mutex *lock, 
 	list_add_tail(&waiter.list, &lock->wait_list);
 	waiter.task = task;
 
+	old_val = atomic_xchg(&lock->count, -1);
+	if (old_val == 1)
+		goto done;
+
+	lock_contended(&lock->dep_map, _RET_IP_);
+
 	for (;;) {
 		/*
 		 * Lets try to take the lock again - this is needed even if
@@ -174,6 +180,7 @@ __mutex_lock_common(struct mutex *lock, 
 		spin_lock_mutex(&lock->wait_lock, flags);
 	}
 
+done:
 	/* got the lock - rejoice! */
 	mutex_remove_waiter(lock, &waiter, task_thread_info(task));
 	debug_mutex_set_owner(lock, task_thread_info(task));
Index: linux-2.6/kernel/lockdep_proc.c
===================================================================
--- linux-2.6.orig/kernel/lockdep_proc.c	2007-05-22 23:29:08.000000000 +0200
+++ linux-2.6/kernel/lockdep_proc.c	2007-05-22 23:46:47.000000000 +0200
@@ -15,6 +15,7 @@
 #include <linux/seq_file.h>
 #include <linux/kallsyms.h>
 #include <linux/debug_locks.h>
+#include <asm/uaccess.h>
 
 #include "lockdep_internals.h"
 
@@ -342,6 +343,78 @@ static const struct file_operations proc
 	.release	= seq_release,
 };
 
+#ifdef CONFIG_LOCK_STAT
+static int lock_contentions_show(struct seq_file *m, void *v)
+{
+	struct lock_contention_point *cp;
+	char sym[KSYM_SYMBOL_LEN];
+	struct lock_class *class;
+	int r, w, i;
+
+	list_for_each_entry(class, &all_lock_classes, lock_entry) {
+		r = atomic_read(&class->read_contentions);
+		w = atomic_read(&class->write_contentions);
+
+		if (r || w) {
+			seq_printf(m, "%s: %d %d", class->name, w, r);
+			for (i = 0; i < ARRAY_SIZE(class->contention_point);
+					i++) {
+				cp = &class->contention_point[i];
+
+				if (cp->ip == 0)
+					break;
+				sprint_symbol(sym, cp->ip);
+				seq_printf(m, " [%d] [<%p>] %s",
+						atomic_read(&cp->count),
+						(void *)cp->ip, sym);
+			}
+			seq_printf(m, "\n");
+		}
+	}
+
+	return 0;
+}
+
+static int lock_contentions_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, lock_contentions_show, NULL);
+}
+
+ssize_t lock_contentions_write(struct file *file, const char __user *buf,
+		size_t count, loff_t *ppos)
+{
+	struct lock_class *class;
+	char c;
+	int i;
+
+	if (count) {
+		if (get_user(c, buf))
+			return -EFAULT;
+
+		if (c != '0')
+			return count;
+
+		list_for_each_entry(class, &all_lock_classes, lock_entry) {
+			atomic_set(&class->write_contentions, 0);
+			atomic_set(&class->read_contentions, 0);
+			for (i = 0; i < ARRAY_SIZE(class->contention_point);
+					i++) {
+				class->contention_point[i].ip = 0;
+			}
+		}
+	}
+	return count;
+}
+
+static const struct file_operations proc_lock_contentions_operations = {
+	.open		= lock_contentions_open,
+	.write		= lock_contentions_write,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= seq_release,
+};
+#endif /* CONFIG_LOCK_STAT */
+
 static int __init lockdep_proc_init(void)
 {
 	struct proc_dir_entry *entry;
@@ -354,6 +427,12 @@ static int __init lockdep_proc_init(void
 	if (entry)
 		entry->proc_fops = &proc_lockdep_stats_operations;
 
+#ifdef CONFIG_LOCK_STAT
+	entry = create_proc_entry("lock_contentions", S_IRUSR, NULL);
+	if (entry)
+		entry->proc_fops = &proc_lock_contentions_operations;
+#endif
+
 	return 0;
 }
 
Index: linux-2.6/kernel/rwsem.c
===================================================================
--- linux-2.6.orig/kernel/rwsem.c	2007-05-22 23:29:08.000000000 +0200
+++ linux-2.6/kernel/rwsem.c	2007-05-22 23:46:47.000000000 +0200
@@ -20,7 +20,7 @@ void down_read(struct rw_semaphore *sem)
 	might_sleep();
 	rwsem_acquire_read(&sem->dep_map, 0, 0, _RET_IP_);
 
-	__down_read(sem);
+	LOCK_CONTENDED(sem, __down_read_trylock, __down_read);
 }
 
 EXPORT_SYMBOL(down_read);
@@ -47,7 +47,7 @@ void down_write(struct rw_semaphore *sem
 	might_sleep();
 	rwsem_acquire(&sem->dep_map, 0, 0, _RET_IP_);
 
-	__down_write(sem);
+	LOCK_CONTENDED(sem, __down_write_trylock, __down_write);
 }
 
 EXPORT_SYMBOL(down_write);
@@ -111,7 +111,7 @@ void down_read_nested(struct rw_semaphor
 	might_sleep();
 	rwsem_acquire_read(&sem->dep_map, subclass, 0, _RET_IP_);
 
-	__down_read(sem);
+	LOCK_CONTENDED(sem, __down_read_trylock, __down_read);
 }
 
 EXPORT_SYMBOL(down_read_nested);
@@ -130,7 +130,7 @@ void down_write_nested(struct rw_semapho
 	might_sleep();
 	rwsem_acquire(&sem->dep_map, subclass, 0, _RET_IP_);
 
-	__down_write_nested(sem, subclass);
+	LOCK_CONTENDED(sem, __down_write_trylock, __down_write);
 }
 
 EXPORT_SYMBOL(down_write_nested);
Index: linux-2.6/kernel/spinlock.c
===================================================================
--- linux-2.6.orig/kernel/spinlock.c	2007-05-22 23:29:08.000000000 +0200
+++ linux-2.6/kernel/spinlock.c	2007-05-22 23:46:47.000000000 +0200
@@ -72,7 +72,7 @@ void __lockfunc _read_lock(rwlock_t *loc
 {
 	preempt_disable();
 	rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
-	_raw_read_lock(lock);
+	LOCK_CONTENDED(lock, _raw_read_trylock, _raw_read_lock);
 }
 EXPORT_SYMBOL(_read_lock);
 
@@ -89,7 +89,7 @@ unsigned long __lockfunc _spin_lock_irqs
 	 * that interrupts are not re-enabled during lock-acquire:
 	 */
 #ifdef CONFIG_PROVE_LOCKING
-	_raw_spin_lock(lock);
+	LOCK_CONTENDED(lock, _raw_spin_trylock, _raw_spin_lock);
 #else
 	_raw_spin_lock_flags(lock, &flags);
 #endif
@@ -102,7 +102,7 @@ void __lockfunc _spin_lock_irq(spinlock_
 	local_irq_disable();
 	preempt_disable();
 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
-	_raw_spin_lock(lock);
+	LOCK_CONTENDED(lock, _raw_spin_trylock, _raw_spin_lock);
 }
 EXPORT_SYMBOL(_spin_lock_irq);
 
@@ -111,7 +111,7 @@ void __lockfunc _spin_lock_bh(spinlock_t
 	local_bh_disable();
 	preempt_disable();
 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
-	_raw_spin_lock(lock);
+	LOCK_CONTENDED(lock, _raw_spin_trylock, _raw_spin_lock);
 }
 EXPORT_SYMBOL(_spin_lock_bh);
 
@@ -122,7 +122,7 @@ unsigned long __lockfunc _read_lock_irqs
 	local_irq_save(flags);
 	preempt_disable();
 	rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
-	_raw_read_lock(lock);
+	LOCK_CONTENDED(lock, _raw_read_trylock, _raw_read_lock);
 	return flags;
 }
 EXPORT_SYMBOL(_read_lock_irqsave);
@@ -132,7 +132,7 @@ void __lockfunc _read_lock_irq(rwlock_t 
 	local_irq_disable();
 	preempt_disable();
 	rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
-	_raw_read_lock(lock);
+	LOCK_CONTENDED(lock, _raw_read_trylock, _raw_read_lock);
 }
 EXPORT_SYMBOL(_read_lock_irq);
 
@@ -141,7 +141,7 @@ void __lockfunc _read_lock_bh(rwlock_t *
 	local_bh_disable();
 	preempt_disable();
 	rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
-	_raw_read_lock(lock);
+	LOCK_CONTENDED(lock, _raw_read_trylock, _raw_read_lock);
 }
 EXPORT_SYMBOL(_read_lock_bh);
 
@@ -152,7 +152,7 @@ unsigned long __lockfunc _write_lock_irq
 	local_irq_save(flags);
 	preempt_disable();
 	rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
-	_raw_write_lock(lock);
+	LOCK_CONTENDED(lock, _raw_write_trylock, _raw_write_lock);
 	return flags;
 }
 EXPORT_SYMBOL(_write_lock_irqsave);
@@ -162,7 +162,7 @@ void __lockfunc _write_lock_irq(rwlock_t
 	local_irq_disable();
 	preempt_disable();
 	rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
-	_raw_write_lock(lock);
+	LOCK_CONTENDED(lock, _raw_write_trylock, _raw_write_lock);
 }
 EXPORT_SYMBOL(_write_lock_irq);
 
@@ -171,7 +171,7 @@ void __lockfunc _write_lock_bh(rwlock_t 
 	local_bh_disable();
 	preempt_disable();
 	rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
-	_raw_write_lock(lock);
+	LOCK_CONTENDED(lock, _raw_write_trylock, _raw_write_lock);
 }
 EXPORT_SYMBOL(_write_lock_bh);
 
@@ -179,7 +179,7 @@ void __lockfunc _spin_lock(spinlock_t *l
 {
 	preempt_disable();
 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
-	_raw_spin_lock(lock);
+	LOCK_CONTENDED(lock, _raw_spin_trylock, _raw_spin_lock);
 }
 
 EXPORT_SYMBOL(_spin_lock);
@@ -188,7 +188,7 @@ void __lockfunc _write_lock(rwlock_t *lo
 {
 	preempt_disable();
 	rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
-	_raw_write_lock(lock);
+	LOCK_CONTENDED(lock, _raw_write_trylock, _raw_write_lock);
 }
 
 EXPORT_SYMBOL(_write_lock);
@@ -289,7 +289,7 @@ void __lockfunc _spin_lock_nested(spinlo
 {
 	preempt_disable();
 	spin_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
-	_raw_spin_lock(lock);
+	LOCK_CONTENDED(lock, _raw_spin_trylock, _raw_spin_lock);
 }
 
 EXPORT_SYMBOL(_spin_lock_nested);
@@ -306,7 +306,7 @@ unsigned long __lockfunc _spin_lock_irqs
 	 * that interrupts are not re-enabled during lock-acquire:
 	 */
 #ifdef CONFIG_PROVE_SPIN_LOCKING
-	_raw_spin_lock(lock);
+	LOCK_CONTENDED(lock, _raw_spin_trylock, _raw_spin_lock);
 #else
 	_raw_spin_lock_flags(lock, &flags);
 #endif
Index: linux-2.6/lib/Kconfig.debug
===================================================================
--- linux-2.6.orig/lib/Kconfig.debug	2007-05-22 23:29:08.000000000 +0200
+++ linux-2.6/lib/Kconfig.debug	2007-05-22 23:46:47.000000000 +0200
@@ -258,6 +258,17 @@ config LOCKDEP
 	select KALLSYMS
 	select KALLSYMS_ALL
 
+config LOCK_STAT
+	bool "Lock usage statistics"
+	depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
+	select LOCKDEP
+	select DEBUG_SPINLOCK
+	select DEBUG_MUTEXES
+	select DEBUG_LOCK_ALLOC
+	default n
+	help
+	 This feature enables tracking of lock contention points.
+
 config DEBUG_LOCKDEP
 	bool "Lock dependency engine debugging"
 	depends on DEBUG_KERNEL && LOCKDEP
@@ -442,3 +453,5 @@ config FAULT_INJECTION_STACKTRACE_FILTER
 	select FRAME_POINTER
 	help
 	  Provide stacktrace filter for fault-injection capabilities
+
+

-- 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 4/7] lockdep: add waittime to the lock statistics
  2007-05-23  9:57 [PATCH 0/7] lock contention tracking -v2 Peter Zijlstra
                   ` (2 preceding siblings ...)
  2007-05-23  9:57 ` [PATCH 3/7] lockdep: lock contention tracking Peter Zijlstra
@ 2007-05-23  9:57 ` Peter Zijlstra
  2007-05-23  9:57 ` [PATCH 5/7] lockdep: add holdtime " Peter Zijlstra
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2007-05-23  9:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Bill Huey, Jason Baron, Steven Rostedt,
	Christoph Hellwig, Peter Zijlstra

[-- Attachment #1: lockstat-waittime.patch --]
[-- Type: text/plain, Size: 8683 bytes --]

Track lock waittime - that is, the time spent waiting on lock acquisition.

/proc/lock_waittime - starts with the same three columns as lock_contentions:

  <class name> <(write) contentions> <read contentions>

After that come two sets of four more columns:

  <nr> <min> <max> <total>

for both (write) contentions and read contentions. Times are in nanoseconds.
Writing a '0' to the file clears the stats.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/lockdep.h |   15 ++++++++
 kernel/lockdep.c        |   79 +++++++++++++++++++++++++++++++++++++++++++++
 kernel/lockdep_proc.c   |   84 ++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/mutex.c          |    1 
 4 files changed, 179 insertions(+)

Index: linux-2.6/include/linux/lockdep.h
===================================================================
--- linux-2.6.orig/include/linux/lockdep.h	2007-05-22 23:46:47.000000000 +0200
+++ linux-2.6/include/linux/lockdep.h	2007-05-22 23:46:53.000000000 +0200
@@ -77,6 +77,13 @@ struct lock_contention_point {
 	atomic_t count;
 };
 
+struct lock_time {
+	raw_spinlock_t lock;
+	unsigned long long min, max;
+	unsigned long long total;
+	unsigned long nr;
+};
+
 /*
  * The lock-class itself:
  */
@@ -125,6 +132,8 @@ struct lock_class {
 	atomic_t			read_contentions;
 	atomic_t			write_contentions;
 	struct lock_contention_point	contention_point[4];
+	struct lock_time		read_waittime;
+	struct lock_time		write_waittime;
 #endif
 };
 
@@ -177,6 +186,9 @@ struct held_lock {
 	unsigned long			acquire_ip;
 	struct lockdep_map		*instance;
 
+#ifdef CONFIG_LOCK_STAT
+	unsigned long long 		waittime_stamp;
+#endif
 	/*
 	 * The lock-stack is unified in that the lock chains of interrupt
 	 * contexts nest ontop of process context chains, but we 'separate'
@@ -296,18 +308,21 @@ struct lock_class_key { };
 #ifdef CONFIG_LOCK_STAT
 
 extern void lock_contended(struct lockdep_map *lock, unsigned long ip);
+extern void lock_acquired(struct lockdep_map *lock);
 
 #define LOCK_CONTENDED(_lock, try, lock)			\
 do {								\
 	if (!try(_lock)) {					\
 		lock_contended(&(_lock)->dep_map, _RET_IP_);	\
 		lock(_lock);					\
+		lock_acquired(&(_lock)->dep_map);		\
 	}							\
 } while (0)
 
 #else /* CONFIG_LOCK_STAT */
 
 #define lock_contended(l, i)			do { } while (0)
+#define lock_acquired(l)			do { } while (0)
 
 #define LOCK_CONTENDED(_lock, try, lock) \
 	lock(_lock)
Index: linux-2.6/kernel/lockdep.c
===================================================================
--- linux-2.6.orig/kernel/lockdep.c	2007-05-22 23:46:47.000000000 +0200
+++ linux-2.6/kernel/lockdep.c	2007-05-22 23:47:06.000000000 +0200
@@ -1280,6 +1280,10 @@ register_lock_class(struct lockdep_map *
 	INIT_LIST_HEAD(&class->lock_entry);
 	INIT_LIST_HEAD(&class->locks_before);
 	INIT_LIST_HEAD(&class->locks_after);
+#ifdef CONFIG_LOCK_STAT
+	class->read_waittime.lock = (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
+	class->write_waittime.lock = (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
+#endif
 	class->name_version = count_matching_names(class);
 	/*
 	 * We use RCU's safe list-add method to make
@@ -2081,6 +2085,9 @@ static int __lock_acquire(struct lockdep
 	hlock->read = read;
 	hlock->check = check;
 	hlock->hardirqs_off = hardirqs_off;
+#ifdef CONFIG_LOCK_STAT
+	hlock->waittime_stamp = 0;
+#endif
 
 	if (check != 2)
 		goto out_calc_hash;
@@ -2320,6 +2327,20 @@ static int check_unlock(struct task_stru
 	return 1;
 }
 
+static void lock_time_add(struct lock_time *lt, unsigned long long wt)
+{
+	__raw_spin_lock(&lt->lock);
+	if (wt > lt->max)
+		lt->max = wt;
+
+	if (wt < lt->min || !lt->min)
+		lt->min = wt;
+
+	lt->total += wt;
+	lt->nr++;
+	__raw_spin_unlock(&lt->lock);
+}
+
 /*
  * Remove the lock to the list of currently held locks in a
  * potentially non-nested (out of order) manner. This is a
@@ -2472,6 +2493,8 @@ __lock_contended(struct lockdep_map *loc
 	return;
 
 found_it:
+	hlock->waittime_stamp = sched_clock();
+
 	if (hlock->read)
 		atomic_inc(&hlock->class->read_contentions);
 	else
@@ -2489,6 +2512,46 @@ found_it:
 		}
 	}
 }
+
+static void
+__lock_acquired(struct lockdep_map *lock)
+{
+	struct task_struct *curr = current;
+	struct held_lock *hlock, *prev_hlock;
+	unsigned int depth;
+	unsigned long long waittime;
+	int i;
+
+	depth = curr->lockdep_depth;
+	if (DEBUG_LOCKS_WARN_ON(!depth))
+		return;
+
+	prev_hlock = NULL;
+	for (i = depth-1; i >= 0; i--) {
+		hlock = curr->held_locks + i;
+		/*
+		 * We must not cross into another context:
+		 */
+		if (prev_hlock && prev_hlock->irq_context != hlock->irq_context)
+			break;
+		if (hlock->instance == lock)
+			goto found_it;
+		prev_hlock = hlock;
+	}
+	print_lock_contention_bug(curr, lock, _RET_IP_);
+	return;
+
+found_it:
+	if (!hlock->waittime_stamp)
+		return;
+
+	waittime = sched_clock() - hlock->waittime_stamp;
+
+	if (hlock->read)
+		lock_time_add(&hlock->class->read_waittime, waittime);
+	else
+		lock_time_add(&hlock->class->write_waittime, waittime);
+}
 #endif
 
 /*
@@ -2579,6 +2642,22 @@ void lock_contended(struct lockdep_map *
 	raw_local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(lock_contended);
+
+void lock_acquired(struct lockdep_map *lock)
+{
+	unsigned long flags;
+
+	if (unlikely(current->lockdep_recursion))
+		return;
+
+	raw_local_irq_save(flags);
+	check_flags(flags);
+	current->lockdep_recursion = 1;
+	__lock_acquired(lock);
+	current->lockdep_recursion = 0;
+	raw_local_irq_restore(flags);
+}
+EXPORT_SYMBOL_GPL(lock_acquired);
 #endif
 
 /*
Index: linux-2.6/kernel/lockdep_proc.c
===================================================================
--- linux-2.6.orig/kernel/lockdep_proc.c	2007-05-22 23:46:47.000000000 +0200
+++ linux-2.6/kernel/lockdep_proc.c	2007-05-22 23:46:53.000000000 +0200
@@ -16,6 +16,7 @@
 #include <linux/kallsyms.h>
 #include <linux/debug_locks.h>
 #include <asm/uaccess.h>
+#include <asm/div64.h>
 
 #include "lockdep_internals.h"
 
@@ -413,6 +414,85 @@ static const struct file_operations proc
 	.llseek		= seq_lseek,
 	.release	= seq_release,
 };
+
+static void print_time(struct seq_file *m, struct lock_time *lt)
+{
+	unsigned long long min, total, max;
+	unsigned long nr;
+
+	__raw_spin_lock(&lt->lock);
+	min = lt->min;
+	total = lt->total;
+	max = lt->max;
+	nr = lt->nr;
+	__raw_spin_unlock(&lt->lock);
+
+	seq_printf(m, " %lu %llu %llu %llu", nr, min, max, total);
+}
+
+static void clear_time(struct lock_time *lt)
+{
+	__raw_spin_lock(&lt->lock);
+	lt->min = 0;
+	lt->total = 0;
+	lt->max = 0;
+	lt->nr = 0;
+	__raw_spin_unlock(&lt->lock);
+}
+
+static int lock_waittime_show(struct seq_file *m, void *v)
+{
+	struct lock_class *class;
+	int r, w;
+
+	list_for_each_entry(class, &all_lock_classes, lock_entry) {
+		r = atomic_read(&class->read_contentions);
+		w = atomic_read(&class->write_contentions);
+
+		if (r || w) {
+			seq_printf(m, "%s: %d %d", class->name, w, r);
+			print_time(m, &class->write_waittime);
+			print_time(m, &class->read_waittime);
+			seq_printf(m, "\n");
+		}
+	}
+
+	return 0;
+}
+
+static int lock_waittime_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, lock_waittime_show, NULL);
+}
+
+ssize_t lock_waittime_write(struct file *file, const char __user *buf,
+		size_t count, loff_t *ppos)
+{
+	struct lock_class *class;
+	char c;
+
+	if (count) {
+		if (get_user(c, buf))
+			return -EFAULT;
+
+		if (c != '0')
+			return count;
+
+		list_for_each_entry(class, &all_lock_classes, lock_entry) {
+			clear_time(&class->read_waittime);
+			clear_time(&class->write_waittime);
+		}
+	}
+	return count;
+}
+
+static const struct file_operations proc_lock_waittime_operations = {
+	.open		= lock_waittime_open,
+	.write		= lock_waittime_write,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= seq_release,
+};
 #endif /* CONFIG_LOCK_STAT */
 
 static int __init lockdep_proc_init(void)
@@ -431,6 +511,10 @@ static int __init lockdep_proc_init(void
 	entry = create_proc_entry("lock_contentions", S_IRUSR, NULL);
 	if (entry)
 		entry->proc_fops = &proc_lock_contentions_operations;
+
+	entry = create_proc_entry("lock_waittime", S_IRUSR, NULL);
+	if (entry)
+		entry->proc_fops = &proc_lock_waittime_operations;
 #endif
 
 	return 0;
Index: linux-2.6/kernel/mutex.c
===================================================================
--- linux-2.6.orig/kernel/mutex.c	2007-05-22 23:46:47.000000000 +0200
+++ linux-2.6/kernel/mutex.c	2007-05-22 23:46:53.000000000 +0200
@@ -180,6 +180,7 @@ __mutex_lock_common(struct mutex *lock, 
 		spin_lock_mutex(&lock->wait_lock, flags);
 	}
 
+	lock_acquired(&lock->dep_map);
 done:
 	/* got the lock - rejoice! */
 	mutex_remove_waiter(lock, &waiter, task_thread_info(task));

-- 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 5/7] lockdep: add holdtime to the lock statistics
  2007-05-23  9:57 [PATCH 0/7] lock contention tracking -v2 Peter Zijlstra
                   ` (3 preceding siblings ...)
  2007-05-23  9:57 ` [PATCH 4/7] lockdep: add waittime to the lock statistics Peter Zijlstra
@ 2007-05-23  9:57 ` Peter Zijlstra
  2007-05-23  9:57 ` [PATCH 6/7] lockdep: runtime configure prove_locking and lock_stat Peter Zijlstra
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2007-05-23  9:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Bill Huey, Jason Baron, Steven Rostedt,
	Christoph Hellwig, Peter Zijlstra

[-- Attachment #1: lockstat-holdtime.patch --]
[-- Type: text/plain, Size: 6245 bytes --]

Track lock hold times - that is, the time for which each lock was held.

/proc/lock_holdtime - starts with the same three columns as lock_contentions:

  <class name> <(write) contentions> <read contentions>

After that come two sets of four more columns:

  <nr> <min> <max> <total>

for both (write) locks and read locks. Times are in 1e-9 seconds.
Writing a 0 to the file clears the stats.
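For illustration, the resulting lines could be consumed from userspace roughly like this (a minimal sketch: the field order follows the format described above, and the struct and helper names are made up for the example):

```c
#include <assert.h>
#include <stdio.h>

/* One /proc line per contended class:
 *   <name>: <w-cont> <r-cont>  <nr> <min> <max> <total>  <nr> <min> <max> <total>
 * where the first quadruple is for write locks, the second for read locks. */
struct lock_time_snap {
	unsigned long nr;
	unsigned long long min, max, total;
};

static int parse_stat_line(const char *line, char *name, int *w, int *r,
			   struct lock_time_snap *wt, struct lock_time_snap *rt)
{
	return sscanf(line, "%63[^:]: %d %d %lu %llu %llu %llu %lu %llu %llu %llu",
		      name, w, r,
		      &wt->nr, &wt->min, &wt->max, &wt->total,
		      &rt->nr, &rt->min, &rt->max, &rt->total) == 11;
}

/* avg = total / nr, as the cover letter suggests */
static unsigned long long avg_ns(const struct lock_time_snap *lt)
{
	return lt->nr ? lt->total / lt->nr : 0;
}
```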

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/lockdep.h |    3 ++
 kernel/lockdep.c        |   32 ++++++++++++++++++++++++--
 kernel/lockdep_proc.c   |   58 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 91 insertions(+), 2 deletions(-)

Index: linux-2.6/include/linux/lockdep.h
===================================================================
--- linux-2.6.orig/include/linux/lockdep.h	2007-05-22 23:46:53.000000000 +0200
+++ linux-2.6/include/linux/lockdep.h	2007-05-22 23:47:16.000000000 +0200
@@ -134,6 +134,8 @@ struct lock_class {
 	struct lock_contention_point	contention_point[4];
 	struct lock_time		read_waittime;
 	struct lock_time		write_waittime;
+	struct lock_time		read_holdtime;
+	struct lock_time		write_holdtime;
 #endif
 };
 
@@ -188,6 +190,7 @@ struct held_lock {
 
 #ifdef CONFIG_LOCK_STAT
 	unsigned long long 		waittime_stamp;
+	unsigned long long		holdtime_stamp;
 #endif
 	/*
 	 * The lock-stack is unified in that the lock chains of interrupt
Index: linux-2.6/kernel/lockdep.c
===================================================================
--- linux-2.6.orig/kernel/lockdep.c	2007-05-22 23:47:06.000000000 +0200
+++ linux-2.6/kernel/lockdep.c	2007-05-22 23:47:23.000000000 +0200
@@ -1283,6 +1283,8 @@ register_lock_class(struct lockdep_map *
 #ifdef CONFIG_LOCK_STAT
 	class->read_waittime.lock = (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
 	class->write_waittime.lock = (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
+	class->read_holdtime.lock = (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
+	class->write_holdtime.lock = (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
 #endif
 	class->name_version = count_matching_names(class);
 	/*
@@ -2087,6 +2089,7 @@ static int __lock_acquire(struct lockdep
 	hlock->hardirqs_off = hardirqs_off;
 #ifdef CONFIG_LOCK_STAT
 	hlock->waittime_stamp = 0;
+	hlock->holdtime_stamp = sched_clock();
 #endif
 
 	if (check != 2)
@@ -2327,6 +2330,7 @@ static int check_unlock(struct task_stru
 	return 1;
 }
 
+#ifdef CONFIG_LOCK_STAT
 static void lock_time_add(struct lock_time *lt, unsigned long long wt)
 {
 	__raw_spin_lock(&lt->lock);
@@ -2341,6 +2345,23 @@ static void lock_time_add(struct lock_ti
 	__raw_spin_unlock(&lt->lock);
 }
 
+static void lock_release_holdtime(struct held_lock *hlock)
+{
+	unsigned long long holdtime;
+
+	holdtime = sched_clock() - hlock->holdtime_stamp;
+
+	if (hlock->read)
+		lock_time_add(&hlock->class->read_holdtime, holdtime);
+	else
+		lock_time_add(&hlock->class->write_holdtime, holdtime);
+}
+#else
+static void lock_release_holdtime(struct held_lock *hlock)
+{
+}
+#endif
+
 /*
  * Remove the lock to the list of currently held locks in a
  * potentially non-nested (out of order) manner. This is a
@@ -2378,6 +2399,8 @@ lock_release_non_nested(struct task_stru
 	return print_unlock_inbalance_bug(curr, lock, ip);
 
 found_it:
+	lock_release_holdtime(hlock);
+
 	/*
 	 * We have the right lock to unlock, 'hlock' points to it.
 	 * Now we remove it from the stack, and add back the other
@@ -2430,6 +2453,8 @@ static int lock_release_nested(struct ta
 
 	curr->curr_chain_key = hlock->prev_chain_key;
 
+	lock_release_holdtime(hlock);
+
 #ifdef CONFIG_DEBUG_LOCKDEP
 	hlock->prev_chain_key = 0;
 	hlock->class = NULL;
@@ -2519,7 +2544,7 @@ __lock_acquired(struct lockdep_map *lock
 	struct task_struct *curr = current;
 	struct held_lock *hlock, *prev_hlock;
 	unsigned int depth;
-	unsigned long long waittime;
+	unsigned long long now, waittime;
 	int i;
 
 	depth = curr->lockdep_depth;
@@ -2545,7 +2570,10 @@ found_it:
 	if (!hlock->waittime_stamp)
 		return;
 
-	waittime = sched_clock() - hlock->waittime_stamp;
+	now = sched_clock();
+	waittime = now - hlock->waittime_stamp;
+
+	hlock->holdtime_stamp = now;
 
 	if (hlock->read)
 		lock_time_add(&hlock->class->read_waittime, waittime);
Index: linux-2.6/kernel/lockdep_proc.c
===================================================================
--- linux-2.6.orig/kernel/lockdep_proc.c	2007-05-22 23:46:53.000000000 +0200
+++ linux-2.6/kernel/lockdep_proc.c	2007-05-22 23:47:16.000000000 +0200
@@ -493,6 +493,60 @@ static const struct file_operations proc
 	.llseek		= seq_lseek,
 	.release	= seq_release,
 };
+
+static int lock_holdtime_show(struct seq_file *m, void *v)
+{
+	struct lock_class *class;
+	int r, w;
+
+	list_for_each_entry(class, &all_lock_classes, lock_entry) {
+		r = atomic_read(&class->read_contentions);
+		w = atomic_read(&class->write_contentions);
+
+		if (r || w) {
+			seq_printf(m, "%s: %d %d", class->name, w, r);
+			print_time(m, &class->write_holdtime);
+			print_time(m, &class->read_holdtime);
+			seq_printf(m, "\n");
+		}
+	}
+
+	return 0;
+}
+
+static int lock_holdtime_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, lock_holdtime_show, NULL);
+}
+
+ssize_t lock_holdtime_write(struct file *file, const char __user *buf,
+		size_t count, loff_t *ppos)
+{
+	struct lock_class *class;
+	char c;
+
+	if (count) {
+		if (get_user(c, buf))
+			return -EFAULT;
+
+		if (c != '0')
+			return count;
+
+		list_for_each_entry(class, &all_lock_classes, lock_entry) {
+			clear_time(&class->read_holdtime);
+			clear_time(&class->write_holdtime);
+		}
+	}
+	return count;
+}
+
+static const struct file_operations proc_lock_holdtime_operations = {
+	.open		= lock_holdtime_open,
+	.write		= lock_holdtime_write,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= seq_release,
+};
 #endif /* CONFIG_LOCK_STAT */
 
 static int __init lockdep_proc_init(void)
@@ -515,6 +569,10 @@ static int __init lockdep_proc_init(void
 	entry = create_proc_entry("lock_waittime", S_IRUSR, NULL);
 	if (entry)
 		entry->proc_fops = &proc_lock_waittime_operations;
+
+	entry = create_proc_entry("lock_holdtime", S_IRUSR, NULL);
+	if (entry)
+		entry->proc_fops = &proc_lock_holdtime_operations;
 #endif
 
 	return 0;

-- 



* [PATCH 6/7] lockdep: runtime configure prove_locking and lock_stat
  2007-05-23  9:57 [PATCH 0/7] lock contention tracking -v2 Peter Zijlstra
                   ` (4 preceding siblings ...)
  2007-05-23  9:57 ` [PATCH 5/7] lockdep: add holdtime " Peter Zijlstra
@ 2007-05-23  9:57 ` Peter Zijlstra
  2007-05-23  9:57 ` [PATCH 7/7] lockdep: scalable statistics Peter Zijlstra
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2007-05-23  9:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Bill Huey, Jason Baron, Steven Rostedt,
	Christoph Hellwig, Peter Zijlstra

[-- Attachment #1: lockstat-sysctl.patch --]
[-- Type: text/plain, Size: 3523 bytes --]

Allow for runtime configuration of lockdep and lockstat.

  /proc/sys/kernel/prove_locking
  /proc/sys/kernel/lock_stat

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/lockdep.c |   27 +++++++++++++++++++++++++++
 kernel/sysctl.c  |   27 +++++++++++++++++++++++++++
 2 files changed, 54 insertions(+)

Index: linux-2.6/kernel/lockdep.c
===================================================================
--- linux-2.6.orig/kernel/lockdep.c	2007-05-23 01:14:03.000000000 +0200
+++ linux-2.6/kernel/lockdep.c	2007-05-23 01:14:25.000000000 +0200
@@ -42,6 +42,16 @@
 
 #include "lockdep_internals.h"
 
+#ifdef CONFIG_PROVE_LOCKING
+int prove_locking = 1;
+module_param(prove_locking, int, 0644);
+#endif
+
+#ifdef CONFIG_LOCK_STAT
+int lock_stat = 1;
+module_param(lock_stat, int, 0644);
+#endif
+
 /*
  * lockdep_lock: protects the lockdep graph, the hashes and the
  *               class/list/hash allocators.
@@ -2091,9 +2101,13 @@ static int __lock_acquire(struct lockdep
 	hlock->waittime_stamp = 0;
 	hlock->holdtime_stamp = sched_clock();
 #endif
+#ifdef CONFIG_PROVE_LOCKING
+	if (!prove_locking)
+		goto out_irqflags;
 
 	if (check != 2)
 		goto out_calc_hash;
+#endif
 #ifdef CONFIG_TRACE_IRQFLAGS
 	/*
 	 * If non-trylock use in a hardirq or softirq context, then
@@ -2138,6 +2152,7 @@ static int __lock_acquire(struct lockdep
 		}
 	}
 #endif
+out_irqflags:
 	/* mark it as used: */
 	if (!mark_lock(curr, hlock, LOCK_USED))
 		return 0;
@@ -2190,6 +2205,8 @@ out_calc_hash:
 	curr->curr_chain_key = chain_key;
 
 #ifdef CONFIG_PROVE_LOCKING
+	if (!prove_locking)
+		goto out_graph;
 	/*
 	 * Trylock needs to maintain the stack of held locks, but it
 	 * does not add new dependencies, because trylock can be done
@@ -2236,6 +2253,7 @@ out_calc_hash:
 		/* after lookup_chain_cache(): */
 		if (unlikely(!debug_locks))
 			return 0;
+out_graph:
 #endif
 
 	curr->lockdep_depth++;
@@ -2349,6 +2367,9 @@ static void lock_release_holdtime(struct
 {
 	unsigned long long holdtime;
 
+	if (!lock_stat)
+		return;
+
 	holdtime = sched_clock() - hlock->holdtime_stamp;
 
 	if (hlock->read)
@@ -2659,6 +2680,9 @@ void lock_contended(struct lockdep_map *
 {
 	unsigned long flags;
 
+	if (!lock_stat)
+		return;
+
 	if (unlikely(current->lockdep_recursion))
 		return;
 
@@ -2675,6 +2699,9 @@ void lock_acquired(struct lockdep_map *l
 {
 	unsigned long flags;
 
+	if (!lock_stat)
+		return;
+
 	if (unlikely(current->lockdep_recursion))
 		return;
 
Index: linux-2.6/kernel/sysctl.c
===================================================================
--- linux-2.6.orig/kernel/sysctl.c	2007-05-23 01:13:01.000000000 +0200
+++ linux-2.6/kernel/sysctl.c	2007-05-23 01:14:10.000000000 +0200
@@ -160,6 +160,13 @@ extern ctl_table inotify_table[];
 int sysctl_legacy_va_layout;
 #endif
 
+#ifdef CONFIG_PROVE_LOCKING
+extern int prove_locking;
+#endif
+
+#ifdef CONFIG_LOCK_STAT
+extern int lock_stat;
+#endif
 
 /* The default sysctl tables: */
 
@@ -615,6 +622,26 @@ static ctl_table kern_table[] = {
 		.proc_handler   = &proc_dointvec,
 	},
 #endif
+#ifdef CONFIG_PROVE_LOCKING
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname	= "prove_locking",
+		.data		= &prove_locking,
+		.maxlen		= sizeof(int),
+		.mode		= 0664,
+		.proc_handler	= &proc_dointvec,
+	},
+#endif
+#ifdef CONFIG_LOCK_STAT
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname	= "lock_stat",
+		.data		= &lock_stat,
+		.maxlen		= sizeof(int),
+		.mode		= 0664,
+		.proc_handler	= &proc_dointvec,
+	},
+#endif
 
 	{ .ctl_name = 0 }
 };

-- 



* [PATCH 7/7] lockdep: scalable statistics
  2007-05-23  9:57 [PATCH 0/7] lock contention tracking -v2 Peter Zijlstra
                   ` (5 preceding siblings ...)
  2007-05-23  9:57 ` [PATCH 6/7] lockdep: runtime configure prove_locking and lock_stat Peter Zijlstra
@ 2007-05-23  9:57 ` Peter Zijlstra
  2007-05-23 10:33 ` [PATCH 0/7] lock contention tracking -v2 Ingo Molnar
  2007-05-23 15:10 ` [PATCH 8/7] new output format Peter Zijlstra
  8 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2007-05-23  9:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Bill Huey, Jason Baron, Steven Rostedt,
	Christoph Hellwig, Peter Zijlstra

[-- Attachment #1: lockstat-scalability.patch --]
[-- Type: text/plain, Size: 14253 bytes --]

Rework the lock stat counters to be scalable.
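The shape of the rework can be shown with a scaled-down userspace model (all names below are stand-ins; the real code uses DEFINE_PER_CPU and get_cpu_var/put_cpu_var): each CPU bumps its own copy of the counters with no shared cacheline, and readers fold every copy into one snapshot, as lock_stats() does with for_each_possible_cpu():

```c
#include <assert.h>
#include <string.h>

#define NR_CPUS_MODEL 4	/* stand-in for the real per-cpu machinery */

struct class_stats_model {
	unsigned long read_contentions;
	unsigned long write_contentions;
};

/* one private copy per CPU: updates never touch another CPU's line */
static struct class_stats_model cpu_stats[NR_CPUS_MODEL];

static void note_contention(int cpu, int read)
{
	if (read)
		cpu_stats[cpu].read_contentions++;
	else
		cpu_stats[cpu].write_contentions++;
}

/* the read side pays the cost: fold all per-cpu copies on demand */
static struct class_stats_model fold_stats(void)
{
	struct class_stats_model sum;
	int cpu;

	memset(&sum, 0, sizeof(sum));
	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		sum.read_contentions += cpu_stats[cpu].read_contentions;
		sum.write_contentions += cpu_stats[cpu].write_contentions;
	}
	return sum;
}
```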

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/lockdep.h |   34 +++++-----
 kernel/lockdep.c        |  162 ++++++++++++++++++++++++++++++++++++------------
 kernel/lockdep_proc.c   |  136 ++++++++++------------------------------
 3 files changed, 178 insertions(+), 154 deletions(-)

Index: linux-2.6/include/linux/lockdep.h
===================================================================
--- linux-2.6.orig/include/linux/lockdep.h	2007-05-23 10:23:16.000000000 +0200
+++ linux-2.6/include/linux/lockdep.h	2007-05-23 10:36:45.000000000 +0200
@@ -72,18 +72,6 @@ struct lock_class_key {
 	struct lockdep_subclass_key	subkeys[MAX_LOCKDEP_SUBCLASSES];
 };
 
-struct lock_contention_point {
-	unsigned long ip;
-	atomic_t count;
-};
-
-struct lock_time {
-	raw_spinlock_t lock;
-	unsigned long long min, max;
-	unsigned long long total;
-	unsigned long nr;
-};
-
 /*
  * The lock-class itself:
  */
@@ -128,17 +116,31 @@ struct lock_class {
 	const char			*name;
 	int				name_version;
 
+	unsigned long			contention_point[4];
+};
+
 #ifdef CONFIG_LOCK_STAT
-	atomic_t			read_contentions;
-	atomic_t			write_contentions;
-	struct lock_contention_point	contention_point[4];
+struct lock_time {
+	unsigned long long		min;
+	unsigned long long		max;
+	unsigned long long		total;
+	unsigned long			nr;
+};
+
+struct lock_class_stats {
+	unsigned long			read_contentions;
+	unsigned long			write_contentions;
+	unsigned long			contention_point[4];
 	struct lock_time		read_waittime;
 	struct lock_time		write_waittime;
 	struct lock_time		read_holdtime;
 	struct lock_time		write_holdtime;
-#endif
 };
 
+struct lock_class_stats lock_stats(struct lock_class *class);
+void clear_lock_stats(struct lock_class *class);
+#endif
+
 /*
  * Map the lock object (the lock instance) to the lock-class object.
  * This is embedded into specific lock instances:
Index: linux-2.6/kernel/lockdep.c
===================================================================
--- linux-2.6.orig/kernel/lockdep.c	2007-05-23 10:23:16.000000000 +0200
+++ linux-2.6/kernel/lockdep.c	2007-05-23 11:50:13.000000000 +0200
@@ -133,6 +133,105 @@ static struct lock_list *alloc_list_entr
 unsigned long nr_lock_classes;
 static struct lock_class lock_classes[MAX_LOCKDEP_KEYS];
 
+#ifdef CONFIG_LOCK_STAT
+static DEFINE_PER_CPU(struct lock_class_stats[MAX_LOCKDEP_KEYS], lock_stats);
+
+static int lock_contention_point(struct lock_class *class, unsigned long ip)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(class->contention_point); i++) {
+		if (class->contention_point[i] == 0) {
+			class->contention_point[i] = ip;
+			break;
+		}
+		if (class->contention_point[i] == ip)
+			break;
+	}
+
+	return i;
+}
+
+static inline void lock_contention_add(struct lock_class_stats *src,
+		struct lock_class_stats *dst)
+{
+	int i;
+
+	dst->read_contentions += src->read_contentions;
+	dst->write_contentions += src->write_contentions;
+	for (i = 0; i < ARRAY_SIZE(dst->contention_point); i++)
+		dst->contention_point[i] += src->contention_point[i];
+}
+
+static inline void lock_time_add(struct lock_time *src, struct lock_time *dst)
+{
+	dst->min += src->min;
+	dst->max += src->max;
+	dst->total += src->total;
+	dst->nr += src->nr;
+}
+
+static void lock_time_inc(struct lock_time *lt, unsigned long long time)
+{
+	if (time > lt->max)
+		lt->max = time;
+
+	if (time < lt->min || !lt->min)
+		lt->min = time;
+
+	lt->total += time;
+	lt->nr++;
+}
+
+struct lock_class_stats lock_stats(struct lock_class *class)
+{
+	struct lock_class_stats stats;
+	int cpu, i;
+
+	memset(&stats, 0, sizeof(struct lock_class_stats));
+	for_each_possible_cpu(cpu) {
+		struct lock_class_stats *pcs =
+			&per_cpu(lock_stats, cpu)[class - lock_classes];
+
+		stats.read_contentions += pcs->read_contentions;
+		stats.write_contentions += pcs->write_contentions;
+		for (i = 0; i < ARRAY_SIZE(stats.contention_point); i++)
+			stats.contention_point[i] += pcs->contention_point[i];
+
+		lock_time_add(&pcs->read_waittime, &stats.read_waittime);
+		lock_time_add(&pcs->write_waittime, &stats.write_waittime);
+
+		lock_time_add(&pcs->read_holdtime, &stats.read_holdtime);
+		lock_time_add(&pcs->write_holdtime, &stats.write_holdtime);
+	}
+
+	return stats;
+}
+
+void clear_lock_stats(struct lock_class *class)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		struct lock_class_stats *cpu_stats =
+			&per_cpu(lock_stats, cpu)[class - lock_classes];
+
+		memset(cpu_stats, 0, sizeof(struct lock_class_stats));
+	}
+	memset(class->contention_point, 0, sizeof(class->contention_point));
+}
+
+static struct lock_class_stats *get_lock_stats(struct lock_class *class)
+{
+	return &get_cpu_var(lock_stats)[class - lock_classes];
+}
+
+static void put_lock_stats(struct lock_class_stats *stats)
+{
+	put_cpu_var(lock_stats);
+}
+#endif
+
 /*
  * We keep a global list of all lock classes. The list only grows,
  * never shrinks. The list is only accessed with the lockdep
@@ -1290,12 +1389,6 @@ register_lock_class(struct lockdep_map *
 	INIT_LIST_HEAD(&class->lock_entry);
 	INIT_LIST_HEAD(&class->locks_before);
 	INIT_LIST_HEAD(&class->locks_after);
-#ifdef CONFIG_LOCK_STAT
-	class->read_waittime.lock = (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
-	class->write_waittime.lock = (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
-	class->read_holdtime.lock = (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
-	class->write_holdtime.lock = (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
-#endif
 	class->name_version = count_matching_names(class);
 	/*
 	 * We use RCU's safe list-add method to make
@@ -2349,22 +2442,9 @@ static int check_unlock(struct task_stru
 }
 
 #ifdef CONFIG_LOCK_STAT
-static void lock_time_add(struct lock_time *lt, unsigned long long wt)
-{
-	__raw_spin_lock(&lt->lock);
-	if (wt > lt->max)
-		lt->max = wt;
-
-	if (wt < lt->min || !lt->min)
-		lt->min = wt;
-
-	lt->total += wt;
-	lt->nr++;
-	__raw_spin_unlock(&lt->lock);
-}
-
 static void lock_release_holdtime(struct held_lock *hlock)
 {
+	struct lock_class_stats *stats;
 	unsigned long long holdtime;
 
 	if (!lock_stat)
@@ -2372,10 +2452,14 @@ static void lock_release_holdtime(struct
 
 	holdtime = sched_clock() - hlock->holdtime_stamp;
 
+	stats = get_lock_stats(hlock->class);
+
 	if (hlock->read)
-		lock_time_add(&hlock->class->read_holdtime, holdtime);
+		lock_time_inc(&stats->read_holdtime, holdtime);
 	else
-		lock_time_add(&hlock->class->write_holdtime, holdtime);
+		lock_time_inc(&stats->write_holdtime, holdtime);
+
+	put_lock_stats(stats);
 }
 #else
 static void lock_release_holdtime(struct held_lock *hlock)
@@ -2516,8 +2600,9 @@ __lock_contended(struct lockdep_map *loc
 {
 	struct task_struct *curr = current;
 	struct held_lock *hlock, *prev_hlock;
+	struct lock_class_stats *stats;
 	unsigned int depth;
-	int i;
+	int i, point;
 
 	depth = curr->lockdep_depth;
 	if (DEBUG_LOCKS_WARN_ON(!depth))
@@ -2541,22 +2626,18 @@ __lock_contended(struct lockdep_map *loc
 found_it:
 	hlock->waittime_stamp = sched_clock();
 
+	point = lock_contention_point(hlock->class, ip);
+
+	stats = get_lock_stats(hlock->class);
+	if (point < ARRAY_SIZE(stats->contention_point))
+		stats->contention_point[point]++;
+
 	if (hlock->read)
-		atomic_inc(&hlock->class->read_contentions);
+		stats->read_contentions++;
 	else
-		atomic_inc(&hlock->class->write_contentions);
+		stats->write_contentions++;
 
-	for (i = 0; i < ARRAY_SIZE(hlock->class->contention_point); i++) {
-		if (hlock->class->contention_point[i].ip == 0) {
-			hlock->class->contention_point[i].ip = ip;
-			atomic_set(&hlock->class->contention_point[i].count, 1);
-			break;
-		}
-		if (hlock->class->contention_point[i].ip == ip) {
-			atomic_inc(&hlock->class->contention_point[i].count);
-			break;
-		}
-	}
+	put_lock_stats(stats);
 }
 
 static void
@@ -2564,6 +2645,7 @@ __lock_acquired(struct lockdep_map *lock
 {
 	struct task_struct *curr = current;
 	struct held_lock *hlock, *prev_hlock;
+	struct lock_class_stats *stats;
 	unsigned int depth;
 	unsigned long long now, waittime;
 	int i;
@@ -2596,10 +2678,14 @@ found_it:
 
 	hlock->holdtime_stamp = now;
 
+	stats = get_lock_stats(hlock->class);
+
 	if (hlock->read)
-		lock_time_add(&hlock->class->read_waittime, waittime);
+		lock_time_inc(&stats->read_waittime, waittime);
 	else
-		lock_time_add(&hlock->class->write_waittime, waittime);
+		lock_time_inc(&stats->write_waittime, waittime);
+
+	put_lock_stats(stats);
 }
 #endif
 
Index: linux-2.6/kernel/lockdep_proc.c
===================================================================
--- linux-2.6.orig/kernel/lockdep_proc.c	2007-05-23 10:23:16.000000000 +0200
+++ linux-2.6/kernel/lockdep_proc.c	2007-05-23 10:36:45.000000000 +0200
@@ -347,27 +347,28 @@ static const struct file_operations proc
 #ifdef CONFIG_LOCK_STAT
 static int lock_contentions_show(struct seq_file *m, void *v)
 {
-	struct lock_contention_point *cp;
 	char sym[KSYM_SYMBOL_LEN];
 	struct lock_class *class;
-	int r, w, i;
+	struct lock_class_stats stats;
+	int i;
 
 	list_for_each_entry(class, &all_lock_classes, lock_entry) {
-		r = atomic_read(&class->read_contentions);
-		w = atomic_read(&class->write_contentions);
+		stats = lock_stats(class);
+
+		if (stats.read_contentions || stats.write_contentions) {
+			seq_printf(m, "%s: %lu %lu", class->name,
+					stats.write_contentions,
+					stats.read_contentions);
 
-		if (r || w) {
-			seq_printf(m, "%s: %d %d", class->name, w, r);
 			for (i = 0; i < ARRAY_SIZE(class->contention_point);
 					i++) {
-				cp = &class->contention_point[i];
-
-				if (cp->ip == 0)
+				if (class->contention_point[i] == 0)
 					break;
-				sprint_symbol(sym, cp->ip);
-				seq_printf(m, " [%d] [<%p>] %s",
-						atomic_read(&cp->count),
-						(void *)cp->ip, sym);
+				sprint_symbol(sym, class->contention_point[i]);
+				seq_printf(m, " %lu [<%p>] %s",
+					stats.contention_point[i],
+					(void *)class->contention_point[i],
+					sym);
 			}
 			seq_printf(m, "\n");
 		}
@@ -395,14 +396,8 @@ ssize_t lock_contentions_write(struct fi
 		if (c != '0')
 			return count;
 
-		list_for_each_entry(class, &all_lock_classes, lock_entry) {
-			atomic_set(&class->write_contentions, 0);
-			atomic_set(&class->read_contentions, 0);
-			for (i = 0; i < ARRAY_SIZE(class->contention_point);
-					i++) {
-				class->contention_point[i].ip = 0;
-			}
-		}
+		list_for_each_entry(class, &all_lock_classes, lock_entry)
+			clear_lock_stats(class);
 	}
 	return count;
 }
@@ -417,42 +412,24 @@ static const struct file_operations proc
 
 static void print_time(struct seq_file *m, struct lock_time *lt)
 {
-	unsigned long long min, total, max;
-	unsigned long nr;
-
-	__raw_spin_lock(&lt->lock);
-	min = lt->min;
-	total = lt->total;
-	max = lt->max;
-	nr = lt->nr;
-	__raw_spin_unlock(&lt->lock);
-
-	seq_printf(m, " %lu %llu %llu %llu", nr, min, max, total);
-}
-
-static void clear_time(struct lock_time *lt)
-{
-	__raw_spin_lock(&lt->lock);
-	lt->min = 0;
-	lt->total = 0;
-	lt->max = 0;
-	lt->nr = 0;
-	__raw_spin_unlock(&lt->lock);
+	seq_printf(m, " %lu %llu %llu %llu",
+			lt->nr, lt->min, lt->max, lt->total);
 }
 
 static int lock_waittime_show(struct seq_file *m, void *v)
 {
 	struct lock_class *class;
-	int r, w;
+	struct lock_class_stats stats;
 
 	list_for_each_entry(class, &all_lock_classes, lock_entry) {
-		r = atomic_read(&class->read_contentions);
-		w = atomic_read(&class->write_contentions);
+		stats = lock_stats(class);
 
-		if (r || w) {
-			seq_printf(m, "%s: %d %d", class->name, w, r);
-			print_time(m, &class->write_waittime);
-			print_time(m, &class->read_waittime);
+		if (stats.read_contentions || stats.write_contentions) {
+			seq_printf(m, "%s: %lu %lu", class->name,
+					stats.write_contentions,
+					stats.read_contentions);
+			print_time(m, &stats.write_waittime);
+			print_time(m, &stats.read_waittime);
 			seq_printf(m, "\n");
 		}
 	}
@@ -465,30 +442,9 @@ static int lock_waittime_open(struct ino
 	return single_open(file, lock_waittime_show, NULL);
 }
 
-ssize_t lock_waittime_write(struct file *file, const char __user *buf,
-		size_t count, loff_t *ppos)
-{
-	struct lock_class *class;
-	char c;
-
-	if (count) {
-		if (get_user(c, buf))
-			return -EFAULT;
-
-		if (c != '0')
-			return count;
-
-		list_for_each_entry(class, &all_lock_classes, lock_entry) {
-			clear_time(&class->read_waittime);
-			clear_time(&class->write_waittime);
-		}
-	}
-	return count;
-}
-
 static const struct file_operations proc_lock_waittime_operations = {
 	.open		= lock_waittime_open,
-	.write		= lock_waittime_write,
+	.write		= lock_contentions_write,
 	.read		= seq_read,
 	.llseek		= seq_lseek,
 	.release	= seq_release,
@@ -497,16 +453,17 @@ static const struct file_operations proc
 static int lock_holdtime_show(struct seq_file *m, void *v)
 {
 	struct lock_class *class;
-	int r, w;
+	struct lock_class_stats stats;
 
 	list_for_each_entry(class, &all_lock_classes, lock_entry) {
-		r = atomic_read(&class->read_contentions);
-		w = atomic_read(&class->write_contentions);
+		stats = lock_stats(class);
 
-		if (r || w) {
-			seq_printf(m, "%s: %d %d", class->name, w, r);
-			print_time(m, &class->write_holdtime);
-			print_time(m, &class->read_holdtime);
+		if (stats.read_contentions || stats.write_contentions) {
+			seq_printf(m, "%s: %lu %lu", class->name,
+					stats.write_contentions,
+					stats.read_contentions);
+			print_time(m, &stats.write_holdtime);
+			print_time(m, &stats.read_holdtime);
 			seq_printf(m, "\n");
 		}
 	}
@@ -519,30 +476,9 @@ static int lock_holdtime_open(struct ino
 	return single_open(file, lock_holdtime_show, NULL);
 }
 
-ssize_t lock_holdtime_write(struct file *file, const char __user *buf,
-		size_t count, loff_t *ppos)
-{
-	struct lock_class *class;
-	char c;
-
-	if (count) {
-		if (get_user(c, buf))
-			return -EFAULT;
-
-		if (c != '0')
-			return count;
-
-		list_for_each_entry(class, &all_lock_classes, lock_entry) {
-			clear_time(&class->read_holdtime);
-			clear_time(&class->write_holdtime);
-		}
-	}
-	return count;
-}
-
 static const struct file_operations proc_lock_holdtime_operations = {
 	.open		= lock_holdtime_open,
-	.write		= lock_holdtime_write,
+	.write		= lock_contentions_write,
 	.read		= seq_read,
 	.llseek		= seq_lseek,
 	.release	= seq_release,

-- 



* Re: [PATCH 0/7] lock contention tracking -v2
  2007-05-23  9:57 [PATCH 0/7] lock contention tracking -v2 Peter Zijlstra
                   ` (6 preceding siblings ...)
  2007-05-23  9:57 ` [PATCH 7/7] lockdep: scalable statistics Peter Zijlstra
@ 2007-05-23 10:33 ` Ingo Molnar
  2007-05-23 21:57   ` Bill Huey
  2007-05-23 15:10 ` [PATCH 8/7] new output format Peter Zijlstra
  8 siblings, 1 reply; 16+ messages in thread
From: Ingo Molnar @ 2007-05-23 10:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Bill Huey, Jason Baron, Steven Rostedt,
	Christoph Hellwig


* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> Use the lockdep infrastructure to track lock contention and other lock 
> statistics.
> 
> It tracks lock contention events, and the first four unique call-sites 
> that encountered contention.
> 
> It also measures lock wait-time and hold-time in nanoseconds. The 
> minimum and maximum times are tracked, as well as a total (which 
> together with the number of event can give the avg).
> 
> All statistics are done per lock class, per write (exclusive state) 
> and per read (shared state).
> 
> The statistics are collected per-cpu, so that the collection overhead 
> is minimized via having no global cachemisses.
> 
> This new lock statistics feature is independent of the lock dependency 
> checking traditionally done by lockdep; it just shares the lock 
> tracking code. It is also possible to enable both and runtime disabled 
> either component - thereby avoiding the O(n^2) lock chain walks for 
> instance.

really nice changes! The wait-time and hold-time changes should make it 
as capable as lockmeter and more: lockmeter only measured spinlocks, 
while your approach covers all lock types (spinlocks, rwlocks and 
mutexes).

The performance enhancements in -v2 should make it much more scalable 
than your first version was. (in fact i think it should be completely 
scalable as the statistics counters are all per-cpu, so there should be 
no cacheline bouncing at all from this)

Also, you've detached it from CONFIG_PROVE_LOCKING so the 
lock-dependency overhead should be gone too.

Looks really good to me!

 Acked-by: Ingo Molnar <mingo@elte.hu>

	Ingo

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/7] lockdep: lock contention tracking
  2007-05-23  9:57 ` [PATCH 3/7] lockdep: lock contention tracking Peter Zijlstra
@ 2007-05-23 14:40   ` Jason Baron
  2007-05-23 15:07     ` Peter Zijlstra
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Baron @ 2007-05-23 14:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Bill Huey, Steven Rostedt,
	Christoph Hellwig


On Wed, 23 May 2007, Peter Zijlstra wrote:

> Count lock contention events per lock class. Additionally track the first four
> callsites that resulted in the contention.
> 

I think that we need the total number of locking calls, not just the total 
number of *contended* locking calls, in order to make the data 
significant. Same for waittime. Yes, this pollutes the fastpath. In the 
contention case, it's one extra addition, and for the waittime, it's a 
call to sched_clock(). I don't think that's too much to pay for much 
more meaningful data.

thanks,

-Jason


* Re: [PATCH 3/7] lockdep: lock contention tracking
  2007-05-23 14:40   ` Jason Baron
@ 2007-05-23 15:07     ` Peter Zijlstra
  2007-05-23 16:11       ` Jason Baron
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2007-05-23 15:07 UTC (permalink / raw)
  To: Jason Baron
  Cc: linux-kernel, Ingo Molnar, Bill Huey, Steven Rostedt,
	Christoph Hellwig

On Wed, 2007-05-23 at 10:40 -0400, Jason Baron wrote:
> On Wed, 23 May 2007, Peter Zijlstra wrote:
> 
> > Count lock contention events per lock class. Additionally track the first four
> > callsites that resulted in the contention.
> > 
> 
> I think that we need the total number of locking calls, not just the total 
> number of *contended* locking calls, in order to make the data 
> significant. Same for waittime. Yes, this pollutes the fastpath. In the 
> contention case, it's one extra addition, and for the waittime, it's a 
> call to sched_clock(). I don't think that's too much to pay for much 
> more meaningful data.

The holdtime statistics add these numbers.



* [PATCH 8/7] new output format
  2007-05-23  9:57 [PATCH 0/7] lock contention tracking -v2 Peter Zijlstra
                   ` (7 preceding siblings ...)
  2007-05-23 10:33 ` [PATCH 0/7] lock contention tracking -v2 Ingo Molnar
@ 2007-05-23 15:10 ` Peter Zijlstra
  8 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2007-05-23 15:10 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Bill Huey, Jason Baron, Steven Rostedt,
	Christoph Hellwig

Perhaps this makes things more clear...

----

<edited output - real one is waaaay too big>

$ cat /proc/lock_stat
T                               class name    contentions   waittime-min   waittime-max waittime-total   acquisitions   holdtime-min   holdtime-max holdtime-total
------------------------------------------------------------------------------------------------------------------------------------------------------------------
X                              dcache_lock            543            478          23751         427570         166757            217         572426       90179147

                               dcache_lock            192          [<ffffffff802a5980>] d_instantiate+0x2a/0xad
                               dcache_lock            215          [<ffffffff8033b7a1>] _atomic_dec_and_lock+0x39/0x58
                               dcache_lock              2          [<ffffffff802a4703>] d_rehash+0x1b/0x44
                               dcache_lock              1          [<ffffffff802a5c53>] d_alloc+0x170/0x1d0

W                            tasklist_lock              9           1322          12870          21729           7543            593         125518       25759730
R                            tasklist_lock             36           1742          10475          76717           6104            417          56540        9121541

                             tasklist_lock             42          [<ffffffff80235240>] do_wait+0x7b/0xaa1
                             tasklist_lock              0          [<ffffffff802406a3>] sys_setpriority+0x4a/0x1ca
                             tasklist_lock              0          [<ffffffff80231786>] copy_process+0x116b/0x1689
                             tasklist_lock              0          [<ffffffff802361ac>] do_exit+0x4d2/0x878

X                       nl_table_wait.lock              0              0              0              0           1133            365         103424         581560
X                      cpufreq_driver_lock              0              0              0              0            122            438          44040         136777
X                       net_todo_run_mutex              0              0              0              0            121            636          42205         112073


Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/lockdep_proc.c |  241 ++++++++++++++++++++++++++------------------------
 1 file changed, 129 insertions(+), 112 deletions(-)

Index: linux-2.6/kernel/lockdep_proc.c
===================================================================
--- linux-2.6.orig/kernel/lockdep_proc.c	2007-05-23 12:08:18.000000000 +0200
+++ linux-2.6/kernel/lockdep_proc.c	2007-05-23 15:14:03.000000000 +0200
@@ -15,6 +15,8 @@
 #include <linux/seq_file.h>
 #include <linux/kallsyms.h>
 #include <linux/debug_locks.h>
+#include <linux/vmalloc.h>
+#include <linux/sort.h>
 #include <asm/uaccess.h>
 #include <asm/div64.h>
 
@@ -345,49 +347,145 @@ static const struct file_operations proc
 };
 
 #ifdef CONFIG_LOCK_STAT
-static int lock_contentions_show(struct seq_file *m, void *v)
-{
-	char sym[KSYM_SYMBOL_LEN];
+
+struct lock_stat_data {
 	struct lock_class *class;
 	struct lock_class_stats stats;
-	int i;
+};
+
+/*
+ * sort on absolute number of contentions
+ */
+int lock_stat_cmp(const void *l, const void *r)
+{
+	const struct lock_stat_data *dl = l, *dr = r;
+	unsigned long nl, nr;
+
+	nl = dl->stats.read_waittime.nr + dl->stats.write_waittime.nr;
+	nr = dr->stats.read_waittime.nr + dr->stats.write_waittime.nr;
+
+	return nr - nl;
+}
+
+static int lock_stat_show(struct seq_file *m, void *v)
+{
+	struct lock_class *class;
+	struct lock_stat_data *stat, *stat_end;
+	struct lock_stat_data *stat_snapshot =
+		vmalloc(sizeof(struct lock_stat_data) * MAX_LOCKDEP_KEYS);
+
+	if (!stat_snapshot) {
+		seq_printf(m, "-ENOMEM\n");
+		return 0;
+	}
+
+	stat_end = stat_snapshot;
 
 	list_for_each_entry(class, &all_lock_classes, lock_entry) {
-		stats = lock_stats(class);
+		stat_end->class = class;
+		stat_end->stats = lock_stats(class);
+		stat_end++;
+	}
 
-		if (stats.read_contentions || stats.write_contentions) {
-			seq_printf(m, "%s: %lu %lu", class->name,
-					stats.write_contentions,
-					stats.read_contentions);
-
-			for (i = 0; i < ARRAY_SIZE(class->contention_point);
-					i++) {
-				if (class->contention_point[i] == 0)
-					break;
-				sprint_symbol(sym, class->contention_point[i]);
-				seq_printf(m, " %lu [<%p>] %s",
-					stats.contention_point[i],
-					(void *)class->contention_point[i],
-					sym);
-			}
-			seq_printf(m, "\n");
+	sort(stat_snapshot, stat_end - stat_snapshot,
+			sizeof(struct lock_stat_data),
+			lock_stat_cmp, NULL);
+
+	seq_printf(m, "%1s %40s %14s %14s %14s %14s %14s %14s %14s %14s\n",
+			"T",
+			"class name",
+			"contentions",
+			"waittime-min",
+			"waittime-max",
+			"waittime-total",
+			"acquisitions",
+			"holdtime-min",
+			"holdtime-max",
+			"holdtime-total");
+	seq_printf(m, "-" "-"
+		      "----------------------------------------" "-"
+	              "--------------" "-"
+	              "--------------" "-"
+	              "--------------" "-"
+	              "--------------" "-"
+	              "--------------" "-"
+	              "--------------" "-"
+	              "--------------" "-"
+	              "--------------" "\n");
+
+	for (stat = stat_snapshot; stat != stat_end; stat++) {
+		struct lock_class_stats *stats = &stat->stats;
+		int i;
+
+		class = stat->class;
+
+		if (stats->write_waittime.nr ||
+				stats->write_holdtime.nr)
+			seq_printf(m, "%c %40s %14lu %14llu %14llu %14llu"
+					" %14lu %14llu %14llu %14llu\n",
+					(stats->read_holdtime.nr ? 'W' : 'X'),
+					class->name,
+					stats->write_waittime.nr,
+					stats->write_waittime.min,
+					stats->write_waittime.max,
+					stats->write_waittime.total,
+					stats->write_holdtime.nr,
+					stats->write_holdtime.min,
+					stats->write_holdtime.max,
+					stats->write_holdtime.total);
+
+		if (stats->read_waittime.nr ||
+				stats->read_holdtime.nr)
+			seq_printf(m, "%c %40s %14lu %14llu %14llu %14llu"
+					" %14lu %14llu %14llu %14llu\n",
+					'R',
+					class->name,
+					stats->read_waittime.nr,
+					stats->read_waittime.min,
+					stats->read_waittime.max,
+					stats->read_waittime.total,
+					stats->read_holdtime.nr,
+					stats->read_holdtime.min,
+					stats->read_holdtime.max,
+					stats->read_holdtime.total);
+
+		for (i = 0; i < ARRAY_SIZE(class->contention_point); i++) {
+			char sym[KSYM_SYMBOL_LEN];
+			char ip[32];
+
+			if (class->contention_point[i] == 0)
+				break;
+
+			if (!i)
+				seq_printf(m, "\n");
+
+			sprint_symbol(sym, class->contention_point[i]);
+			snprintf(ip, sizeof(ip), "[<%p>]",
+					(void *)class->contention_point[i]);
+			seq_printf(m, "  %40s %14lu %29s %s\n",
+					class->name,
+					stats->contention_point[i],
+					ip, sym);
 		}
+		if (i)
+			seq_printf(m, "\n");
 	}
 
+	vfree(stat_snapshot);
+
 	return 0;
 }
 
-static int lock_contentions_open(struct inode *inode, struct file *file)
+static int lock_stat_open(struct inode *inode, struct file *file)
 {
-	return single_open(file, lock_contentions_show, NULL);
+	return single_open(file, lock_stat_show, NULL);
 }
 
-ssize_t lock_contentions_write(struct file *file, const char __user *buf,
+ssize_t lock_stat_write(struct file *file, const char __user *buf,
 		size_t count, loff_t *ppos)
 {
 	struct lock_class *class;
 	char c;
-	int i;
 
 	if (count) {
 		if (get_user(c, buf))
@@ -402,83 +501,9 @@ ssize_t lock_contentions_write(struct fi
 	return count;
 }
 
-static const struct file_operations proc_lock_contentions_operations = {
-	.open		= lock_contentions_open,
-	.write		= lock_contentions_write,
-	.read		= seq_read,
-	.llseek		= seq_lseek,
-	.release	= seq_release,
-};
-
-static void print_time(struct seq_file *m, struct lock_time *lt)
-{
-	seq_printf(m, " %lu %llu %llu %llu",
-			lt->nr, lt->min, lt->max, lt->total);
-}
-
-static int lock_waittime_show(struct seq_file *m, void *v)
-{
-	struct lock_class *class;
-	struct lock_class_stats stats;
-
-	list_for_each_entry(class, &all_lock_classes, lock_entry) {
-		stats = lock_stats(class);
-
-		if (stats.read_contentions || stats.write_contentions) {
-			seq_printf(m, "%s: %lu %lu", class->name,
-					stats.write_contentions,
-					stats.read_contentions);
-			print_time(m, &stats.write_waittime);
-			print_time(m, &stats.read_waittime);
-			seq_printf(m, "\n");
-		}
-	}
-
-	return 0;
-}
-
-static int lock_waittime_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, lock_waittime_show, NULL);
-}
-
-static const struct file_operations proc_lock_waittime_operations = {
-	.open		= lock_waittime_open,
-	.write		= lock_contentions_write,
-	.read		= seq_read,
-	.llseek		= seq_lseek,
-	.release	= seq_release,
-};
-
-static int lock_holdtime_show(struct seq_file *m, void *v)
-{
-	struct lock_class *class;
-	struct lock_class_stats stats;
-
-	list_for_each_entry(class, &all_lock_classes, lock_entry) {
-		stats = lock_stats(class);
-
-		if (stats.read_contentions || stats.write_contentions) {
-			seq_printf(m, "%s: %lu %lu", class->name,
-					stats.write_contentions,
-					stats.read_contentions);
-			print_time(m, &stats.write_holdtime);
-			print_time(m, &stats.read_holdtime);
-			seq_printf(m, "\n");
-		}
-	}
-
-	return 0;
-}
-
-static int lock_holdtime_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, lock_holdtime_show, NULL);
-}
-
-static const struct file_operations proc_lock_holdtime_operations = {
-	.open		= lock_holdtime_open,
-	.write		= lock_contentions_write,
+static const struct file_operations proc_lock_stat_operations = {
+	.open		= lock_stat_open,
+	.write		= lock_stat_write,
 	.read		= seq_read,
 	.llseek		= seq_lseek,
 	.release	= seq_release,
@@ -498,17 +523,9 @@ static int __init lockdep_proc_init(void
 		entry->proc_fops = &proc_lockdep_stats_operations;
 
 #ifdef CONFIG_LOCK_STAT
-	entry = create_proc_entry("lock_contentions", S_IRUSR, NULL);
-	if (entry)
-		entry->proc_fops = &proc_lock_contentions_operations;
-
-	entry = create_proc_entry("lock_waittime", S_IRUSR, NULL);
-	if (entry)
-		entry->proc_fops = &proc_lock_waittime_operations;
-
-	entry = create_proc_entry("lock_holdtime", S_IRUSR, NULL);
+	entry = create_proc_entry("lock_stat", S_IRUSR, NULL);
 	if (entry)
-		entry->proc_fops = &proc_lock_holdtime_operations;
+		entry->proc_fops = &proc_lock_stat_operations;
 #endif
 
 	return 0;




* Re: [PATCH 3/7] lockdep: lock contention tracking
  2007-05-23 15:07     ` Peter Zijlstra
@ 2007-05-23 16:11       ` Jason Baron
  2007-05-23 17:23         ` Peter Zijlstra
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Baron @ 2007-05-23 16:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Bill Huey, Steven Rostedt,
	Christoph Hellwig


On Wed, 23 May 2007, Peter Zijlstra wrote:

> On Wed, 2007-05-23 at 10:40 -0400, Jason Baron wrote:
> > On Wed, 23 May 2007, Peter Zijlstra wrote:
> > 
> > > Count lock contention events per lock class. Additionally track the first four
> > > callsites that resulted in the contention.
> > > 
> > 
> > I think that we need the total number of locking calls, not just the total 
> > number of *contended* locking calls, in order to make the data 
> significant. Same for waittime. Yes, this pollutes the fastpath. In the 
> contention case, it's one extra addition, and for the waittime, it's a 
> call to sched_clock(). I don't think that's too much to pay for much 
> more meaningful data.
> 
> The holdtime statistics add these numbers.
> 

ok, i see what you are saying...however, the 'waittime' statistic as 
implemented is only recorded when we don't get the lock immediately. 
Thus, it's really measuring the waittime when there is lock contention. I 
understand that in the non-contended case we are only talking a few 
cycles, but the fact that the non-contended case is not counted as another 
waittime of even zero length (so no measurement is required) might skew 
the stats a bit. For example, if there was a lock that was almost never 
contended but one time happened to be contended for a long time, its 
average wait time would look really high.

thanks,

-Jason 


* Re: [PATCH 3/7] lockdep: lock contention tracking
  2007-05-23 16:11       ` Jason Baron
@ 2007-05-23 17:23         ` Peter Zijlstra
  2007-05-23 17:24           ` Jason Baron
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2007-05-23 17:23 UTC (permalink / raw)
  To: Jason Baron
  Cc: linux-kernel, Ingo Molnar, Bill Huey, Steven Rostedt,
	Christoph Hellwig

On Wed, 2007-05-23 at 12:11 -0400, Jason Baron wrote:
> On Wed, 23 May 2007, Peter Zijlstra wrote:
> 
> > On Wed, 2007-05-23 at 10:40 -0400, Jason Baron wrote:
> > > On Wed, 23 May 2007, Peter Zijlstra wrote:
> > > 
> > > > Count lock contention events per lock class. Additionally track the first four
> > > > callsites that resulted in the contention.
> > > > 
> > > 
> > > I think that we need the total number of locking calls, not just the total 
> > > number of *contended* locking calls, in order to make the data 
> > significant. Same for waittime. Yes, this pollutes the fastpath. In the 
> > contention case, it's one extra addition, and for the waittime, it's a 
> > call to sched_clock(). I don't think that's too much to pay for much 
> > more meaningful data.
> > 
> > The holdtime statistics add these numbers.
> > 
> 
> ok, i see what you are saying...however, the 'waittime' statistic as 
> implemented is only recorded when we don't get the lock immediately. 
> Thus, it's really measuring the waittime when there is lock contention. I 
> understand that in the non-contended case we are only talking a few 
> cycles, but the fact that the non-contended case is not counted as another 
> waittime of even zero length (so no measurement is required) might skew 
> the stats a bit. For example, if there was a lock that was almost never 
> contended but one time happened to be contended for a long time, its 
> average wait time would look really high.

I'm not seeing how or why this is a problem. The number of contentions
is reported on the same line, so it should be obvious.

Your definition of wait-time is also obtainable from the numbers: one
would just divide waittime-total by acquisitions instead of
contentions.






* Re: [PATCH 3/7] lockdep: lock contention tracking
  2007-05-23 17:23         ` Peter Zijlstra
@ 2007-05-23 17:24           ` Jason Baron
  0 siblings, 0 replies; 16+ messages in thread
From: Jason Baron @ 2007-05-23 17:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Bill Huey, Steven Rostedt,
	Christoph Hellwig


On Wed, 23 May 2007, Peter Zijlstra wrote:

> On Wed, 2007-05-23 at 12:11 -0400, Jason Baron wrote:
> > On Wed, 23 May 2007, Peter Zijlstra wrote:
> > 
> > > On Wed, 2007-05-23 at 10:40 -0400, Jason Baron wrote:
> > > > On Wed, 23 May 2007, Peter Zijlstra wrote:
> > > > 
> > > > > Count lock contention events per lock class. Additionally track the first four
> > > > > callsites that resulted in the contention.
> > > > > 
> > > > 
> > > > I think that we need the total number of locking calls, not just the total 
> > > > number of *contended* locking calls, in order to make the data 
> > > significant. Same for waittime. Yes, this pollutes the fastpath. In the 
> > > contention case, it's one extra addition, and for the waittime, it's a 
> > > call to sched_clock(). I don't think that's too much to pay for much 
> > > more meaningful data.
> > > 
> > > The holdtime statistics add these numbers.
> > > 
> > 
> > ok, i see what you are saying...however, the 'waittime' statistic as 
> > implemented is only recorded when we don't get the lock immediately. 
> > Thus, it's really measuring the waittime when there is lock contention. I 
> > understand that in the non-contended case we are only talking a few 
> > cycles, but the fact that the non-contended case is not counted as another 
> > waittime of even zero length (so no measurement is required) might skew 
> > the stats a bit. For example, if there was a lock that was almost never 
> > contended but one time happened to be contended for a long time, its 
> > average wait time would look really high.
> 
> I'm not seeing how or why this is a problem. The number of contentions
> is reported on the same line, so it should be obvious.
> 
> Your definition of wait-time is also obtainable from the numbers, one
> would just divide waittime-total by acquisitions, instead of
> contentions.
> 

agreed, I just want to point out that under my definition of waittime, I 
would have to make the assumption that the waittime is 0 for a lock that 
is acquired without falling back to the slowpath. For a spinlock 
acquisition, this is likely a fair assumption; however, the trylock path 
for semaphores can be longer.

Acked-by: Jason Baron <jbaron@redhat.com>




* Re: [PATCH 0/7] lock contention tracking -v2
  2007-05-23 10:33 ` [PATCH 0/7] lock contention tracking -v2 Ingo Molnar
@ 2007-05-23 21:57   ` Bill Huey
  0 siblings, 0 replies; 16+ messages in thread
From: Bill Huey @ 2007-05-23 21:57 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, linux-kernel, Jason Baron, Steven Rostedt,
	Christoph Hellwig, Bill Huey (hui)

On Wed, May 23, 2007 at 12:33:11PM +0200, Ingo Molnar wrote:
> * Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
...
> > It also measures lock wait-time and hold-time in nanoseconds. The 
> > minimum and maximum times are tracked, as well as a total (which 
> > together with the number of event can give the avg).
> > 
> > All statistics are done per lock class, per write (exclusive state) 
> > and per read (shared state).
> > 
> > The statistics are collected per-cpu, so that the collection overhead 
> > is minimized via having no global cachemisses.
...
> really nice changes! The wait-time and hold-time changes should make it 
> as capable as lockmeter and more: lockmeter only measured spinlocks, 
> while your approach covers all lock types (spinlocks, rwlocks and 
> mutexes).
> 
> The performance enhancements in -v2 should make it much more scalable 
> than your first version was. (in fact i think it should be completely 
> scalable as the statistics counters are all per-cpu, so there should be 
> no cacheline bouncing at all from this)

per cpu is pretty important since you can potentially hit that logic more
often with your wait-time code. You don't want to affect the actual
measurement with the measurement code. It's that uncertainty principle thing.

It is looking pretty good. :) You might like to pretty up the output even
more, but it's quite usable as is.

bill



end of thread, other threads:[~2007-05-23 21:58 UTC | newest]

Thread overview: 16+ messages
2007-05-23  9:57 [PATCH 0/7] lock contention tracking -v2 Peter Zijlstra
2007-05-23  9:57 ` [PATCH 1/7] fix raw_spinlock_t vs lockdep Peter Zijlstra
2007-05-23  9:57 ` [PATCH 2/7] lockdep: isolate lock graph walking Peter Zijlstra
2007-05-23  9:57 ` [PATCH 3/7] lockdep: lock contention tracking Peter Zijlstra
2007-05-23 14:40   ` Jason Baron
2007-05-23 15:07     ` Peter Zijlstra
2007-05-23 16:11       ` Jason Baron
2007-05-23 17:23         ` Peter Zijlstra
2007-05-23 17:24           ` Jason Baron
2007-05-23  9:57 ` [PATCH 4/7] lockdep: add waittime to the lock statistics Peter Zijlstra
2007-05-23  9:57 ` [PATCH 5/7] lockdep: add holdtime " Peter Zijlstra
2007-05-23  9:57 ` [PATCH 6/7] lockdep: runtime configure prove_locking and lock_stat Peter Zijlstra
2007-05-23  9:57 ` [PATCH 7/7] lockdep: scalable statistics Peter Zijlstra
2007-05-23 10:33 ` [PATCH 0/7] lock contention tracking -v2 Ingo Molnar
2007-05-23 21:57   ` Bill Huey
2007-05-23 15:10 ` [PATCH 8/7] new output format Peter Zijlstra
