All of lore.kernel.org
 help / color / mirror / Atom feed
* [Patch] support of lock profiling in Xen
@ 2009-10-08  7:48 Juergen Gross
  2009-10-08  8:15 ` Keir Fraser
  2009-10-08  8:21 ` Keir Fraser
  0 siblings, 2 replies; 17+ messages in thread
From: Juergen Gross @ 2009-10-08  7:48 UTC (permalink / raw)
  To: xen-devel@lists.xensource.com

[-- Attachment #1: Type: text/plain, Size: 1975 bytes --]

Hi,

attached is a patch to support lock profiling in the Xen hypervisor.
The feature is switchable at compile time (like software performance counters)
in Rules.mk. If lock profiling is not activated, code and data structures are
the same as without the patch. With activated lock profiling some statistical
data is added to each spinlock_t structure and locking/unlocking keeps track
for each lock of:
- number of successful locks
- cumulated time of lock held
- number of locking calls leading to wait cycles
- cumulated waiting time

The profile data may be obtained either via hotkey from the console or with
the new tool xenlockprof.

To be able to find all locks for printing out the profile data, a linked list
with the locks is built up. Locks defined via DEFINE_SPINLOCK are added to
this list automatically, locks in dynamically allocated data structures have
to be added (and removed) explicitly to the queue to be able to print the
profile data.
Adding is done via spin_lock_init_prof (there are two additional calls with
one or two index parameters to be able to distinguish locks e.g. per domid and
vcpu-nr), removing is done via spin_lock_destroy_prof (may be called for a
completely zeroed structure as well).

I'm not completely satisfied with the solution for the dynamically initialized
locks, but I had no better idea in the first run.
Another enhancement would be to expand the profiling to rw-locks as well, but
this would have required a rewrite of the lock routines using try_lock like
for spinlocks. I could do this if the lock profiling is accepted.

Comments welcome :-)


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 636 47950
Fujitsu Technolgy Solutions               e-mail: juergen.gross@ts.fujitsu.com
Otto-Hahn-Ring 6                        Internet: ts.fujitsu.com
D-81739 Muenchen                 Company details: ts.fujitsu.com/imprint.html

[-- Attachment #2: lock_prof.patch --]
[-- Type: text/x-patch, Size: 22419 bytes --]

Support of lock profiling in Xen

Signed-off-by: juergen.gross@ts.fujitsu.com

diff -r 729567f615c1 tools/libxc/xc_misc.c
--- a/tools/libxc/xc_misc.c	Wed Oct 07 16:29:03 2009 +0100
+++ b/tools/libxc/xc_misc.c	Thu Oct 08 09:18:52 2009 +0200
@@ -121,6 +121,30 @@
     return rc;
 }
 
+int xc_lockprof_control(int xc_handle,
+                        uint32_t opcode,
+                        uint32_t *n_elems,
+                        uint64_t *time,
+                        xc_lockprof_data_t *data)
+{
+    int rc;
+    DECLARE_SYSCTL;
+
+    sysctl.cmd = XEN_SYSCTL_lockprof_op;
+    sysctl.u.lockprof_op.cmd = opcode;
+    sysctl.u.lockprof_op.max_elem = n_elems ? *n_elems : 0;
+    set_xen_guest_handle(sysctl.u.lockprof_op.data, data);
+
+    rc = do_sysctl(xc_handle, &sysctl);
+
+    if (n_elems)
+        *n_elems = sysctl.u.lockprof_op.nr_elem;
+    if (time)
+        *time = sysctl.u.lockprof_op.time;
+
+    return rc;
+}
+
 int xc_getcpuinfo(int xc_handle, int max_cpus,
                   xc_cpuinfo_t *info, int *nr_cpus)
 {
diff -r 729567f615c1 tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h	Wed Oct 07 16:29:03 2009 +0100
+++ b/tools/libxc/xenctrl.h	Thu Oct 08 09:18:52 2009 +0200
@@ -700,6 +700,14 @@
                      int *nbr_desc,
                      int *nbr_val);
 
+typedef xen_sysctl_lockprof_data_t xc_lockprof_data_t;
+/* IMPORTANT: The caller is responsible for mlock()'ing the @data array. */
+int xc_lockprof_control(int xc_handle,
+                        uint32_t opcode,
+                        uint32_t *n_elems,
+                        uint64_t *time,
+                        xc_lockprof_data_t *data);
+
 /**
  * Memory maps a range within one domain to a local address range.  Mappings
  * should be unmapped with munmap and should follow the same rules as mmap
diff -r 729567f615c1 tools/misc/Makefile
--- a/tools/misc/Makefile	Wed Oct 07 16:29:03 2009 +0100
+++ b/tools/misc/Makefile	Thu Oct 08 09:18:52 2009 +0200
@@ -10,7 +10,7 @@
 
 HDRS     = $(wildcard *.h)
 
-TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat
+TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat xenlockprof
 TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx
 TARGETS := $(TARGETS-y)
 
@@ -22,7 +22,7 @@
 INSTALL_BIN-$(CONFIG_X86) += xen-detect
 INSTALL_BIN := $(INSTALL_BIN-y)
 
-INSTALL_SBIN-y := xm xen-bugtool xen-python-path xend xenperf xsview xenpm xen-tmem-list-parse gtraceview gtracestat
+INSTALL_SBIN-y := xm xen-bugtool xen-python-path xend xenperf xsview xenpm xen-tmem-list-parse gtraceview gtracestat xenlockprof
 INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx
 INSTALL_SBIN := $(INSTALL_SBIN-y)
 
@@ -49,7 +49,7 @@
 %.o: %.c $(HDRS) Makefile
 	$(CC) -c $(CFLAGS) -o $@ $<
 
-xen-hvmctx xenperf xenpm gtracestat: %: %.o Makefile
+xen-hvmctx xenperf xenpm gtracestat xenlockprof: %: %.o Makefile
 	$(CC) $(CFLAGS) -o $@ $< $(LDFLAGS) $(LDFLAGS_libxenctrl)
 
 gtraceview: %: %.o Makefile
diff -r 729567f615c1 tools/misc/xenlockprof.c
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/misc/xenlockprof.c	Thu Oct 08 09:18:52 2009 +0200
@@ -0,0 +1,129 @@
+/* -*-  Mode:C; c-basic-offset:4; tab-width:4 -*-
+ ****************************************************************************
+ * (C) 2009 - Juergen Gross - Fujitsu Technology Solutions
+ ****************************************************************************
+ *
+ *        File: xenlockprof.c
+ *      Author: Juergen Gross (juergen.gross@ts.fujitsu.com)
+ *        Date: Oct 2009
+ * 
+ * Description: 
+ */
+
+#include <xenctrl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <string.h>
+
+static int lock_pages(void *addr, size_t len)
+{
+    int e = 0;
+#ifndef __sun__
+    e = mlock(addr, len);
+#endif
+    return (e);
+}
+
+static void unlock_pages(void *addr, size_t len)
+{
+#ifndef __sun__
+    munlock(addr, len);
+#endif
+}
+
+int main(int argc, char *argv[])
+{
+    int                xc_handle;
+    uint32_t           i, j, n;
+    uint64_t           time;
+    double             l, b, sl, sb;
+    char               name[60];
+    xc_lockprof_data_t *data;
+
+    if ((argc > 2) || ((argc == 2) && (strcmp(argv[1], "-r") != 0)))
+    {
+        printf("%s: [-r]\n", argv[0]);
+        printf("no args: print lock profile data\n");
+        printf("    -r : reset profile data\n");
+        return 1;
+    }
+
+    if ((xc_handle = xc_interface_open()) == -1)
+    {
+        fprintf(stderr, "Error opening xc interface: %d (%s)\n",
+                errno, strerror(errno));
+        return 1;
+    }
+
+    if (argc > 1)
+    {
+        if (xc_lockprof_control(xc_handle, XEN_SYSCTL_LOCKPROF_reset, NULL,
+                                NULL, NULL) != 0)
+        {
+            fprintf(stderr, "Error reseting profile data: %d (%s)\n",
+                    errno, strerror(errno));
+            return 1;
+        }
+        return 1;
+    }
+
+    n = 0;
+    if (xc_lockprof_control(xc_handle, XEN_SYSCTL_LOCKPROF_query, &n,
+                            NULL, NULL) != 0)
+    {
+        fprintf(stderr, "Error getting number of profile records: %d (%s)\n",
+                errno, strerror(errno));
+        return 1;
+    }
+
+    n += 32;    /* just to be sure */
+    data = malloc(sizeof(*data) * n);
+    if ((data == NULL) || (lock_pages(data, sizeof(*data) * n) != 0))
+    {
+        fprintf(stderr, "Could not alloc or lock buffers: %d (%s)\n",
+                errno, strerror(errno));
+        return 1;
+    }
+
+    i = n;
+    if ( xc_lockprof_control(xc_handle, XEN_SYSCTL_LOCKPROF_query, &i,
+                             &time, data) != 0)
+    {
+        fprintf(stderr, "Error getting profile records: %d (%s)\n",
+                errno, strerror(errno));
+        return 1;
+    }
+
+    unlock_pages(data, sizeof(*data) * n);
+
+    if (i > n)
+    {
+        printf("data incomplete, %d records are missing!\n\n", i - n);
+        i = n;
+    }
+    sl = 0;
+    sb = 0;
+    for (j = 0; j < i; j++)
+    {
+        if (data[j].idx2 >= 0)
+            sprintf(name, data[j].name, data[j].idx1, data[j].idx2);
+        else if (data[j].idx1 >= 0)
+            sprintf(name, data[j].name, data[j].idx1);
+        else
+            sprintf(name, data[j].name);
+        l = (double)(data[j].lock_time) / 1E+09;
+        b = (double)(data[j].block_time) / 1E+09;
+        sl += l;
+        sb += b;
+        printf("%-50s: lock:%12ld(%20.9fs), block:%12ld(%20.9fs)\n",
+            name, data[j].lock_cnt, l, data[j].block_cnt, b);
+    }
+    l = (double)time / 1E+09;
+    printf("total profiling time: %20.9fs\n", l);
+    printf("total locked time:    %20.9fs\n", sl);
+    printf("total blocked time:   %20.9fs\n", sb);
+
+    return 0;
+}
diff -r 729567f615c1 xen/Rules.mk
--- a/xen/Rules.mk	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/Rules.mk	Thu Oct 08 09:18:52 2009 +0200
@@ -6,6 +6,7 @@
 verbose       ?= n
 perfc         ?= n
 perfc_arrays  ?= n
+lock_profile  ?= n
 crash_debug   ?= n
 frame_pointer ?= n
 
@@ -49,6 +50,7 @@
 CFLAGS-$(crash_debug)   += -DCRASH_DEBUG
 CFLAGS-$(perfc)         += -DPERF_COUNTERS
 CFLAGS-$(perfc_arrays)  += -DPERF_ARRAYS
+CFLAGS-$(lock_profile)  += -DLOCK_PROFILE
 CFLAGS-$(frame_pointer) += -fno-omit-frame-pointer -DCONFIG_FRAME_POINTER
 
 ifneq ($(max_phys_cpus),)
diff -r 729567f615c1 xen/arch/x86/xen.lds.S
--- a/xen/arch/x86/xen.lds.S	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/arch/x86/xen.lds.S	Thu Oct 08 09:18:52 2009 +0200
@@ -62,6 +62,13 @@
        *(.data.read_mostly)
   } :text
 
+#ifdef LOCK_PROFILE
+  . = ALIGN(32);
+  __lock_profile_start = .;
+  .lockprofile.data : { *(.lockprofile.data) } :text
+  __lock_profile_end = .;
+#endif
+
   . = ALIGN(4096);             /* Init code and data */
   __init_begin = .;
   .init.text : {
diff -r 729567f615c1 xen/common/keyhandler.c
--- a/xen/common/keyhandler.c	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/common/keyhandler.c	Thu Oct 08 09:18:52 2009 +0200
@@ -347,6 +347,20 @@
 };
 #endif
 
+#ifdef LOCK_PROFILE
+extern void spinlock_profile_printall(unsigned char key);
+static struct keyhandler spinlock_printall_keyhandler = {
+    .diagnostic = 1,
+    .u.fn = spinlock_profile_printall,
+    .desc = "print lock profile info"
+};
+extern void spinlock_profile_reset(unsigned char key);
+static struct keyhandler spinlock_reset_keyhandler = {
+    .u.fn = spinlock_profile_reset,
+    .desc = "reset lock profile info"
+};
+#endif
+
 static void run_all_nonirq_keyhandlers(unsigned long unused)
 {
     /* Fire all the non-IRQ-context diagnostic keyhandlers */
@@ -428,6 +442,12 @@
     register_keyhandler('p', &perfc_printall_keyhandler);
     register_keyhandler('P', &perfc_reset_keyhandler);
 #endif
+
+#ifdef LOCK_PROFILE
+    register_keyhandler('l', &spinlock_printall_keyhandler);
+    register_keyhandler('L', &spinlock_reset_keyhandler);
+#endif
+
 }
 
 /*
diff -r 729567f615c1 xen/common/spinlock.c
--- a/xen/common/spinlock.c	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/common/spinlock.c	Thu Oct 08 09:18:52 2009 +0200
@@ -1,7 +1,11 @@
+#include <xen/lib.h>
 #include <xen/config.h>
 #include <xen/irq.h>
 #include <xen/smp.h>
+#include <xen/time.h>
 #include <xen/spinlock.h>
+#include <xen/guest_access.h>
+#include <public/sysctl.h>
 #include <asm/processor.h>
 
 #ifndef NDEBUG
@@ -41,56 +45,97 @@
 
 #endif
 
+#ifdef LOCK_PROFILE
+
+#define LOCK_PROFILE_REL                                          \
+    lock->profile.time_hold += NOW() - lock->profile.time_locked; \
+    lock->profile.lock_cnt++
+#define LOCK_PROFILE_VAR    s_time_t block = 0
+#define LOCK_PROFILE_BLOCK  block = block ? : NOW();
+#define LOCK_PROFILE_GOT                                          \
+    lock->profile.time_locked = NOW();                            \
+    if (block)                                                    \
+    {                                                             \
+        lock->profile.time_block += lock->profile.time_locked - block; \
+        lock->profile.block_cnt++;                                \
+    }
+
+#else
+
+#define LOCK_PROFILE_REL
+#define LOCK_PROFILE_VAR
+#define LOCK_PROFILE_BLOCK
+#define LOCK_PROFILE_GOT
+
+#endif
+
 void _spin_lock(spinlock_t *lock)
 {
+    LOCK_PROFILE_VAR;
+
     check_lock(&lock->debug);
     while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
+    {
+        LOCK_PROFILE_BLOCK;
         while ( likely(_raw_spin_is_locked(&lock->raw)) )
             cpu_relax();
+    }
+    LOCK_PROFILE_GOT;
 }
 
 void _spin_lock_irq(spinlock_t *lock)
 {
+    LOCK_PROFILE_VAR;
+
     ASSERT(local_irq_is_enabled());
     local_irq_disable();
     check_lock(&lock->debug);
     while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
     {
+        LOCK_PROFILE_BLOCK;
         local_irq_enable();
         while ( likely(_raw_spin_is_locked(&lock->raw)) )
             cpu_relax();
         local_irq_disable();
     }
+    LOCK_PROFILE_GOT;
 }
 
 unsigned long _spin_lock_irqsave(spinlock_t *lock)
 {
     unsigned long flags;
+    LOCK_PROFILE_VAR;
+
     local_irq_save(flags);
     check_lock(&lock->debug);
     while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
     {
+        LOCK_PROFILE_BLOCK;
         local_irq_restore(flags);
         while ( likely(_raw_spin_is_locked(&lock->raw)) )
             cpu_relax();
         local_irq_save(flags);
     }
+    LOCK_PROFILE_GOT;
     return flags;
 }
 
 void _spin_unlock(spinlock_t *lock)
 {
+    LOCK_PROFILE_REL;
     _raw_spin_unlock(&lock->raw);
 }
 
 void _spin_unlock_irq(spinlock_t *lock)
 {
+    LOCK_PROFILE_REL;
     _raw_spin_unlock(&lock->raw);
     local_irq_enable();
 }
 
 void _spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags)
 {
+    LOCK_PROFILE_REL;
     _raw_spin_unlock(&lock->raw);
     local_irq_restore(flags);
 }
@@ -104,13 +149,32 @@
 int _spin_trylock(spinlock_t *lock)
 {
     check_lock(&lock->debug);
+#ifndef LOCK_PROFILE
     return _raw_spin_trylock(&lock->raw);
+#else
+    if (!_raw_spin_trylock(&lock->raw)) return 0;
+    lock->profile.time_locked = NOW();
+    return 1;
+#endif
 }
 
 void _spin_barrier(spinlock_t *lock)
 {
+#ifdef LOCK_PROFILE
+    s_time_t block = NOW();
+    u64      loop = 0;
+
+    check_lock(&lock->debug);
+    do { mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
+    if (loop > 1)
+    {
+        lock->profile.time_block += NOW() - block;
+        lock->profile.block_cnt++;
+    }
+#else
     check_lock(&lock->debug);
     do { mb(); } while ( _raw_spin_is_locked(&lock->raw) );
+#endif
     mb();
 }
 
@@ -248,3 +312,139 @@
     check_lock(&lock->debug);
     return _raw_rw_is_write_locked(&lock->raw);
 }
+
+#ifdef LOCK_PROFILE
+extern struct lock_profile *__lock_profile_start;
+extern struct lock_profile *__lock_profile_end;
+
+static s_time_t lock_profile_start = 0;
+static struct lock_profile *lock_profile_anc = NULL;
+static int lock_profile_n = 0;
+static spinlock_t lock_profile_lock = SPIN_LOCK_UNLOCKED;
+
+void spinlock_profile_printall(unsigned char key)
+{
+    s_time_t now = NOW();
+    s_time_t diff;
+    struct lock_profile *q;
+
+    diff = now - lock_profile_start;
+    printk("Xen lock profile info SHOW  (now = %08X:%08X, "
+        "total = %08X:%08X)\n", (u32)(now>>32), (u32)now,
+        (u32)(diff>>32), (u32)diff);
+    spin_lock(&lock_profile_lock);
+    for (q = lock_profile_anc; q != NULL; q = q->next)
+    {
+        if (q->idx1 == -1) printk(q->name);
+        else if (q->idx2 == -1) printk(q->name, q->idx1);
+        else printk(q->name, q->idx1, q->idx2);
+        printk("\n  lock:%12ld(%08X:%08X), block:%12ld(%08X:%08X)\n",
+            q->lock_cnt, (u32)(q->time_hold >> 32), (u32)q->time_hold,
+            q->block_cnt, (u32)(q->time_block >> 32), (u32)q->time_block);
+    }
+    spin_unlock(&lock_profile_lock);
+    return;
+}
+
+void spinlock_profile_reset(unsigned char key)
+{
+    s_time_t now = NOW();
+    struct lock_profile *q;
+
+    if ( key != '\0' )
+        printk("Xen lock profile info RESET (now = %08X:%08X)\n",
+            (u32)(now>>32), (u32)now);
+    lock_profile_start = now;
+    spin_lock(&lock_profile_lock);
+    for (q = lock_profile_anc; q != NULL; q = q->next)
+    {
+        q->lock_cnt =0;
+        q->block_cnt = 0;
+        q->time_hold = 0;
+        q->time_block = 0;
+    }
+    spin_unlock(&lock_profile_lock);
+    return;
+}
+
+void spinlock_profile_add(struct lock_profile *p)
+{
+    spin_lock(&lock_profile_lock);
+    lock_profile_n++;
+    p->prev = NULL;
+    if (lock_profile_anc != NULL) lock_profile_anc->prev = p;
+    p->next = lock_profile_anc;
+    lock_profile_anc = p;
+    spin_unlock(&lock_profile_lock);
+    return;
+}
+
+void spinlock_profile_rm(struct lock_profile *p)
+{
+    if (p->name == NULL) return;
+    spin_lock(&lock_profile_lock);
+    if (p->next != NULL) p->next->prev = p->prev;
+    if (p->prev != NULL) p->prev->next = p->next;
+    else lock_profile_anc = p->next;
+    lock_profile_n--;
+    spin_unlock(&lock_profile_lock);
+    return;
+}
+
+/* Dom0 control of lock profiling */
+int spinlock_profile_control(xen_sysctl_lockprof_op_t *pc)
+{
+    int rc;
+    int i;
+    struct lock_profile *q;
+    xen_sysctl_lockprof_data_t elem;
+
+    rc = 0;
+    switch (pc->cmd)
+    {
+    case XEN_SYSCTL_LOCKPROF_reset:
+        spinlock_profile_reset('\0');
+        break;
+    case XEN_SYSCTL_LOCKPROF_query:
+        spin_lock(&lock_profile_lock);
+        q = lock_profile_anc;
+        for (i = 0; i < pc->max_elem; i++)
+        {
+            if (q == NULL) break;
+            safe_strcpy(elem.name, q->name);
+            elem.idx1 = q->idx1;
+            elem.idx2 = q->idx2;
+            elem.lock_cnt = q->lock_cnt;
+            elem.block_cnt = q->block_cnt;
+            elem.lock_time = q->time_hold;
+            elem.block_time = q->time_block;
+            if (copy_to_guest_offset(pc->data, i, &elem, 1))
+            {
+                rc = -EFAULT;
+                break;
+            }
+            q = q->next;
+        }
+        pc->nr_elem = lock_profile_n;
+        pc->time = NOW() - lock_profile_start;
+        spin_unlock(&lock_profile_lock);
+        break;
+    default:
+        rc = -EINVAL;
+        break;
+    }
+    return rc;
+}
+
+static int __init spinlock_profile_init(void)
+{
+    struct lock_profile **data;
+
+    for (data = &__lock_profile_start; data < &__lock_profile_end; data++)
+    {
+        spinlock_profile_add(*data);
+    }
+    return 0;
+}
+__initcall(spinlock_profile_init);
+#endif
diff -r 729567f615c1 xen/common/sysctl.c
--- a/xen/common/sysctl.c	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/common/sysctl.c	Thu Oct 08 09:18:52 2009 +0200
@@ -29,6 +29,9 @@
 
 extern long arch_do_sysctl(
     struct xen_sysctl *op, XEN_GUEST_HANDLE(xen_sysctl_t) u_sysctl);
+#ifdef LOCK_PROFILE
+extern int spinlock_profile_control(xen_sysctl_lockprof_op_t *pc);
+#endif
 
 long do_sysctl(XEN_GUEST_HANDLE(xen_sysctl_t) u_sysctl)
 {
@@ -144,6 +147,15 @@
     break;
 #endif
 
+#ifdef LOCK_PROFILE
+    case XEN_SYSCTL_lockprof_op:
+    {
+        ret = spinlock_profile_control(&op->u.lockprof_op);
+        if ( copy_to_guest(u_sysctl, op, 1) )
+            ret = -EFAULT;
+    }
+    break;
+#endif
     case XEN_SYSCTL_debug_keys:
     {
         char c;
diff -r 729567f615c1 xen/include/public/sysctl.h
--- a/xen/include/public/sysctl.h	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/include/public/sysctl.h	Thu Oct 08 09:18:52 2009 +0200
@@ -454,6 +454,34 @@
 
 #define PG_OFFLINE_OWNER_SHIFT 16
 
+#define XEN_SYSCTL_lockprof_op       11
+/* Sub-operations: */
+#define XEN_SYSCTL_LOCKPROF_reset 1   /* Reset all profile data to zero. */
+#define XEN_SYSCTL_LOCKPROF_query 2   /* Get lock profile information. */
+struct xen_sysctl_lockprof_data {
+    char     name[40];     /* lock name (may include up to 2 %d specifiers) */
+    int32_t  idx1;         /* index 1 (e.g. domain id) or -1 */
+    int32_t  idx2;         /* index 2 (e.g. vcpu nr) or -1 */
+    uint64_t lock_cnt;     /* # of locking succeeded */
+    uint64_t block_cnt;    /* # of wait for lock */
+    uint64_t lock_time;    /* nsecs lock held */
+    uint64_t block_time;   /* nsecs waited for lock */
+};
+typedef struct xen_sysctl_lockprof_data xen_sysctl_lockprof_data_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_data_t);
+struct xen_sysctl_lockprof_op {
+    /* IN variables. */
+    uint32_t       cmd;               /* XEN_SYSCTL_LOCKPROF_??? */
+    uint32_t       max_elem;          /* size of output buffer */
+    /* OUT variables (query only). */
+    uint32_t       nr_elem;           /* number of elements available */
+    uint64_t       time;              /* nsecs of profile measurement */
+    /* profile information (or NULL) */
+    XEN_GUEST_HANDLE_64(xen_sysctl_lockprof_data_t) data;
+};
+typedef struct xen_sysctl_lockprof_op xen_sysctl_lockprof_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_op_t);
+
 struct xen_sysctl {
     uint32_t cmd;
     uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
@@ -471,6 +499,7 @@
         struct xen_sysctl_cpu_hotplug       cpu_hotplug;
         struct xen_sysctl_pm_op             pm_op;
         struct xen_sysctl_page_offline_op   page_offline;
+        struct xen_sysctl_lockprof_op       lockprof_op;
         uint8_t                             pad[128];
     } u;
 };
diff -r 729567f615c1 xen/include/xen/spinlock.h
--- a/xen/include/xen/spinlock.h	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/include/xen/spinlock.h	Thu Oct 08 09:18:52 2009 +0200
@@ -19,17 +19,63 @@
 #define spin_debug_disable() ((void)0)
 #endif
 
+#ifdef LOCK_PROFILE
+/*
+    lock profiling on: ALL locks are subject to profiling, data is available
+    only for locks declared or inited with profiling support
+*/
+struct lock_profile {
+    struct lock_profile *next;   /* forward link in profile queue */
+    struct lock_profile *prev;   /* backward link in profile queue */
+    char *name;                  /* lock name, may contain %d for idx */
+    u64  lock_cnt;               /* # of complete locking ops */
+    u64  block_cnt;              /* # of complete wait for lock */
+    s64  time_locked;            /* system time of last locking */
+    s64  time_hold;              /* cumulated lock time */
+    s64  time_block;             /* cumulated wait time */
+    int  idx1;                   /* index to print (e.g. domain id) or -1 */
+    int  idx2;                   /* index to print (e.g. vcpu nr) or -1 */
+};
+
+#define _LOCK_PROFILE(name, i1, i2) { 0, 0, name, 0, 0, 0, 0, 0, i1, i2 }
+#define _LOCK_PROFILE_ANC(name) ;                                           \
+    static struct lock_profile * __lock_profile_##name                      \
+    __attribute_used__ __attribute__ ((__section__(".lockprofile.data"))) = \
+    &name.profile
+
+void spinlock_profile_add(struct lock_profile *p);
+void spinlock_profile_rm(struct lock_profile *p);
+
+#else
+struct lock_profile { };
+#define _LOCK_PROFILE(name, i1, i2) { }
+#define _LOCK_PROFILE_ANC(name)
+#define spinlock_profile_add(x) ((void)0)
+#define spinlock_profile_rm(x) ((void)0)
+#endif
+
 typedef struct {
     raw_spinlock_t raw;
     u16 recurse_cpu:12;
     u16 recurse_cnt:4;
     struct lock_debug debug;
+    struct lock_profile profile;
 } spinlock_t;
 
 
-#define SPIN_LOCK_UNLOCKED { _RAW_SPIN_LOCK_UNLOCKED, 0xfffu, 0, _LOCK_DEBUG }
-#define DEFINE_SPINLOCK(l) spinlock_t l = SPIN_LOCK_UNLOCKED
+#define SPIN_LOCK_UNLOCKED { _RAW_SPIN_LOCK_UNLOCKED, 0xfffu, 0, _LOCK_DEBUG, \
+                             _LOCK_PROFILE(NULL, -1, -1) }
+#define SPIN_LOCK_UNLOCKED_PROF(n, i1, i2) { _RAW_SPIN_LOCK_UNLOCKED, 0xfffu, \
+                             0, _LOCK_DEBUG, _LOCK_PROFILE(n, i1, i2) }
+#define DEFINE_SPINLOCK(l) spinlock_t l = SPIN_LOCK_UNLOCKED_PROF(#l, -1, -1) \
+                             _LOCK_PROFILE_ANC(l)
 #define spin_lock_init(l) (*(l) = (spinlock_t)SPIN_LOCK_UNLOCKED)
+#define spin_lock_init_prof2(l, n, i1, i2) {                                  \
+    *(l) = (spinlock_t)SPIN_LOCK_UNLOCKED_PROF(n, i1, i2);                    \
+    spinlock_profile_add(&(l)->profile);  }
+#define spin_lock_init_prof1(l, n, i1)   spin_lock_init_prof2(l, n, i1, -1)
+#define spin_lock_init_prof(l, n)        spin_lock_init_prof2(l, n, -1, -1)
+#define spin_lock_destroy_prof(l)        spinlock_profile_rm(&(l)->profile)
 
 typedef struct {
     raw_rwlock_t raw;

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-08  7:48 [Patch] support of lock profiling in Xen Juergen Gross
@ 2009-10-08  8:15 ` Keir Fraser
  2009-10-08  8:27   ` Juergen Gross
  2009-10-08  8:21 ` Keir Fraser
  1 sibling, 1 reply; 17+ messages in thread
From: Keir Fraser @ 2009-10-08  8:15 UTC (permalink / raw)
  To: Juergen Gross, xen-devel@lists.xensource.com

On 08/10/2009 08:48, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

> I'm not completely satisfied with the solution for the dynamically initialized
> locks, but I had no better idea in the first run.
> Another enhancement would be to expand the profiling to rw-locks as well, but
> this would have required a rewrite of the lock routines using try_lock like
> for spinlocks. I could do this if the lock profiling is accepted.
> 
> Comments welcome :-)

The method of chaining and initialising the info is kind of icky. Requiring
users to unchain dynamic locks is just asking for this support to be
perpetually broken, or be used only for static locks.

How about defining new initialisers DEFINE_NAMED_SPINLOCK() and
named_spinlock_init(). This would indicate you consider a lock important
enough to name (and hence profile) and also categorise dynamically-allocated
locks, causing their stats to be aggregated (after all, lock optimisations
will have aggregate effect across all locks of that category).

If lock profiling is compiled in, have a static array of lock-profile
descriptors (keeps things simple - could make it a growable array, or
something). On lock init, walk the array looking for that name. If found,
write the entry index into a new field in the spinlock struct. If not found,
allocate next entry in lock-profile array, init it, and write its index into
the spinlock struct.

On lock operations, if the index field in the spinlock is non-zero, update
stats in the associated profile structure. Regarding races from locks
aliasing to the same profile structure -- either assume that doesn't matter
much, or perhaps update fields with atomic ops.

 -- Keir

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-08  7:48 [Patch] support of lock profiling in Xen Juergen Gross
  2009-10-08  8:15 ` Keir Fraser
@ 2009-10-08  8:21 ` Keir Fraser
  2009-10-08  8:38   ` Juergen Gross
  1 sibling, 1 reply; 17+ messages in thread
From: Keir Fraser @ 2009-10-08  8:21 UTC (permalink / raw)
  To: Juergen Gross, xen-devel@lists.xensource.com

On 08/10/2009 08:48, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

> Comments welcome :-)

There's also the question of whether this really gets you useful extra info
than can be digested from a xentrace run. I suppose at least the info is
easier to understand, and not many people know how to drive xentrace.

 -- Keir

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-08  8:15 ` Keir Fraser
@ 2009-10-08  8:27   ` Juergen Gross
  2009-10-08  8:38     ` Keir Fraser
  2009-10-08  8:40     ` Keir Fraser
  0 siblings, 2 replies; 17+ messages in thread
From: Juergen Gross @ 2009-10-08  8:27 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

Keir Fraser wrote:
> On 08/10/2009 08:48, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:
> 
>> I'm not completely satisfied with the solution for the dynamically initialized
>> locks, but I had no better idea in the first run.
>> Another enhancement would be to expand the profiling to rw-locks as well, but
>> this would have required a rewrite of the lock routines using try_lock like
>> for spinlocks. I could do this if the lock profiling is accepted.
>>
>> Comments welcome :-)
> 
> The method of chaining and initialising the info is kind of icky. Requiring
> users to unchain dynamic locks is just asking for this support to be
> perpetually broken, or be used only for static locks.
> 
> How about defining new initialisers DEFINE_NAMED_SPINLOCK() and
> named_spinlock_init(). This would indicate you consider a lock important
> enough to name (and hence profile) and also categorise dynamically-allocated
> locks, causing their stats to be aggregated (after all, lock optimisations
> will have aggregate effect across all locks of that category).
> 
> If lock profiling is compiled in, have a static array of lock-profile
> descriptors (keeps things simple - could make it a growable array, or
> something). On lock init, walk the array looking for that name. If found,
> write the entry index into a new field in the spinlock struct. If not found,
> allocate next entry in lock-profile array, init it, and write its index into
> the spinlock struct.

Don't you think a solution like for performance counters would be better then?
The index would be in a header file and the sums could be held per-cpu to
avoid races.
It would even be possible to define lock arrays if summing up all data for
e.g. domain specific locks is not desired.

> 
> On lock operations, if the index field in the spinlock is non-zero, update
> stats in the associated profile structure. Regarding races from locks
> aliasing to the same profile structure -- either assume that doesn't matter
> much, or perhaps update fields with atomic ops.


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 636 47950
Fujitsu Technolgy Solutions               e-mail: juergen.gross@ts.fujitsu.com
Otto-Hahn-Ring 6                        Internet: ts.fujitsu.com
D-81739 Muenchen                 Company details: ts.fujitsu.com/imprint.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-08  8:27   ` Juergen Gross
@ 2009-10-08  8:38     ` Keir Fraser
  2009-10-08  8:40     ` Keir Fraser
  1 sibling, 0 replies; 17+ messages in thread
From: Keir Fraser @ 2009-10-08  8:38 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com

On 08/10/2009 09:27, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

>> If lock profiling is compiled in, have a static array of lock-profile
>> descriptors (keeps things simple - could make it a growable array, or
>> something). On lock init, walk the array looking for that name. If found,
>> write the entry index into a new field in the spinlock struct. If not found,
>> allocate next entry in lock-profile array, init it, and write its index into
>> the spinlock struct.
> 
> Don't you think a solution like for performance counters would be better then?
> The index would be in a header file and the sums could be held per-cpu to
> avoid races.
> It would even be possible to define lock arrays if summing up all data for
> e.g. domain specific locks is not desired.

It's just a question of impact on our existing locking API. I'd like to
minimise that. I'm not particularly a fan of the perfc stuff.

 -- Keir

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-08  8:21 ` Keir Fraser
@ 2009-10-08  8:38   ` Juergen Gross
  2009-10-08  9:17     ` Keir Fraser
  2009-10-09 10:35     ` George Dunlap
  0 siblings, 2 replies; 17+ messages in thread
From: Juergen Gross @ 2009-10-08  8:38 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

Keir Fraser wrote:
> On 08/10/2009 08:48, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:
> 
>> Comments welcome :-)
> 
> There's also the question of whether this really gets you useful extra info
> than can be digested from a xentrace run. I suppose at least the info is
> easier to understand, and not many people know how to drive xentrace.

It helped me a lot for narrowing down my performance problem with a 8 vcpu
BS2000 system (this was the main reason I made the patch).
Finding such a problem in xentrace is quite a bit of work. xentrace produced
over 500 MB of data in 30 seconds, while reading the xenlockprof output
revealed the bottleneck in seconds.

And I think the lock profiling tool is much better for evaluating
optimizations in Xen regarding lock conflicts, if the differences are only
a few percent.


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 636 47950
Fujitsu Technolgy Solutions               e-mail: juergen.gross@ts.fujitsu.com
Otto-Hahn-Ring 6                        Internet: ts.fujitsu.com
D-81739 Muenchen                 Company details: ts.fujitsu.com/imprint.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-08  8:27   ` Juergen Gross
  2009-10-08  8:38     ` Keir Fraser
@ 2009-10-08  8:40     ` Keir Fraser
  2009-10-08  9:04       ` Juergen Gross
  1 sibling, 1 reply; 17+ messages in thread
From: Keir Fraser @ 2009-10-08  8:40 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com

On 08/10/2009 09:27, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

> It would even be possible to define lock arrays if summing up all data for
> e.g. domain specific locks is not desired.

Allowing spinlock_init() to specify not only a name but also a
'category-specific index' might be okay. For example, domid, in the case of
a domain-specific lock. Hiding the allocation of profiling info structs in
the spinlock subsystem somehow... (e.g., radix tree to make a simple
growable array).

 -- Keir

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-08  8:40     ` Keir Fraser
@ 2009-10-08  9:04       ` Juergen Gross
  2009-10-08  9:21         ` Keir Fraser
  0 siblings, 1 reply; 17+ messages in thread
From: Juergen Gross @ 2009-10-08  9:04 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

Keir Fraser wrote:
> On 08/10/2009 09:27, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:
> 
>> It would even be possible to define lock arrays if summing up all data for
>> e.g. domain specific locks is not desired.
> 
> Allowing spinlock_init() to specify not only a name but also a
> 'category-specific index' might be okay. For example, domid, in the case of
> a domain-specific lock. Hiding the allocation of profiling info structs in
> the spinlock subsystem somehow... (e.g., radix tree to make a simple
> growable array).

I'll try another patch...


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 636 47950
Fujitsu Technolgy Solutions               e-mail: juergen.gross@ts.fujitsu.com
Otto-Hahn-Ring 6                        Internet: ts.fujitsu.com
D-81739 Muenchen                 Company details: ts.fujitsu.com/imprint.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-08  8:38   ` Juergen Gross
@ 2009-10-08  9:17     ` Keir Fraser
  2009-10-09 10:35     ` George Dunlap
  1 sibling, 0 replies; 17+ messages in thread
From: Keir Fraser @ 2009-10-08  9:17 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com

On 08/10/2009 09:38, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

>> There's also the question of whether this really gets you useful extra info
>> than can be digested from a xentrace run. I suppose at least the info is
>> easier to understand, and not many people know how to drive xentrace.
> 
> It helped me a lot for narrowing down my performance problem with a 8 vcpu
> BS2000 system (this was the main reason I made the patch).
> Finding such a problem in xentrace is quite a bit of work. xentrace produced
> over 500 MB of data in 30 seconds, while reading the xenlockprof output
> revealed the bottleneck in seconds.
> 
> And I think the lock profiling tool is much better for evaluating
> optimizations in Xen regarding lock conflicts, if the differences are only
> a few percent.

Okay, well the only thing I'm dead against in a lockmeter patch is requring
dynamic registration and unregistration of individual locks. Largely because
I see a never-ending stream of fix this unregistration, fix that
unregistration, stretching out before me, because callers will forever be
getting broken.

I suppose most locks are either global or per-domain, so perhaps you could
just have domain create/destroy hooks into lockmeter, to alloc/free an array
of per-domain locks? Would that actually be an argument for a perfc-style
central declaration of all profileable locks, so you could indicate which
are global and which are per-domain?

 -- Keir

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-08  9:04       ` Juergen Gross
@ 2009-10-08  9:21         ` Keir Fraser
  2009-10-08  9:35           ` Juergen Gross
  0 siblings, 1 reply; 17+ messages in thread
From: Keir Fraser @ 2009-10-08  9:21 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com

On 08/10/2009 10:04, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

>>> It would even be possible to define lock arrays if summing up all data for
>>> e.g. domain specific locks is not desired.
>> 
>> Allowing spinlock_init() to specify not only a name but also a
>> 'category-specific index' might be okay. For example, domid, in the case of
>> a domain-specific lock. Hiding the allocation of profiling info structs in
>> the spinlock subsystem somehow... (e.g., radix tree to make a simple
>> growable array).
> 
> I'll try another patch...

Well, see my email just now. I could perhaps live with perfc-style
declarations after all... Because this might make sense if locks can be
broadly categorised as global or as per-domain, and it could avoid you
having to make your implementation much more complex?

After all, now I think about it, an 'arbitrary' index specified to
spinlock_init() is just making our lives more complicated if it will
basically *always* be a domid. And knowing at a higher level which locks are
actually per-domain could help your pretty printing?

 -- Keir

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-08  9:21         ` Keir Fraser
@ 2009-10-08  9:35           ` Juergen Gross
  2009-10-08  9:47             ` Keir Fraser
  0 siblings, 1 reply; 17+ messages in thread
From: Juergen Gross @ 2009-10-08  9:35 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

Keir Fraser wrote:
> On 08/10/2009 10:04, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:
> 
>>>> It would even be possible to define lock arrays if summing up all data for
>>>> e.g. domain specific locks is not desired.
>>> Allowing spinlock_init() to specify not only a name but also a
>>> 'category-specific index' might be okay. For example, domid, in the case of
>>> a domain-specific lock. Hiding the allocation of profiling info structs in
>>> the spinlock subsystem somehow... (e.g., radix tree to make a simple
>>> growable array).
>> I'll try another patch...
> 
> Well, see my email just now. I could perhaps live with perfc-style
> declarations after all... Because this might make sense if locks can be
> broadly categorised as global or as per-domain, and it could avoid you
> having to make your implementation much more complex?
> 
> After all, now I think about it, an 'arbitrary' index specified to
> spinlock_init() is just making our lives more complicated if it will
> basically *always* be a domid. And knowing at a higher level which locks are
> actually per-domain could help your pretty printing?

I thought in this direction, too.
I would, however, try not to limit it to per domain locks, even if the first
implementation would support only global locks and per domain locks.

The perfc style declarations should only be necessary for the dynamically
initialized locks, and those could live near the domain structure definition
(or any other structure where they are needed).

I think the statically defined locks can be handled via an own data section
requiring no centrally defined index.


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 636 47950
Fujitsu Technolgy Solutions               e-mail: juergen.gross@ts.fujitsu.com
Otto-Hahn-Ring 6                        Internet: ts.fujitsu.com
D-81739 Muenchen                 Company details: ts.fujitsu.com/imprint.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-08  9:35           ` Juergen Gross
@ 2009-10-08  9:47             ` Keir Fraser
  2009-10-09  8:27               ` Juergen Gross
  0 siblings, 1 reply; 17+ messages in thread
From: Keir Fraser @ 2009-10-08  9:47 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com

On 08/10/2009 10:35, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

>> After all, now I think about it, an 'arbitrary' index specified to
>> spinlock_init() is just making our lives more complicated if it will
>> basically *always* be a domid. And knowing at a higher level which locks are
>> actually per-domain could help your pretty printing?
> 
> I thought in this direction, too.
> I would, however, try not to limit it to per domain locks, even if the first
> implementation would support only global locks and per domain locks.
> 
> The perfc style declarations should only be necessary for the dynamically
> initialized locks, and those could live near the domain structure definition
> (or any other structure where they are needed).
> 
> I think the statically defined locks can be handled via an own data section
> requiring no centrally defined index.

Okay, well this all sounds an acceptable kind of direction. Worth spinning
another patch.

 -- Keir

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-08  9:47             ` Keir Fraser
@ 2009-10-09  8:27               ` Juergen Gross
  2009-10-09  9:22                 ` Keir Fraser
  0 siblings, 1 reply; 17+ messages in thread
From: Juergen Gross @ 2009-10-09  8:27 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

[-- Attachment #1: Type: text/plain, Size: 1686 bytes --]

Keir Fraser wrote:
> On 08/10/2009 10:35, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:
> 
>>> After all, now I think about it, an 'arbitrary' index specified to
>>> spinlock_init() is just making our lives more complicated if it will
>>> basically *always* be a domid. And knowing at a higher level which locks are
>>> actually per-domain could help your pretty printing?
>> I thought in this direction, too.
>> I would, however, try not to limit it to per domain locks, even if the first
>> implementation would support only global locks and per domain locks.
>>
>> The perfc style declarations should only be necessary for the dynamically
>> initialized locks, and those could live near the domain structure definition
>> (or any other structure where they are needed).
>>
>> I think the statically defined locks can be handled via an own data section
>> requiring no centrally defined index.
> 
> Okay, well this all sounds an acceptable kind of direction. Worth spinning
> another patch.

Here it is.
I managed to avoid the perfc-like stuff completely.

Included are 2 global locks defined for profiling and 2 locks in the domain
structure.

I've added some comments in spinlock.h how to use the profiling in structures.

Tested by compilation and boot of Dom0 with and without profiling enabled.


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 636 47950
Fujitsu Technolgy Solutions               e-mail: juergen.gross@ts.fujitsu.com
Otto-Hahn-Ring 6                        Internet: ts.fujitsu.com
D-81739 Muenchen                 Company details: ts.fujitsu.com/imprint.html

[-- Attachment #2: lock_prof.patch --]
[-- Type: text/x-patch, Size: 29333 bytes --]

Support of lock profiling

Signed-off-by: juergen.gross@ts.fujitsu.com

diff -r 729567f615c1 tools/libxc/xc_misc.c
--- a/tools/libxc/xc_misc.c	Wed Oct 07 16:29:03 2009 +0100
+++ b/tools/libxc/xc_misc.c	Fri Oct 09 09:35:45 2009 +0200
@@ -121,6 +121,30 @@
     return rc;
 }
 
+int xc_lockprof_control(int xc_handle,
+                        uint32_t opcode,
+                        uint32_t *n_elems,
+                        uint64_t *time,
+                        xc_lockprof_data_t *data)
+{
+    int rc;
+    DECLARE_SYSCTL;
+
+    sysctl.cmd = XEN_SYSCTL_lockprof_op;
+    sysctl.u.lockprof_op.cmd = opcode;
+    sysctl.u.lockprof_op.max_elem = n_elems ? *n_elems : 0;
+    set_xen_guest_handle(sysctl.u.lockprof_op.data, data);
+
+    rc = do_sysctl(xc_handle, &sysctl);
+
+    if (n_elems)
+        *n_elems = sysctl.u.lockprof_op.nr_elem;
+    if (time)
+        *time = sysctl.u.lockprof_op.time;
+
+    return rc;
+}
+
 int xc_getcpuinfo(int xc_handle, int max_cpus,
                   xc_cpuinfo_t *info, int *nr_cpus)
 {
diff -r 729567f615c1 tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h	Wed Oct 07 16:29:03 2009 +0100
+++ b/tools/libxc/xenctrl.h	Fri Oct 09 09:35:45 2009 +0200
@@ -700,6 +700,14 @@
                      int *nbr_desc,
                      int *nbr_val);
 
+typedef xen_sysctl_lockprof_data_t xc_lockprof_data_t;
+/* IMPORTANT: The caller is responsible for mlock()'ing the @data array. */
+int xc_lockprof_control(int xc_handle,
+                        uint32_t opcode,
+                        uint32_t *n_elems,
+                        uint64_t *time,
+                        xc_lockprof_data_t *data);
+
 /**
  * Memory maps a range within one domain to a local address range.  Mappings
  * should be unmapped with munmap and should follow the same rules as mmap
diff -r 729567f615c1 tools/misc/Makefile
--- a/tools/misc/Makefile	Wed Oct 07 16:29:03 2009 +0100
+++ b/tools/misc/Makefile	Fri Oct 09 09:35:45 2009 +0200
@@ -10,7 +10,7 @@
 
 HDRS     = $(wildcard *.h)
 
-TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat
+TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat xenlockprof
 TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx
 TARGETS := $(TARGETS-y)
 
@@ -22,7 +22,7 @@
 INSTALL_BIN-$(CONFIG_X86) += xen-detect
 INSTALL_BIN := $(INSTALL_BIN-y)
 
-INSTALL_SBIN-y := xm xen-bugtool xen-python-path xend xenperf xsview xenpm xen-tmem-list-parse gtraceview gtracestat
+INSTALL_SBIN-y := xm xen-bugtool xen-python-path xend xenperf xsview xenpm xen-tmem-list-parse gtraceview gtracestat xenlockprof
 INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx
 INSTALL_SBIN := $(INSTALL_SBIN-y)
 
@@ -49,7 +49,7 @@
 %.o: %.c $(HDRS) Makefile
 	$(CC) -c $(CFLAGS) -o $@ $<
 
-xen-hvmctx xenperf xenpm gtracestat: %: %.o Makefile
+xen-hvmctx xenperf xenpm gtracestat xenlockprof: %: %.o Makefile
 	$(CC) $(CFLAGS) -o $@ $< $(LDFLAGS) $(LDFLAGS_libxenctrl)
 
 gtraceview: %: %.o Makefile
diff -r 729567f615c1 tools/misc/xenlockprof.c
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/misc/xenlockprof.c	Fri Oct 09 09:35:45 2009 +0200
@@ -0,0 +1,136 @@
+/* -*-  Mode:C; c-basic-offset:4; tab-width:4 -*-
+ ****************************************************************************
+ * (C) 2009 - Juergen Gross - Fujitsu Technology Solutions
+ ****************************************************************************
+ *
+ *        File: xenlockprof.c
+ *      Author: Juergen Gross (juergen.gross@ts.fujitsu.com)
+ *        Date: Oct 2009
+ * 
+ * Description: 
+ */
+
+#include <xenctrl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <string.h>
+
+static int lock_pages(void *addr, size_t len)
+{
+    int e = 0;
+#ifndef __sun__
+    e = mlock(addr, len);
+#endif
+    return (e);
+}
+
+static void unlock_pages(void *addr, size_t len)
+{
+#ifndef __sun__
+    munlock(addr, len);
+#endif
+}
+
+int main(int argc, char *argv[])
+{
+    int                xc_handle;
+    uint32_t           i, j, n;
+    uint64_t           time;
+    double             l, b, sl, sb;
+    char               name[60];
+    xc_lockprof_data_t *data;
+
+    if ((argc > 2) || ((argc == 2) && (strcmp(argv[1], "-r") != 0)))
+    {
+        printf("%s: [-r]\n", argv[0]);
+        printf("no args: print lock profile data\n");
+        printf("    -r : reset profile data\n");
+        return 1;
+    }
+
+    if ((xc_handle = xc_interface_open()) == -1)
+    {
+        fprintf(stderr, "Error opening xc interface: %d (%s)\n",
+                errno, strerror(errno));
+        return 1;
+    }
+
+    if (argc > 1)
+    {
+        if (xc_lockprof_control(xc_handle, XEN_SYSCTL_LOCKPROF_reset, NULL,
+                                NULL, NULL) != 0)
+        {
+            fprintf(stderr, "Error reseting profile data: %d (%s)\n",
+                    errno, strerror(errno));
+            return 1;
+        }
+        return 1;
+    }
+
+    n = 0;
+    if (xc_lockprof_control(xc_handle, XEN_SYSCTL_LOCKPROF_query, &n,
+                            NULL, NULL) != 0)
+    {
+        fprintf(stderr, "Error getting number of profile records: %d (%s)\n",
+                errno, strerror(errno));
+        return 1;
+    }
+
+    n += 32;    /* just to be sure */
+    data = malloc(sizeof(*data) * n);
+    if ((data == NULL) || (lock_pages(data, sizeof(*data) * n) != 0))
+    {
+        fprintf(stderr, "Could not alloc or lock buffers: %d (%s)\n",
+                errno, strerror(errno));
+        return 1;
+    }
+
+    i = n;
+    if ( xc_lockprof_control(xc_handle, XEN_SYSCTL_LOCKPROF_query, &i,
+                             &time, data) != 0)
+    {
+        fprintf(stderr, "Error getting profile records: %d (%s)\n",
+                errno, strerror(errno));
+        return 1;
+    }
+
+    unlock_pages(data, sizeof(*data) * n);
+
+    if (i > n)
+    {
+        printf("data incomplete, %d records are missing!\n\n", i - n);
+        i = n;
+    }
+    sl = 0;
+    sb = 0;
+    for (j = 0; j < i; j++)
+    {
+        switch (data[j].type)
+        {
+        case LOCKPROF_TYPE_GLOBAL:
+            sprintf(name, "global lock %s", data[j].name);
+            break;
+        case LOCKPROF_TYPE_PERDOM:
+            sprintf(name, "domain %d lock %s", data[j].idx, data[j].name);
+            break;
+        default:
+            sprintf(name, "unknown type(%d) %d lock %s", data[j].type,
+                data[j].idx, data[j].name);
+            break;
+        }
+        l = (double)(data[j].lock_time) / 1E+09;
+        b = (double)(data[j].block_time) / 1E+09;
+        sl += l;
+        sb += b;
+        printf("%-50s: lock:%12ld(%20.9fs), block:%12ld(%20.9fs)\n",
+            name, data[j].lock_cnt, l, data[j].block_cnt, b);
+    }
+    l = (double)time / 1E+09;
+    printf("total profiling time: %20.9fs\n", l);
+    printf("total locked time:    %20.9fs\n", sl);
+    printf("total blocked time:   %20.9fs\n", sb);
+
+    return 0;
+}
diff -r 729567f615c1 xen/Rules.mk
--- a/xen/Rules.mk	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/Rules.mk	Fri Oct 09 09:35:45 2009 +0200
@@ -6,6 +6,7 @@
 verbose       ?= n
 perfc         ?= n
 perfc_arrays  ?= n
+lock_profile  ?= n
 crash_debug   ?= n
 frame_pointer ?= n
 
@@ -49,6 +50,7 @@
 CFLAGS-$(crash_debug)   += -DCRASH_DEBUG
 CFLAGS-$(perfc)         += -DPERF_COUNTERS
 CFLAGS-$(perfc_arrays)  += -DPERF_ARRAYS
+CFLAGS-$(lock_profile)  += -DLOCK_PROFILE
 CFLAGS-$(frame_pointer) += -fno-omit-frame-pointer -DCONFIG_FRAME_POINTER
 
 ifneq ($(max_phys_cpus),)
diff -r 729567f615c1 xen/arch/x86/pci.c
--- a/xen/arch/x86/pci.c	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/arch/x86/pci.c	Fri Oct 09 09:35:45 2009 +0200
@@ -8,7 +8,7 @@
 #include <xen/pci.h>
 #include <asm/io.h>
 
-static DEFINE_SPINLOCK(pci_config_lock);
+DEFINE_SPINLOCK_STATIC_PROF(pci_config_lock);
 
 uint32_t pci_conf_read(uint32_t cf8, uint8_t offset, uint8_t bytes)
 {
diff -r 729567f615c1 xen/arch/x86/smp.c
--- a/xen/arch/x86/smp.c	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/arch/x86/smp.c	Fri Oct 09 09:35:45 2009 +0200
@@ -201,7 +201,7 @@
     local_irq_restore(flags);
 }
 
-static DEFINE_SPINLOCK(flush_lock);
+DEFINE_SPINLOCK_STATIC_PROF(flush_lock);
 static cpumask_t flush_cpumask;
 static const void *flush_va;
 static unsigned int flush_flags;
diff -r 729567f615c1 xen/arch/x86/xen.lds.S
--- a/xen/arch/x86/xen.lds.S	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/arch/x86/xen.lds.S	Fri Oct 09 09:35:45 2009 +0200
@@ -62,6 +62,13 @@
        *(.data.read_mostly)
   } :text
 
+#ifdef LOCK_PROFILE
+  . = ALIGN(32);
+  __lock_profile_start = .;
+  .lockprofile.data : { *(.lockprofile.data) } :text
+  __lock_profile_end = .;
+#endif
+
   . = ALIGN(4096);             /* Init code and data */
   __init_begin = .;
   .init.text : {
diff -r 729567f615c1 xen/common/domain.c
--- a/xen/common/domain.c	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/common/domain.c	Fri Oct 09 09:35:45 2009 +0200
@@ -226,8 +226,8 @@
     init_status |= INIT_xsm;
 
     atomic_set(&d->refcnt, 1);
-    spin_lock_init(&d->domain_lock);
-    spin_lock_init(&d->page_alloc_lock);
+    spin_lock_init_prof(d, domain_lock);
+    spin_lock_init_prof(d, page_alloc_lock);
     spin_lock_init(&d->shutdown_lock);
     spin_lock_init(&d->hypercall_deadlock_mutex);
     INIT_PAGE_LIST_HEAD(&d->page_list);
@@ -884,6 +884,29 @@
     return -ENOSYS;
 }
 
+#ifdef LOCK_PROFILE
+void domain_lock_prof_func(lock_profile_subfunc *sub, void *par)
+{
+    struct domain *d;
+
+    rcu_read_lock(&domlist_read_lock);
+    for_each_domain(d)
+    {
+        sub((struct lock_profile_data_array *)(&d->lock_profile_array),
+	    LOCKPROF_TYPE_PERDOM, "domain", d->domain_id, par);
+    }
+    rcu_read_unlock(&domlist_read_lock);
+    return;
+}
+
+static int __init domain_lock_prof_init(void)
+{
+    lock_profile_funcs[LOCKPROF_TYPE_PERDOM] = domain_lock_prof_func;
+    return 0;
+}
+__initcall(domain_lock_prof_init);
+#endif
+
 /*
  * Local variables:
  * mode: C
diff -r 729567f615c1 xen/common/keyhandler.c
--- a/xen/common/keyhandler.c	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/common/keyhandler.c	Fri Oct 09 09:35:45 2009 +0200
@@ -347,6 +347,20 @@
 };
 #endif
 
+#ifdef LOCK_PROFILE
+extern void spinlock_profile_printall(unsigned char key);
+static struct keyhandler spinlock_printall_keyhandler = {
+    .diagnostic = 1,
+    .u.fn = spinlock_profile_printall,
+    .desc = "print lock profile info"
+};
+extern void spinlock_profile_reset(unsigned char key);
+static struct keyhandler spinlock_reset_keyhandler = {
+    .u.fn = spinlock_profile_reset,
+    .desc = "reset lock profile info"
+};
+#endif
+
 static void run_all_nonirq_keyhandlers(unsigned long unused)
 {
     /* Fire all the non-IRQ-context diagnostic keyhandlers */
@@ -428,6 +442,12 @@
     register_keyhandler('p', &perfc_printall_keyhandler);
     register_keyhandler('P', &perfc_reset_keyhandler);
 #endif
+
+#ifdef LOCK_PROFILE
+    register_keyhandler('l', &spinlock_printall_keyhandler);
+    register_keyhandler('L', &spinlock_reset_keyhandler);
+#endif
+
 }
 
 /*
diff -r 729567f615c1 xen/common/spinlock.c
--- a/xen/common/spinlock.c	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/common/spinlock.c	Fri Oct 09 09:35:45 2009 +0200
@@ -1,7 +1,11 @@
+#include <xen/lib.h>
 #include <xen/config.h>
 #include <xen/irq.h>
 #include <xen/smp.h>
+#include <xen/time.h>
 #include <xen/spinlock.h>
+#include <xen/guest_access.h>
+#include <public/sysctl.h>
 #include <asm/processor.h>
 
 #ifndef NDEBUG
@@ -41,56 +45,104 @@
 
 #endif
 
+#ifdef LOCK_PROFILE
+
+#define LOCK_PROFILE_REL                                                    \
+    if (lock->profile.data)                                                 \
+    {                                                                       \
+        lock->profile.data->time_hold += NOW() - lock->profile.time_locked; \
+        lock->profile.data->lock_cnt++;                                     \
+    }
+#define LOCK_PROFILE_VAR    s_time_t block = 0
+#define LOCK_PROFILE_BLOCK  if (lock->profile.data) block = block ? : NOW();
+#define LOCK_PROFILE_GOT                                              \
+    if (lock->profile.data)                                           \
+    {                                                                 \
+        lock->profile.time_locked = NOW();                            \
+        if (block)                                                    \
+        {                                                             \
+            lock->profile.data->time_block +=                         \
+                lock->profile.time_locked - block;                    \
+            lock->profile.data->block_cnt++;                          \
+        }                                                             \
+    }
+
+#else
+
+#define LOCK_PROFILE_REL
+#define LOCK_PROFILE_VAR
+#define LOCK_PROFILE_BLOCK
+#define LOCK_PROFILE_GOT
+
+#endif
+
 void _spin_lock(spinlock_t *lock)
 {
+    LOCK_PROFILE_VAR;
+
     check_lock(&lock->debug);
     while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
+    {
+        LOCK_PROFILE_BLOCK;
         while ( likely(_raw_spin_is_locked(&lock->raw)) )
             cpu_relax();
+    }
+    LOCK_PROFILE_GOT;
 }
 
 void _spin_lock_irq(spinlock_t *lock)
 {
+    LOCK_PROFILE_VAR;
+
     ASSERT(local_irq_is_enabled());
     local_irq_disable();
     check_lock(&lock->debug);
     while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
     {
+        LOCK_PROFILE_BLOCK;
         local_irq_enable();
         while ( likely(_raw_spin_is_locked(&lock->raw)) )
             cpu_relax();
         local_irq_disable();
     }
+    LOCK_PROFILE_GOT;
 }
 
 unsigned long _spin_lock_irqsave(spinlock_t *lock)
 {
     unsigned long flags;
+    LOCK_PROFILE_VAR;
+
     local_irq_save(flags);
     check_lock(&lock->debug);
     while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
     {
+        LOCK_PROFILE_BLOCK;
         local_irq_restore(flags);
         while ( likely(_raw_spin_is_locked(&lock->raw)) )
             cpu_relax();
         local_irq_save(flags);
     }
+    LOCK_PROFILE_GOT;
     return flags;
 }
 
 void _spin_unlock(spinlock_t *lock)
 {
+    LOCK_PROFILE_REL;
     _raw_spin_unlock(&lock->raw);
 }
 
 void _spin_unlock_irq(spinlock_t *lock)
 {
+    LOCK_PROFILE_REL;
     _raw_spin_unlock(&lock->raw);
     local_irq_enable();
 }
 
 void _spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags)
 {
+    LOCK_PROFILE_REL;
     _raw_spin_unlock(&lock->raw);
     local_irq_restore(flags);
 }
@@ -104,13 +156,35 @@
 int _spin_trylock(spinlock_t *lock)
 {
     check_lock(&lock->debug);
+#ifndef LOCK_PROFILE
     return _raw_spin_trylock(&lock->raw);
+#else
+    if (!_raw_spin_trylock(&lock->raw)) return 0;
+    if (lock->profile.data)
+    {
+        lock->profile.time_locked = NOW();
+    }
+    return 1;
+#endif
 }
 
 void _spin_barrier(spinlock_t *lock)
 {
+#ifdef LOCK_PROFILE
+    s_time_t block = NOW();
+    u64      loop = 0;
+
+    check_lock(&lock->debug);
+    do { mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
+    if ((loop > 1) && lock->profile.data)
+    {
+        lock->profile.data->time_block += NOW() - block;
+        lock->profile.data->block_cnt++;
+    }
+#else
     check_lock(&lock->debug);
     do { mb(); } while ( _raw_spin_is_locked(&lock->raw) );
+#endif
     mb();
 }
 
@@ -248,3 +322,189 @@
     check_lock(&lock->debug);
     return _raw_rw_is_write_locked(&lock->raw);
 }
+
+#ifdef LOCK_PROFILE
+extern struct lock_profile_data __lock_profile_start;
+extern struct lock_profile_data __lock_profile_end;
+
+static s_time_t lock_profile_start = 0;
+
+lock_profile_func *lock_profile_funcs[LOCKPROF_TYPE_N];
+
+void spinlock_profile_iterate(lock_profile_subfunc *sub, void *par)
+{
+    int  i;
+
+    for (i = 0; i < LOCKPROF_TYPE_N; i++)
+    {
+        if (lock_profile_funcs[i])
+	    lock_profile_funcs[i](sub, par);
+    }
+    return;
+}
+
+void spinlock_profile_print_array(struct lock_profile_data_array *array,
+    uint32_t type, char *txt, uint32_t idx, void *par)
+{
+    int  i;
+    struct lock_profile_data *q;
+
+    for (i = 0; i < array->cnt; i++)
+    {
+        q = array->array + i;
+        if (q->name == NULL)
+            continue;
+        printk("%s %d %s:\n", txt, idx, q->name);
+        printk("  lock:%12ld(%08X:%08X), block:%12ld(%08X:%08X)\n",
+            q->lock_cnt, (u32)(q->time_hold >> 32), (u32)q->time_hold,
+            q->block_cnt, (u32)(q->time_block >> 32), (u32)q->time_block);
+    }
+    return;
+}
+
+void spinlock_profile_printall(unsigned char key)
+{
+    s_time_t now = NOW();
+    s_time_t diff;
+    struct lock_profile_data *q;
+
+    diff = now - lock_profile_start;
+    printk("Xen lock profile info SHOW  (now = %08X:%08X, "
+        "total = %08X:%08X)\n", (u32)(now>>32), (u32)now,
+        (u32)(diff>>32), (u32)diff);
+    for (q = &__lock_profile_start; q < &__lock_profile_end; q++)
+    {
+        printk("%s:\n", q->name);
+        printk("  lock:%12ld(%08X:%08X), block:%12ld(%08X:%08X)\n",
+            q->lock_cnt, (u32)(q->time_hold >> 32), (u32)q->time_hold,
+            q->block_cnt, (u32)(q->time_block >> 32), (u32)q->time_block);
+    }
+    spinlock_profile_iterate(spinlock_profile_print_array, NULL);
+    return;
+}
+
+void spinlock_profile_reset_array(struct lock_profile_data_array *array,
+    uint32_t type, char *txt, uint32_t idx, void *par)
+{
+    int  i;
+
+    for (i = 0; i < array->cnt; i++)
+    {
+        array->array[i].lock_cnt = 0;
+        array->array[i].block_cnt = 0;
+        array->array[i].time_hold = 0;
+        array->array[i].time_block = 0;
+    }
+    return;
+}
+
+void spinlock_profile_reset(unsigned char key)
+{
+    s_time_t now = NOW();
+    struct lock_profile_data *q;
+
+    if ( key != '\0' )
+        printk("Xen lock profile info RESET (now = %08X:%08X)\n",
+            (u32)(now>>32), (u32)now);
+    lock_profile_start = now;
+    for (q = &__lock_profile_start; q < &__lock_profile_end; q++)
+    {
+        q->lock_cnt = 0;
+        q->block_cnt = 0;
+        q->time_hold = 0;
+        q->time_block = 0;
+    }
+    spinlock_profile_iterate(spinlock_profile_reset_array, NULL);
+    return;
+}
+
+typedef struct {
+    xen_sysctl_lockprof_op_t *pc;
+    int                      rc;
+} spinlock_profile_ucopy_t;
+
+void spinlock_profile_ucopy_array(struct lock_profile_data_array *array,
+    uint32_t type, char *txt, uint32_t idx, void *par)
+{
+    int  i;
+    struct lock_profile_data *q;
+    spinlock_profile_ucopy_t *p;
+    xen_sysctl_lockprof_data_t elem;
+
+    p = (spinlock_profile_ucopy_t *)par;
+    if (p->rc)
+        return;
+
+    for (i = 0; i < array->cnt; i++)
+    {
+        q = array->array + i;
+        if (q->name == NULL)
+            continue;
+        if (p->pc->nr_elem < p->pc->max_elem)
+        {
+            safe_strcpy(elem.name, q->name);
+            elem.type = type;
+            elem.idx = idx;
+            elem.lock_cnt = q->lock_cnt;
+            elem.block_cnt = q->block_cnt;
+            elem.lock_time = q->time_hold;
+            elem.block_time = q->time_block;
+            if (copy_to_guest_offset(p->pc->data, p->pc->nr_elem, &elem, 1))
+            {
+                p->rc = -EFAULT;
+		return;
+            }
+        }
+        p->pc->nr_elem++;
+    }
+    return;
+}
+
+/* Dom0 control of lock profiling */
+int spinlock_profile_control(xen_sysctl_lockprof_op_t *pc)
+{
+    int rc;
+    spinlock_profile_ucopy_t par;
+    struct lock_profile_data *q;
+    xen_sysctl_lockprof_data_t elem;
+
+    rc = 0;
+    switch (pc->cmd)
+    {
+    case XEN_SYSCTL_LOCKPROF_reset:
+        spinlock_profile_reset('\0');
+        break;
+    case XEN_SYSCTL_LOCKPROF_query:
+        q = &__lock_profile_start;
+	pc->nr_elem = 0;
+	par.rc = 0;
+	par.pc = pc;
+        for (q = &__lock_profile_start; q < &__lock_profile_end; q++)
+        {
+            if (pc->nr_elem < pc->max_elem)
+            {
+                safe_strcpy(elem.name, q->name);
+                elem.type = LOCKPROF_TYPE_GLOBAL;
+                elem.idx = 0;
+                elem.lock_cnt = q->lock_cnt;
+                elem.block_cnt = q->block_cnt;
+                elem.lock_time = q->time_hold;
+                elem.block_time = q->time_block;
+                if (copy_to_guest_offset(pc->data, pc->nr_elem, &elem, 1))
+                {
+                    return -EFAULT;
+                }
+            }
+            pc->nr_elem++;
+        }
+        spinlock_profile_iterate(spinlock_profile_ucopy_array, &par);
+        pc->time = NOW() - lock_profile_start;
+	rc = par.rc;
+        break;
+    default:
+        rc = -EINVAL;
+        break;
+    }
+    return rc;
+}
+#endif
diff -r 729567f615c1 xen/common/sysctl.c
--- a/xen/common/sysctl.c	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/common/sysctl.c	Fri Oct 09 09:35:45 2009 +0200
@@ -29,6 +29,9 @@
 
 extern long arch_do_sysctl(
     struct xen_sysctl *op, XEN_GUEST_HANDLE(xen_sysctl_t) u_sysctl);
+#ifdef LOCK_PROFILE
+extern int spinlock_profile_control(xen_sysctl_lockprof_op_t *pc);
+#endif
 
 long do_sysctl(XEN_GUEST_HANDLE(xen_sysctl_t) u_sysctl)
 {
@@ -144,6 +147,15 @@
     break;
 #endif
 
+#ifdef LOCK_PROFILE
+    case XEN_SYSCTL_lockprof_op:
+    {
+        ret = spinlock_profile_control(&op->u.lockprof_op);
+        if ( copy_to_guest(u_sysctl, op, 1) )
+            ret = -EFAULT;
+    }
+    break;
+#endif
     case XEN_SYSCTL_debug_keys:
     {
         char c;
diff -r 729567f615c1 xen/include/public/sysctl.h
--- a/xen/include/public/sysctl.h	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/include/public/sysctl.h	Fri Oct 09 09:35:45 2009 +0200
@@ -454,6 +454,38 @@
 
 #define PG_OFFLINE_OWNER_SHIFT 16
 
+#define XEN_SYSCTL_lockprof_op       11
+/* Sub-operations: */
+#define XEN_SYSCTL_LOCKPROF_reset 1   /* Reset all profile data to zero. */
+#define XEN_SYSCTL_LOCKPROF_query 2   /* Get lock profile information. */
+/* Record-type: */
+#define LOCKPROF_TYPE_GLOBAL      0   /* global lock, idx meaningless */
+#define LOCKPROF_TYPE_PERDOM      1   /* per-domain lock, idx is domid */
+#define LOCKPROF_TYPE_N           2   /* number of types */
+struct xen_sysctl_lockprof_data {
+    char     name[40];     /* lock name (may include up to 2 %d specifiers) */
+    int32_t  type;         /* LOCKPROF_TYPE_??? */
+    int32_t  idx;          /* index (e.g. domain id) */
+    uint64_t lock_cnt;     /* # of locking succeeded */
+    uint64_t block_cnt;    /* # of wait for lock */
+    uint64_t lock_time;    /* nsecs lock held */
+    uint64_t block_time;   /* nsecs waited for lock */
+};
+typedef struct xen_sysctl_lockprof_data xen_sysctl_lockprof_data_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_data_t);
+struct xen_sysctl_lockprof_op {
+    /* IN variables. */
+    uint32_t       cmd;               /* XEN_SYSCTL_LOCKPROF_??? */
+    uint32_t       max_elem;          /* size of output buffer */
+    /* OUT variables (query only). */
+    uint32_t       nr_elem;           /* number of elements available */
+    uint64_t       time;              /* nsecs of profile measurement */
+    /* profile information (or NULL) */
+    XEN_GUEST_HANDLE_64(xen_sysctl_lockprof_data_t) data;
+};
+typedef struct xen_sysctl_lockprof_op xen_sysctl_lockprof_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_op_t);
+
 struct xen_sysctl {
     uint32_t cmd;
     uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
@@ -471,6 +503,7 @@
         struct xen_sysctl_cpu_hotplug       cpu_hotplug;
         struct xen_sysctl_pm_op             pm_op;
         struct xen_sysctl_page_offline_op   page_offline;
+        struct xen_sysctl_lockprof_op       lockprof_op;
         uint8_t                             pad[128];
     } u;
 };
diff -r 729567f615c1 xen/include/xen/sched.h
--- a/xen/include/xen/sched.h	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/include/xen/sched.h	Fri Oct 09 09:35:45 2009 +0200
@@ -276,6 +276,8 @@
 
     /* transcendent memory, auto-allocated on first tmem op by each domain */
     void *tmem;
+
+    SPIN_LOCK_PROF_ARRAY(5);
 };
 
 struct domain_setup_info
diff -r 729567f615c1 xen/include/xen/spinlock.h
--- a/xen/include/xen/spinlock.h	Wed Oct 07 16:29:03 2009 +0100
+++ b/xen/include/xen/spinlock.h	Fri Oct 09 09:35:45 2009 +0200
@@ -19,17 +19,119 @@
 #define spin_debug_disable() ((void)0)
 #endif
 
+#ifdef LOCK_PROFILE
+/*
+    lock profiling on:
+
+    Global locks which sould be subject to profiling must be declared via
+    DEFINE_SPINLOCK_PROF or DEFINE_SPINLOCK_STATIC_PROF.
+
+    For locks in structures further measures are necessary:
+    The structure must include a profiling data section declared via
+    SPIN_LOCK_PROF_ARRAY(n)
+    with n being the maximum number of profiled locks in the structure.
+    Each lock in the structure subject to profiling has to be initialized via
+    spin_lock_init_prof(s, l)
+    with s being the structure pointer and l the lock field.
+    To print the profile data an iterator function must be registered (only if
+    LOCK_PROFILE is defined):
+
+    void func (lock_profile_subfunc *sub, void *par)
+    {
+        struct xyz *x;
+
+        for_each_xyz(x)
+        {
+            sub((struct lock_profile_data_array *)(&x->lock_profile_array),
+                LOCKPROF_TYPE_XYZ, "xyz", x->index, par);
+        }
+        return;
+    }
+
+    static int __init xyz_init(void)
+    {
+        lock_profile_funcs[LOCKPROF_TYPE_XYZ] = func;
+        return 0;
+    }
+    __initcall(xyz_init);
+*/
+
+struct lock_profile_data {
+    char *name;                  /* lock name, may contain %d for idx */
+    u64  lock_cnt;               /* # of complete locking ops */
+    u64  block_cnt;              /* # of complete wait for lock */
+    s64  time_hold;              /* cumulated lock time */
+    s64  time_block;             /* cumulated wait time */
+};
+
+struct lock_profile {
+    struct lock_profile_data *data;        /* profiling data */
+    s64                      time_locked;  /* system time of last locking */
+};
+
+struct lock_profile_data_array {
+    int                      cnt;          /* # of populated entries */
+    struct lock_profile_data array[100];   /* profile data */
+};
+
+typedef void lock_profile_subfunc(struct lock_profile_data_array *,
+                                  uint32_t, char *, uint32_t, void *);
+typedef void lock_profile_func(lock_profile_subfunc *, void *);
+
+extern lock_profile_func *lock_profile_funcs[];
+
+#define _LOCK_PROFILE(name) { &name, 0 }
+#define _LOCK_NO_PROFILE { NULL, 0 }
+#define _LOCK_PROFILE_DATA(name)                                            \
+    struct lock_profile_data __lock_profile_##name                          \
+    __attribute__ ((__section__(".lockprofile.data"))) = { #name, 0, 0, 0, 0 };
+#define SPIN_LOCK_PROF_ARRAY(n)                                             \
+    struct {                                                                \
+        int                      cnt;                                       \
+        struct lock_profile_data data[n];                                   \
+    } lock_profile_array
+#define _spin_lock_init_prof(s, l)  ;                                       \
+    BUG_ON((s)->lock_profile_array.cnt >=                                   \
+        (sizeof((s)->lock_profile_array.data) /                             \
+        sizeof(struct lock_profile_data)));                                 \
+    (s)->l.profile.data =                                                   \
+        (s)->lock_profile_array.data + (s)->lock_profile_array.cnt;         \
+    (s)->lock_profile_array.cnt++;                                          \
+    (s)->l.profile.data->name = #l
+
+#else
+struct lock_profile { };
+#define _LOCK_PROFILE(name) { }
+#define _LOCK_NO_PROFILE { }
+#define _LOCK_PROFILE_DATA(name)
+#define SPIN_LOCK_PROF_ARRAY(n) struct { } lock_profile_array
+#define _spin_lock_init_prof(s, l)
+#endif
+
 typedef struct {
     raw_spinlock_t raw;
     u16 recurse_cpu:12;
     u16 recurse_cnt:4;
     struct lock_debug debug;
+    struct lock_profile profile;
 } spinlock_t;
 
 
-#define SPIN_LOCK_UNLOCKED { _RAW_SPIN_LOCK_UNLOCKED, 0xfffu, 0, _LOCK_DEBUG }
+#define _SPIN_LOCK_UNLOCKED(x) { _RAW_SPIN_LOCK_UNLOCKED, 0xfffu, 0,          \
+                                 _LOCK_DEBUG, x }
+#define SPIN_LOCK_UNLOCKED _SPIN_LOCK_UNLOCKED(_LOCK_NO_PROFILE)
+#define SPIN_LOCK_UNLOCKED_PROF(n)                                            \
+        _SPIN_LOCK_UNLOCKED(_LOCK_PROFILE(__lock_profile_##n))
 #define DEFINE_SPINLOCK(l) spinlock_t l = SPIN_LOCK_UNLOCKED
+#define DEFINE_SPINLOCK_PROF(l) _LOCK_PROFILE_DATA(l)                         \
+                           spinlock_t l = SPIN_LOCK_UNLOCKED_PROF(l)
+#define DEFINE_SPINLOCK_STATIC(l) static spinlock_t l = SPIN_LOCK_UNLOCKED
+#define DEFINE_SPINLOCK_STATIC_PROF(l) _LOCK_PROFILE_DATA(l)                  \
+                           static spinlock_t l = SPIN_LOCK_UNLOCKED_PROF(l)
 #define spin_lock_init(l) (*(l) = (spinlock_t)SPIN_LOCK_UNLOCKED)
+#define spin_lock_init_prof(s, l)                                             \
+    spin_lock_init(&(s)->l)                                                   \
+    _spin_lock_init_prof(s, l)
 
 typedef struct {
     raw_rwlock_t raw;

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-09  8:27               ` Juergen Gross
@ 2009-10-09  9:22                 ` Keir Fraser
  2009-10-13 13:56                   ` Juergen Gross
  0 siblings, 1 reply; 17+ messages in thread
From: Keir Fraser @ 2009-10-09  9:22 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com

On 09/10/2009 09:27, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

>> Okay, well this all sounds an acceptable kind of direction. Worth spinning
>> another patch.
> 
> Here it is.
> I managed to avoid the perfc-like stuff completely.
> 
> Included are 2 global locks defined for profiling and 2 locks in the domain
> structure.
> 
> I've added some comments in spinlock.h how to use the profiling in structures.
> 
> Tested by compilation and boot of Dom0 with and without profiling enabled.

Not keen on exposing the lock_profile_funcs[] stuff; that and a few other
aspects are ratehr clumsy looking. I would rather dynamically
register/unregister lock-profile blocks; something like:
   /* domain creation */
   lock_profile_register_struct(
       LOCKPROF_TYPE_PERDOMAIN, d, d->domid);
   /* domain destruction */
   lock_profile_unregister_struct(d);
Requires a 'struct lock_profile lock_profile' or similar to be declared
within the domain structure. No need to hide that definition within a macro
I expect, nor to specify in advance the 'max # locks'.

Where lock_profile is a struct containing the 'idx' (domid in this case) and
the head of a linked list of locks, which gets appended to whenever a new
pofiled lock is declared. All locks can go back to having the entire profile
struct embedded within them - I don't care about the space wastage for
unprofiled locks. This means that lock_profile is a fixed-size struct and
does not need to contain the profiling data itself -- it just points at a
chain of locks that extends on each spin_lock_init_prof().

lock_profile_register_struct() adds the lock_profile_struct to a linked list
of such structs for this LOCKPROF_TYPE (so you keep an array of list_heads
indexed by LOCKPROF_TYPE, and you can protect that array and all
sub-linked-lists with a big rwlock). Something like:
static DEFINE_RWLOCK lock_profiles_lock;
static struct lock_profile *lock_profiles[NR_LOCKPROF_TYPES]
Or somesuch.

Regarding macro style, crap like starting a multi-statement macro with a
semi-colon, and not terminating it with one (e.g., _spin_lock_init_prof())
is unacceptable style. The C idiom is to enclose multiple statements within
a do{}while(0), or possibly the gcc idiom {()}.

 -- Keir

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-08  8:38   ` Juergen Gross
  2009-10-08  9:17     ` Keir Fraser
@ 2009-10-09 10:35     ` George Dunlap
  1 sibling, 0 replies; 17+ messages in thread
From: George Dunlap @ 2009-10-09 10:35 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com, Keir Fraser

On Thu, Oct 8, 2009 at 9:38 AM, Juergen Gross
<juergen.gross@ts.fujitsu.com> wrote:
> It helped me a lot for narrowing down my performance problem with a 8 vcpu
> BS2000 system (this was the main reason I made the patch).
> Finding such a problem in xentrace is quite a bit of work. xentrace produced
> over 500 MB of data in 30 seconds, while reading the xenlockprof output
> revealed the bottleneck in seconds.

Minor point, but IIRC you identified the shadow lock as the problem;
but the shadow lock covers a lot of code.  Did this patch alone tell
you what *aspect* of the shadow code was causing the problem?  That's
what xentrace was designed to do. :-)

I agree, however, that for a lot of uses, xentrace isn't the right
tool.  It does introduce a lot of memory and disk overhead.  Having
this kind of lock profiling, and sample-based profiling like oprofile
or gprof for Xen is the right tool for cases where a low-overhead
aggregate measurement is all that's necessary.

Xentrace excels when aggregates don't give you enough information, and
you need to drill down into specific sequences of events, or see how
things change over time.  It's also useful for situations where it's
inconvenient to run a test iteratively as you add aggregate
information (for instance, if it's a customer running the test or
you're looking at a problem 3-hours into a 6-hour test).  Instead of
having Xen make new aggregates, you can use the analysis tool to make
new aggregates as you need them.

 -George

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Patch] support of lock profiling in Xen
  2009-10-09  9:22                 ` Keir Fraser
@ 2009-10-13 13:56                   ` Juergen Gross
  2009-10-13 15:49                     ` Keir Fraser
  0 siblings, 1 reply; 17+ messages in thread
From: Juergen Gross @ 2009-10-13 13:56 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

[-- Attachment #1: Type: text/plain, Size: 1651 bytes --]

Keir Fraser wrote:
> On 09/10/2009 09:27, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:
> 
>>> Okay, well this all sounds an acceptable kind of direction. Worth spinning
>>> another patch.
>> Here it is.
>> I managed to avoid the perfc-like stuff completely.
>>
>> Included are 2 global locks defined for profiling and 2 locks in the domain
>> structure.
>>
>> I've added some comments in spinlock.h how to use the profiling in structures.
>>
>> Tested by compilation and boot of Dom0 with and without profiling enabled.
> 
> Not keen on exposing the lock_profile_funcs[] stuff; that and a few other
> aspects are ratehr clumsy looking. I would rather dynamically
> register/unregister lock-profile blocks; something like:
>    /* domain creation */
>    lock_profile_register_struct(
>        LOCKPROF_TYPE_PERDOMAIN, d, d->domid);
>    /* domain destruction */
>    lock_profile_unregister_struct(d);
> Requires a 'struct lock_profile lock_profile' or similar to be declared
> within the domain structure. No need to hide that definition within a macro
> I expect, nor to specify in advance the 'max # locks'.

Round 3 then :-)

I think a rw-lock is not needed, as reading is a very rare event which will
be executed in parallel less then once a year. ;-)

Better?


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 636 47950
Fujitsu Technolgy Solutions               e-mail: juergen.gross@ts.fujitsu.com
Otto-Hahn-Ring 6                        Internet: ts.fujitsu.com
D-81739 Muenchen                 Company details: ts.fujitsu.com/imprint.html

[-- Attachment #2: lock_prof.patch --]
[-- Type: text/x-patch, Size: 27565 bytes --]

diff -r 1d7221667204 tools/libxc/xc_misc.c
--- a/tools/libxc/xc_misc.c	Thu Oct 08 09:24:32 2009 +0100
+++ b/tools/libxc/xc_misc.c	Tue Oct 13 15:52:27 2009 +0200
@@ -121,6 +121,30 @@
     return rc;
 }
 
+int xc_lockprof_control(int xc_handle,
+                        uint32_t opcode,
+                        uint32_t *n_elems,
+                        uint64_t *time,
+                        xc_lockprof_data_t *data)
+{
+    int rc;
+    DECLARE_SYSCTL;
+
+    sysctl.cmd = XEN_SYSCTL_lockprof_op;
+    sysctl.u.lockprof_op.cmd = opcode;
+    sysctl.u.lockprof_op.max_elem = n_elems ? *n_elems : 0;
+    set_xen_guest_handle(sysctl.u.lockprof_op.data, data);
+
+    rc = do_sysctl(xc_handle, &sysctl);
+
+    if (n_elems)
+        *n_elems = sysctl.u.lockprof_op.nr_elem;
+    if (time)
+        *time = sysctl.u.lockprof_op.time;
+
+    return rc;
+}
+
 int xc_getcpuinfo(int xc_handle, int max_cpus,
                   xc_cpuinfo_t *info, int *nr_cpus)
 {
diff -r 1d7221667204 tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h	Thu Oct 08 09:24:32 2009 +0100
+++ b/tools/libxc/xenctrl.h	Tue Oct 13 15:52:27 2009 +0200
@@ -700,6 +700,14 @@
                      int *nbr_desc,
                      int *nbr_val);
 
+typedef xen_sysctl_lockprof_data_t xc_lockprof_data_t;
+/* IMPORTANT: The caller is responsible for mlock()'ing the @data array. */
+int xc_lockprof_control(int xc_handle,
+                        uint32_t opcode,
+                        uint32_t *n_elems,
+                        uint64_t *time,
+                        xc_lockprof_data_t *data);
+
 /**
  * Memory maps a range within one domain to a local address range.  Mappings
  * should be unmapped with munmap and should follow the same rules as mmap
diff -r 1d7221667204 tools/misc/Makefile
--- a/tools/misc/Makefile	Thu Oct 08 09:24:32 2009 +0100
+++ b/tools/misc/Makefile	Tue Oct 13 15:52:27 2009 +0200
@@ -10,7 +10,7 @@
 
 HDRS     = $(wildcard *.h)
 
-TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat
+TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat xenlockprof
 TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx
 TARGETS := $(TARGETS-y)
 
@@ -22,7 +22,7 @@
 INSTALL_BIN-$(CONFIG_X86) += xen-detect
 INSTALL_BIN := $(INSTALL_BIN-y)
 
-INSTALL_SBIN-y := xm xen-bugtool xen-python-path xend xenperf xsview xenpm xen-tmem-list-parse gtraceview gtracestat
+INSTALL_SBIN-y := xm xen-bugtool xen-python-path xend xenperf xsview xenpm xen-tmem-list-parse gtraceview gtracestat xenlockprof
 INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx
 INSTALL_SBIN := $(INSTALL_SBIN-y)
 
@@ -49,7 +49,7 @@
 %.o: %.c $(HDRS) Makefile
 	$(CC) -c $(CFLAGS) -o $@ $<
 
-xen-hvmctx xenperf xenpm gtracestat: %: %.o Makefile
+xen-hvmctx xenperf xenpm gtracestat xenlockprof: %: %.o Makefile
 	$(CC) $(CFLAGS) -o $@ $< $(LDFLAGS) $(LDFLAGS_libxenctrl)
 
 gtraceview: %: %.o Makefile
diff -r 1d7221667204 tools/misc/xenlockprof.c
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/misc/xenlockprof.c	Tue Oct 13 15:52:27 2009 +0200
@@ -0,0 +1,136 @@
+/* -*-  Mode:C; c-basic-offset:4; tab-width:4 -*-
+ ****************************************************************************
+ * (C) 2009 - Juergen Gross - Fujitsu Technology Solutions
+ ****************************************************************************
+ *
+ *        File: xenlockprof.c
+ *      Author: Juergen Gross (juergen.gross@ts.fujitsu.com)
+ *        Date: Oct 2009
+ * 
+ * Description: 
+ */
+
+#include <xenctrl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <string.h>
+
+static int lock_pages(void *addr, size_t len)
+{
+    int e = 0;
+#ifndef __sun__
+    e = mlock(addr, len);
+#endif
+    return (e);
+}
+
+static void unlock_pages(void *addr, size_t len)
+{
+#ifndef __sun__
+    munlock(addr, len);
+#endif
+}
+
+int main(int argc, char *argv[])
+{
+    int                xc_handle;
+    uint32_t           i, j, n;
+    uint64_t           time;
+    double             l, b, sl, sb;
+    char               name[60];
+    xc_lockprof_data_t *data;
+
+    if ((argc > 2) || ((argc == 2) && (strcmp(argv[1], "-r") != 0)))
+    {
+        printf("%s: [-r]\n", argv[0]);
+        printf("no args: print lock profile data\n");
+        printf("    -r : reset profile data\n");
+        return 1;
+    }
+
+    if ((xc_handle = xc_interface_open()) == -1)
+    {
+        fprintf(stderr, "Error opening xc interface: %d (%s)\n",
+                errno, strerror(errno));
+        return 1;
+    }
+
+    if (argc > 1)
+    {
+        if (xc_lockprof_control(xc_handle, XEN_SYSCTL_LOCKPROF_reset, NULL,
+                                NULL, NULL) != 0)
+        {
+            fprintf(stderr, "Error reseting profile data: %d (%s)\n",
+                    errno, strerror(errno));
+            return 1;
+        }
+        return 1;
+    }
+
+    n = 0;
+    if (xc_lockprof_control(xc_handle, XEN_SYSCTL_LOCKPROF_query, &n,
+                            NULL, NULL) != 0)
+    {
+        fprintf(stderr, "Error getting number of profile records: %d (%s)\n",
+                errno, strerror(errno));
+        return 1;
+    }
+
+    n += 32;    /* just to be sure */
+    data = malloc(sizeof(*data) * n);
+    if ((data == NULL) || (lock_pages(data, sizeof(*data) * n) != 0))
+    {
+        fprintf(stderr, "Could not alloc or lock buffers: %d (%s)\n",
+                errno, strerror(errno));
+        return 1;
+    }
+
+    i = n;
+    if ( xc_lockprof_control(xc_handle, XEN_SYSCTL_LOCKPROF_query, &i,
+                             &time, data) != 0)
+    {
+        fprintf(stderr, "Error getting profile records: %d (%s)\n",
+                errno, strerror(errno));
+        return 1;
+    }
+
+    unlock_pages(data, sizeof(*data) * n);
+
+    if (i > n)
+    {
+        printf("data incomplete, %d records are missing!\n\n", i - n);
+        i = n;
+    }
+    sl = 0;
+    sb = 0;
+    for (j = 0; j < i; j++)
+    {
+        switch (data[j].type)
+        {
+        case LOCKPROF_TYPE_GLOBAL:
+            sprintf(name, "global lock %s", data[j].name);
+            break;
+        case LOCKPROF_TYPE_PERDOM:
+            sprintf(name, "domain %d lock %s", data[j].idx, data[j].name);
+            break;
+        default:
+            sprintf(name, "unknown type(%d) %d lock %s", data[j].type,
+                data[j].idx, data[j].name);
+            break;
+        }
+        l = (double)(data[j].lock_time) / 1E+09;
+        b = (double)(data[j].block_time) / 1E+09;
+        sl += l;
+        sb += b;
+        printf("%-50s: lock:%12ld(%20.9fs), block:%12ld(%20.9fs)\n",
+            name, data[j].lock_cnt, l, data[j].block_cnt, b);
+    }
+    l = (double)time / 1E+09;
+    printf("total profiling time: %20.9fs\n", l);
+    printf("total locked time:    %20.9fs\n", sl);
+    printf("total blocked time:   %20.9fs\n", sb);
+
+    return 0;
+}
diff -r 1d7221667204 xen/Rules.mk
--- a/xen/Rules.mk	Thu Oct 08 09:24:32 2009 +0100
+++ b/xen/Rules.mk	Tue Oct 13 15:52:27 2009 +0200
@@ -6,6 +6,7 @@
 verbose       ?= n
 perfc         ?= n
 perfc_arrays  ?= n
+lock_profile  ?= n
 crash_debug   ?= n
 frame_pointer ?= n
 
@@ -49,6 +50,7 @@
 CFLAGS-$(crash_debug)   += -DCRASH_DEBUG
 CFLAGS-$(perfc)         += -DPERF_COUNTERS
 CFLAGS-$(perfc_arrays)  += -DPERF_ARRAYS
+CFLAGS-$(lock_profile)  += -DLOCK_PROFILE
 CFLAGS-$(frame_pointer) += -fno-omit-frame-pointer -DCONFIG_FRAME_POINTER
 
 ifneq ($(max_phys_cpus),)
diff -r 1d7221667204 xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c	Thu Oct 08 09:24:32 2009 +0100
+++ b/xen/arch/x86/domain.c	Tue Oct 13 15:52:27 2009 +0200
@@ -32,6 +32,7 @@
 #include <xen/acpi.h>
 #include <xen/pci.h>
 #include <xen/paging.h>
+#include <public/sysctl.h>
 #include <asm/regs.h>
 #include <asm/mc146818rtc.h>
 #include <asm/system.h>
@@ -185,6 +186,7 @@
 
 void free_domain_struct(struct domain *d)
 {
+    lock_profile_deregister_struct(LOCKPROF_TYPE_PERDOM, d);
     free_xenheap_pages(d, get_order_from_bytes(sizeof(*d)));
 }
 
diff -r 1d7221667204 xen/arch/x86/xen.lds.S
--- a/xen/arch/x86/xen.lds.S	Thu Oct 08 09:24:32 2009 +0100
+++ b/xen/arch/x86/xen.lds.S	Tue Oct 13 15:52:27 2009 +0200
@@ -62,6 +62,13 @@
        *(.data.read_mostly)
   } :text
 
+#ifdef LOCK_PROFILE
+  . = ALIGN(32);
+  __lock_profile_start = .;
+  .lockprofile.data : { *(.lockprofile.data) } :text
+  __lock_profile_end = .;
+#endif
+
   . = ALIGN(4096);             /* Init code and data */
   __init_begin = .;
   .init.text : {
diff -r 1d7221667204 xen/common/domain.c
--- a/xen/common/domain.c	Thu Oct 08 09:24:32 2009 +0100
+++ b/xen/common/domain.c	Tue Oct 13 15:52:27 2009 +0200
@@ -29,6 +29,7 @@
 #include <acpi/cpufreq/cpufreq.h>
 #include <asm/debugger.h>
 #include <public/sched.h>
+#include <public/sysctl.h>
 #include <public/vcpu.h>
 #include <xsm/xsm.h>
 #include <xen/trace.h>
@@ -221,13 +222,15 @@
     memset(d, 0, sizeof(*d));
     d->domain_id = domid;
 
+    lock_profile_register_struct(LOCKPROF_TYPE_PERDOM, d, domid, "Domain");
+
     if ( xsm_alloc_security_domain(d) != 0 )
         goto fail;
     init_status |= INIT_xsm;
 
     atomic_set(&d->refcnt, 1);
-    spin_lock_init(&d->domain_lock);
-    spin_lock_init(&d->page_alloc_lock);
+    spin_lock_init_prof(d, domain_lock);
+    spin_lock_init_prof(d, page_alloc_lock);
     spin_lock_init(&d->shutdown_lock);
     spin_lock_init(&d->hypercall_deadlock_mutex);
     INIT_PAGE_LIST_HEAD(&d->page_list);
diff -r 1d7221667204 xen/common/keyhandler.c
--- a/xen/common/keyhandler.c	Thu Oct 08 09:24:32 2009 +0100
+++ b/xen/common/keyhandler.c	Tue Oct 13 15:52:27 2009 +0200
@@ -347,6 +347,20 @@
 };
 #endif
 
+#ifdef LOCK_PROFILE
+extern void spinlock_profile_printall(unsigned char key);
+static struct keyhandler spinlock_printall_keyhandler = {
+    .diagnostic = 1,
+    .u.fn = spinlock_profile_printall,
+    .desc = "print lock profile info"
+};
+extern void spinlock_profile_reset(unsigned char key);
+static struct keyhandler spinlock_reset_keyhandler = {
+    .u.fn = spinlock_profile_reset,
+    .desc = "reset lock profile info"
+};
+#endif
+
 static void run_all_nonirq_keyhandlers(unsigned long unused)
 {
     /* Fire all the non-IRQ-context diagnostic keyhandlers */
@@ -428,6 +442,12 @@
     register_keyhandler('p', &perfc_printall_keyhandler);
     register_keyhandler('P', &perfc_reset_keyhandler);
 #endif
+
+#ifdef LOCK_PROFILE
+    register_keyhandler('l', &spinlock_printall_keyhandler);
+    register_keyhandler('L', &spinlock_reset_keyhandler);
+#endif
+
 }
 
 /*
diff -r 1d7221667204 xen/common/spinlock.c
--- a/xen/common/spinlock.c	Thu Oct 08 09:24:32 2009 +0100
+++ b/xen/common/spinlock.c	Tue Oct 13 15:52:27 2009 +0200
@@ -1,7 +1,11 @@
+#include <xen/lib.h>
 #include <xen/config.h>
 #include <xen/irq.h>
 #include <xen/smp.h>
+#include <xen/time.h>
 #include <xen/spinlock.h>
+#include <xen/guest_access.h>
+#include <public/sysctl.h>
 #include <asm/processor.h>
 
 #ifndef NDEBUG
@@ -41,56 +45,97 @@
 
 #endif
 
+#ifdef LOCK_PROFILE
+
+#define LOCK_PROFILE_REL                                               \
+    lock->profile.time_hold += NOW() - lock->profile.time_locked;      \
+    lock->profile.lock_cnt++;
+#define LOCK_PROFILE_VAR    s_time_t block = 0
+#define LOCK_PROFILE_BLOCK  block = block ? : NOW();
+#define LOCK_PROFILE_GOT                                               \
+    lock->profile.time_locked = NOW();                                 \
+    if (block)                                                         \
+    {                                                                  \
+        lock->profile.time_block += lock->profile.time_locked - block; \
+        lock->profile.block_cnt++;                                     \
+    }
+
+#else
+
+#define LOCK_PROFILE_REL
+#define LOCK_PROFILE_VAR
+#define LOCK_PROFILE_BLOCK
+#define LOCK_PROFILE_GOT
+
+#endif
+
 void _spin_lock(spinlock_t *lock)
 {
+    LOCK_PROFILE_VAR;
+
     check_lock(&lock->debug);
     while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
+    {
+        LOCK_PROFILE_BLOCK;
         while ( likely(_raw_spin_is_locked(&lock->raw)) )
             cpu_relax();
+    }
+    LOCK_PROFILE_GOT;
 }
 
 void _spin_lock_irq(spinlock_t *lock)
 {
+    LOCK_PROFILE_VAR;
+
     ASSERT(local_irq_is_enabled());
     local_irq_disable();
     check_lock(&lock->debug);
     while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
     {
+        LOCK_PROFILE_BLOCK;
         local_irq_enable();
         while ( likely(_raw_spin_is_locked(&lock->raw)) )
             cpu_relax();
         local_irq_disable();
     }
+    LOCK_PROFILE_GOT;
 }
 
 unsigned long _spin_lock_irqsave(spinlock_t *lock)
 {
     unsigned long flags;
+    LOCK_PROFILE_VAR;
+
     local_irq_save(flags);
     check_lock(&lock->debug);
     while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
     {
+        LOCK_PROFILE_BLOCK;
         local_irq_restore(flags);
         while ( likely(_raw_spin_is_locked(&lock->raw)) )
             cpu_relax();
         local_irq_save(flags);
     }
+    LOCK_PROFILE_GOT;
     return flags;
 }
 
 void _spin_unlock(spinlock_t *lock)
 {
+    LOCK_PROFILE_REL;
     _raw_spin_unlock(&lock->raw);
 }
 
 void _spin_unlock_irq(spinlock_t *lock)
 {
+    LOCK_PROFILE_REL;
     _raw_spin_unlock(&lock->raw);
     local_irq_enable();
 }
 
 void _spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags)
 {
+    LOCK_PROFILE_REL;
     _raw_spin_unlock(&lock->raw);
     local_irq_restore(flags);
 }
@@ -104,13 +149,32 @@
 int _spin_trylock(spinlock_t *lock)
 {
     check_lock(&lock->debug);
+#ifndef LOCK_PROFILE
     return _raw_spin_trylock(&lock->raw);
+#else
+    if (!_raw_spin_trylock(&lock->raw)) return 0;
+    lock->profile.time_locked = NOW();
+    return 1;
+#endif
 }
 
 void _spin_barrier(spinlock_t *lock)
 {
+#ifdef LOCK_PROFILE
+    s_time_t block = NOW();
+    u64      loop = 0;
+
+    check_lock(&lock->debug);
+    do { mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
+    if (loop > 1)
+    {
+        lock->profile.time_block += NOW() - block;
+        lock->profile.block_cnt++;
+    }
+#else
     check_lock(&lock->debug);
     do { mb(); } while ( _raw_spin_is_locked(&lock->raw) );
+#endif
     mb();
 }
 
@@ -248,3 +312,197 @@
     check_lock(&lock->debug);
     return _raw_rw_is_write_locked(&lock->raw);
 }
+
+#ifdef LOCK_PROFILE
+struct lock_profile_anc {
+    struct lock_profile_qhead *head_q;   /* first head of this type */
+    char                      *name;     /* descriptive string for print */
+};
+
+typedef void lock_profile_subfunc(struct lock_profile *, int32_t, int32_t,
+                                  void *);
+
+extern struct lock_profile *__lock_profile_start;
+extern struct lock_profile *__lock_profile_end;
+
+static s_time_t lock_profile_start = 0;
+static struct lock_profile_anc lock_profile_ancs[LOCKPROF_TYPE_N];
+static struct lock_profile_qhead lock_profile_glb_q;
+static spinlock_t lock_profile_lock = SPIN_LOCK_UNLOCKED;
+
+static void spinlock_profile_iterate(lock_profile_subfunc *sub, void *par)
+{
+    int  i;
+    struct lock_profile_qhead *hq;
+    struct lock_profile *eq;
+
+    spin_lock(&lock_profile_lock);
+    for (i = 0; i < LOCKPROF_TYPE_N; i++)
+    {
+        for (hq = lock_profile_ancs[i].head_q; hq; hq = hq->head_q)
+        {
+            for (eq = hq->elem_q; eq; eq = eq->next)
+            {
+                sub(eq, i, hq->idx, par);
+            }
+        }
+    }
+    spin_unlock(&lock_profile_lock);
+    return;
+}
+
+static void spinlock_profile_print_elem(struct lock_profile *data,
+    int32_t type, int32_t idx, void *par)
+{
+    if (type == LOCKPROF_TYPE_GLOBAL)
+        printk("%s %s:\n", lock_profile_ancs[idx].name, data->name);
+    else
+        printk("%s %d %s:\n", lock_profile_ancs[idx].name, idx, data->name);
+    printk("  lock:%12ld(%08X:%08X), block:%12ld(%08X:%08X)\n",
+        data->lock_cnt, (u32)(data->time_hold >> 32), (u32)data->time_hold,
+        data->block_cnt, (u32)(data->time_block >> 32), (u32)data->time_block);
+    return;
+}
+
+void spinlock_profile_printall(unsigned char key)
+{
+    s_time_t now = NOW();
+    s_time_t diff;
+
+    diff = now - lock_profile_start;
+    printk("Xen lock profile info SHOW  (now = %08X:%08X, "
+        "total = %08X:%08X)\n", (u32)(now>>32), (u32)now,
+        (u32)(diff>>32), (u32)diff);
+    spinlock_profile_iterate(spinlock_profile_print_elem, NULL);
+    return;
+}
+
+static void spinlock_profile_reset_elem(struct lock_profile *data,
+    int32_t type, int32_t idx, void *par)
+{
+    data->lock_cnt = 0;
+    data->block_cnt = 0;
+    data->time_hold = 0;
+    data->time_block = 0;
+    return;
+}
+
+void spinlock_profile_reset(unsigned char key)
+{
+    s_time_t now = NOW();
+
+    if ( key != '\0' )
+        printk("Xen lock profile info RESET (now = %08X:%08X)\n",
+            (u32)(now>>32), (u32)now);
+    lock_profile_start = now;
+    spinlock_profile_iterate(spinlock_profile_reset_elem, NULL);
+    return;
+}
+
+typedef struct {
+    xen_sysctl_lockprof_op_t *pc;
+    int                      rc;
+} spinlock_profile_ucopy_t;
+
+static void spinlock_profile_ucopy_elem(struct lock_profile *data,
+    int32_t type, int32_t idx, void *par)
+{
+    spinlock_profile_ucopy_t *p;
+    xen_sysctl_lockprof_data_t elem;
+
+    p = (spinlock_profile_ucopy_t *)par;
+    if (p->rc)
+        return;
+
+    if (p->pc->nr_elem < p->pc->max_elem)
+    {
+        safe_strcpy(elem.name, data->name);
+        elem.type = type;
+        elem.idx = idx;
+        elem.lock_cnt = data->lock_cnt;
+        elem.block_cnt = data->block_cnt;
+        elem.lock_time = data->time_hold;
+        elem.block_time = data->time_block;
+        if (copy_to_guest_offset(p->pc->data, p->pc->nr_elem, &elem, 1))
+        {
+            p->rc = -EFAULT;
+            return;
+        }
+    }
+    p->pc->nr_elem++;
+    
+    return;
+}
+
+/* Dom0 control of lock profiling */
+int spinlock_profile_control(xen_sysctl_lockprof_op_t *pc)
+{
+    int rc;
+    spinlock_profile_ucopy_t par;
+
+    rc = 0;
+    switch (pc->cmd)
+    {
+    case XEN_SYSCTL_LOCKPROF_reset:
+        spinlock_profile_reset('\0');
+        break;
+    case XEN_SYSCTL_LOCKPROF_query:
+	pc->nr_elem = 0;
+	par.rc = 0;
+	par.pc = pc;
+        spinlock_profile_iterate(spinlock_profile_ucopy_elem, &par);
+        pc->time = NOW() - lock_profile_start;
+	rc = par.rc;
+        break;
+    default:
+        rc = -EINVAL;
+        break;
+    }
+    return rc;
+}
+
+void _lock_profile_register_struct(int32_t type,
+    struct lock_profile_qhead *qhead, int32_t idx, char *name)
+{
+    qhead->idx = idx;
+    spin_lock(&lock_profile_lock);
+    qhead->head_q = lock_profile_ancs[type].head_q;
+    lock_profile_ancs[type].head_q = qhead;
+    lock_profile_ancs[type].name = name;
+    spin_unlock(&lock_profile_lock);
+    return;
+}
+
+void _lock_profile_deregister_struct(int32_t type,
+    struct lock_profile_qhead *qhead)
+{
+    struct lock_profile_qhead **q;
+
+    spin_lock(&lock_profile_lock);
+    for (q = &lock_profile_ancs[type].head_q; *q; q = &((*q)->head_q))
+    {
+        if (*q == qhead)
+        {
+            *q = qhead->head_q;
+            break;
+        }
+    }
+    spin_unlock(&lock_profile_lock);
+    return;
+}
+
+static int __init lock_prof_init(void)
+{
+    struct lock_profile **q;
+
+    for (q = &__lock_profile_start; q < &__lock_profile_end; q++)
+    {
+        (*q)->next = lock_profile_glb_q.elem_q;
+	lock_profile_glb_q.elem_q = *q;
+    }
+    _lock_profile_register_struct(LOCKPROF_TYPE_GLOBAL, &lock_profile_glb_q,
+        0, "Global lock");
+    return 0;
+}
+__initcall(lock_prof_init);
+#endif
diff -r 1d7221667204 xen/common/sysctl.c
--- a/xen/common/sysctl.c	Thu Oct 08 09:24:32 2009 +0100
+++ b/xen/common/sysctl.c	Tue Oct 13 15:52:27 2009 +0200
@@ -29,6 +29,9 @@
 
 extern long arch_do_sysctl(
     struct xen_sysctl *op, XEN_GUEST_HANDLE(xen_sysctl_t) u_sysctl);
+#ifdef LOCK_PROFILE
+extern int spinlock_profile_control(xen_sysctl_lockprof_op_t *pc);
+#endif
 
 long do_sysctl(XEN_GUEST_HANDLE(xen_sysctl_t) u_sysctl)
 {
@@ -144,6 +147,15 @@
     break;
 #endif
 
+#ifdef LOCK_PROFILE
+    case XEN_SYSCTL_lockprof_op:
+    {
+        ret = spinlock_profile_control(&op->u.lockprof_op);
+        if ( copy_to_guest(u_sysctl, op, 1) )
+            ret = -EFAULT;
+    }
+    break;
+#endif
     case XEN_SYSCTL_debug_keys:
     {
         char c;
diff -r 1d7221667204 xen/include/public/sysctl.h
--- a/xen/include/public/sysctl.h	Thu Oct 08 09:24:32 2009 +0100
+++ b/xen/include/public/sysctl.h	Tue Oct 13 15:52:27 2009 +0200
@@ -454,6 +454,38 @@
 
 #define PG_OFFLINE_OWNER_SHIFT 16
 
+#define XEN_SYSCTL_lockprof_op       11
+/* Sub-operations: */
+#define XEN_SYSCTL_LOCKPROF_reset 1   /* Reset all profile data to zero. */
+#define XEN_SYSCTL_LOCKPROF_query 2   /* Get lock profile information. */
+/* Record-type: */
+#define LOCKPROF_TYPE_GLOBAL      0   /* global lock, idx meaningless */
+#define LOCKPROF_TYPE_PERDOM      1   /* per-domain lock, idx is domid */
+#define LOCKPROF_TYPE_N           2   /* number of types */
+struct xen_sysctl_lockprof_data {
+    char     name[40];     /* lock name (may include up to 2 %d specifiers) */
+    int32_t  type;         /* LOCKPROF_TYPE_??? */
+    int32_t  idx;          /* index (e.g. domain id) */
+    uint64_t lock_cnt;     /* # of locking succeeded */
+    uint64_t block_cnt;    /* # of wait for lock */
+    uint64_t lock_time;    /* nsecs lock held */
+    uint64_t block_time;   /* nsecs waited for lock */
+};
+typedef struct xen_sysctl_lockprof_data xen_sysctl_lockprof_data_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_data_t);
+struct xen_sysctl_lockprof_op {
+    /* IN variables. */
+    uint32_t       cmd;               /* XEN_SYSCTL_LOCKPROF_??? */
+    uint32_t       max_elem;          /* size of output buffer */
+    /* OUT variables (query only). */
+    uint32_t       nr_elem;           /* number of elements available */
+    uint64_t       time;              /* nsecs of profile measurement */
+    /* profile information (or NULL) */
+    XEN_GUEST_HANDLE_64(xen_sysctl_lockprof_data_t) data;
+};
+typedef struct xen_sysctl_lockprof_op xen_sysctl_lockprof_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_op_t);
+
 struct xen_sysctl {
     uint32_t cmd;
     uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
@@ -471,6 +503,7 @@
         struct xen_sysctl_cpu_hotplug       cpu_hotplug;
         struct xen_sysctl_pm_op             pm_op;
         struct xen_sysctl_page_offline_op   page_offline;
+        struct xen_sysctl_lockprof_op       lockprof_op;
         uint8_t                             pad[128];
     } u;
 };
diff -r 1d7221667204 xen/include/xen/sched.h
--- a/xen/include/xen/sched.h	Thu Oct 08 09:24:32 2009 +0100
+++ b/xen/include/xen/sched.h	Tue Oct 13 15:52:27 2009 +0200
@@ -276,6 +276,8 @@
 
     /* transcendent memory, auto-allocated on first tmem op by each domain */
     void *tmem;
+
+    struct lock_profile_qhead profile_head;
 };
 
 struct domain_setup_info
diff -r 1d7221667204 xen/include/xen/spinlock.h
--- a/xen/include/xen/spinlock.h	Thu Oct 08 09:24:32 2009 +0100
+++ b/xen/include/xen/spinlock.h	Tue Oct 13 15:52:27 2009 +0200
@@ -19,16 +19,109 @@
 #define spin_debug_disable() ((void)0)
 #endif
 
+#ifdef LOCK_PROFILE
+/*
+    lock profiling on:
+
+    Global locks which should be subject to profiling must be declared via
+    DEFINE_SPINLOCK.
+
+    For locks in structures further measures are necessary:
+    - the structure definition must include a profile_head with exactly this
+      name:
+
+      struct lock_profile_qhead   profile_head;
+
+    - the single locks which are subject to profiling have to be initialized
+      via
+
+      spin_lock_init_prof(ptr, lock);
+
+      with ptr being the main structure pointer and lock the spinlock field
+
+    - each structure has to be added to profiling with
+
+      lock_profile_register_struct(type, ptr, idx, print);
+
+      with:
+        type:  something like LOCKPROF_TYPE_PERDOM
+        ptr:   pointer to the structure
+        idx:   index of that structure, e.g. domid
+        print: descriptive string like "domain"
+
+    - removing of a structure is done via
+
+      lock_profile_deregister_struct(type, ptr);
+*/
+
+struct lock_profile {
+    struct lock_profile *next;       /* forward link */
+    char                *name;       /* lock name */
+    u64                 lock_cnt;    /* # of complete locking ops */
+    u64                 block_cnt;   /* # of complete wait for lock */
+    s64                 time_hold;   /* cumulated lock time */
+    s64                 time_block;  /* cumulated wait time */
+    s64                 time_locked; /* system time of last locking */
+};
+
+struct lock_profile_qhead {
+    struct lock_profile_qhead *head_q; /* next head of this type */
+    struct lock_profile       *elem_q; /* first element in q */
+    int32_t                   idx;     /* index for printout */
+};
+
+#define _LOCK_PROFILE(name) { 0, name, 0, 0, 0, 0, 0 }
+#define _LOCK_NO_PROFILE _LOCK_PROFILE(NULL)
+#define _LOCK_PROFILE_PTR(name)                                               \
+    static struct lock_profile *__lock_profile_##name __attribute_used__      \
+    __attribute__ ((__section__(".lockprofile.data"))) = &name.profile
+#define _SPIN_LOCK_UNLOCKED(x) { _RAW_SPIN_LOCK_UNLOCKED, 0xfffu, 0,          \
+                                 _LOCK_DEBUG, x }
+#define SPIN_LOCK_UNLOCKED _SPIN_LOCK_UNLOCKED(_LOCK_NO_PROFILE)
+#define DEFINE_SPINLOCK(l)                                                    \
+    spinlock_t l = _SPIN_LOCK_UNLOCKED(_LOCK_PROFILE(#l));                    \
+    _LOCK_PROFILE_PTR(l)
+
+#define spin_lock_init_prof(s, l)                                             \
+    do {                                                                      \
+        (s)->l = (spinlock_t)_SPIN_LOCK_UNLOCKED(_LOCK_PROFILE(#l));          \
+	(s)->l.profile.next = (s)->profile_head.elem_q;                       \
+	(s)->profile_head.elem_q = &((s)->l.profile);                         \
+    } while(0)
+
+void _lock_profile_register_struct(int32_t, struct lock_profile_qhead *,      \
+                                   int32_t, char *);
+void _lock_profile_deregister_struct(int32_t, struct lock_profile_qhead *);
+
+#define lock_profile_register_struct(type, ptr, idx, print)                   \
+    _lock_profile_register_struct(type, &((ptr)->profile_head), idx, print)
+#define lock_profile_deregister_struct(type, ptr)                             \
+    _lock_profile_deregister_struct(type, &((ptr)->profile_head))
+
+#else
+
+struct lock_profile { };
+struct lock_profile_qhead { };
+
+#define SPIN_LOCK_UNLOCKED                                                    \
+    { _RAW_SPIN_LOCK_UNLOCKED, 0xfffu, 0, _LOCK_DEBUG, { } }
+#define DEFINE_SPINLOCK(l) spinlock_t l = SPIN_LOCK_UNLOCKED
+
+#define spin_lock_init_prof(s, l) spin_lock_init(&((s)->l))
+#define lock_profile_register_struct(type, ptr, idx, print)
+#define lock_profile_deregister_struct(type, ptr)
+
+#endif
+
 typedef struct {
     raw_spinlock_t raw;
     u16 recurse_cpu:12;
     u16 recurse_cnt:4;
     struct lock_debug debug;
+    struct lock_profile profile;
 } spinlock_t;
 
 
-#define SPIN_LOCK_UNLOCKED { _RAW_SPIN_LOCK_UNLOCKED, 0xfffu, 0, _LOCK_DEBUG }
-#define DEFINE_SPINLOCK(l) spinlock_t l = SPIN_LOCK_UNLOCKED
 #define spin_lock_init(l) (*(l) = (spinlock_t)SPIN_LOCK_UNLOCKED)
 
 typedef struct {

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch] support of lock profiling in Xen
  2009-10-13 13:56                   ` Juergen Gross
@ 2009-10-13 15:49                     ` Keir Fraser
  0 siblings, 0 replies; 17+ messages in thread
From: Keir Fraser @ 2009-10-13 15:49 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com

On 13/10/2009 14:56, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

>> Requires a 'struct lock_profile lock_profile' or similar to be declared
>> within the domain structure. No need to hide that definition within a macro
>> I expect, nor to specify in advance the 'max # locks'.
> 
> Round 3 then :-)
> 
> I think a rw-lock is not needed, as reading is a very rare event which will
> be executed in parallel less then once a year. ;-)
> 
> Better?

Yeah, this is fine I think.

 -- Keir

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2009-10-13 15:49 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-08  7:48 [Patch] support of lock profiling in Xen Juergen Gross
2009-10-08  8:15 ` Keir Fraser
2009-10-08  8:27   ` Juergen Gross
2009-10-08  8:38     ` Keir Fraser
2009-10-08  8:40     ` Keir Fraser
2009-10-08  9:04       ` Juergen Gross
2009-10-08  9:21         ` Keir Fraser
2009-10-08  9:35           ` Juergen Gross
2009-10-08  9:47             ` Keir Fraser
2009-10-09  8:27               ` Juergen Gross
2009-10-09  9:22                 ` Keir Fraser
2009-10-13 13:56                   ` Juergen Gross
2009-10-13 15:49                     ` Keir Fraser
2009-10-08  8:21 ` Keir Fraser
2009-10-08  8:38   ` Juergen Gross
2009-10-08  9:17     ` Keir Fraser
2009-10-09 10:35     ` George Dunlap

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.