public inbox for kvm@vger.kernel.org
* [PATCH v2 0/5] Improve KVM per VM monitoring
@ 2016-05-18 11:26 Janosch Frank
  2016-05-18 11:26 ` [PATCH v2 1/5] tools: Add kvm_stat vm monitor script Janosch Frank
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Janosch Frank @ 2016-05-18 11:26 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, dan.carpenter, frankja

This patchset introduces per-VM KVM exit statistics via debugfs and
moves kvm_stat, a tool that displays VM statistics, from qemu to
tools/.

The new debugfs per VM statistics are an alternative to the already
available VM tracepoints. They are easier to read and have low
overhead.
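
As a rough illustration of why the debugfs files are easy to read,
here is a minimal sketch (Python 3, not part of the patches) that
collects every counter file in a stat directory into a dict. The
helper name read_counters is hypothetical, and it assumes each file
holds a single decimal integer, which is how the kvm_stat script in
patch 1 reads them.

```python
import os


def read_counters(stat_dir):
    """Return {file name: integer value} for every regular file in stat_dir."""
    counters = {}
    for name in sorted(os.listdir(stat_dir)):
        path = os.path.join(stat_dir, name)
        if not os.path.isfile(path):
            # Skip subdirectories; only plain counter files are read.
            continue
        with open(path) as stat_file:
            # Assumption: one decimal integer per file (trailing newline is fine).
            counters[name] = int(stat_file.read())
    return counters
```

On a host with debugfs mounted, read_counters('/sys/kernel/debug/kvm')
would be the typical call; reading that directory normally needs root.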

The kvm_stat python script is moved into the kernel tree so that the
script version always matches the kernel version it runs on. qemu
cannot guarantee this, as it supports a wide range of Linux kernel
versions.

v1 -> v2
   * Rebase
   * Added kvm_stat to tools Makefile

Janosch Frank (5):
  tools: Add kvm_stat vm monitor script
  MAINTAINERS: Add kvm tools
  KVM: Create debugfs dir and stat files for each VM
  tools: kvm_stat: Introduce pid monitoring
  tools: kvm_stat: Add documentation

 MAINTAINERS                 |    1 +
 include/linux/kvm_host.h    |    7 +
 tools/Makefile              |    6 +-
 tools/kvm/kvm_stat/Makefile |    5 +
 tools/kvm/kvm_stat/kvm_stat | 1125 +++++++++++++++++++++++++++++++++++++++++++
 virt/kvm/kvm_main.c         |  187 ++++++-
 6 files changed, 1320 insertions(+), 11 deletions(-)
 create mode 100644 tools/kvm/kvm_stat/Makefile
 create mode 100755 tools/kvm/kvm_stat/kvm_stat

-- 
2.3.0



* [PATCH v2 1/5] tools: Add kvm_stat vm monitor script
  2016-05-18 11:26 [PATCH v2 0/5] Improve KVM per VM monitoring Janosch Frank
@ 2016-05-18 11:26 ` Janosch Frank
  2016-05-18 11:26 ` [PATCH v2 2/5] MAINTAINERS: Add kvm tools Janosch Frank
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Janosch Frank @ 2016-05-18 11:26 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, dan.carpenter, frankja

This tool displays kvm vm exit statistics to ease vm monitoring. It
takes its data from the kvm debugfs files or the vm tracepoints and
outputs them as a curses ui or simple text.

It was moved from qemu because it depends on the kernel, whereas qemu
works with a large number of kernel versions, some of which may break
the script.
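
For readers who want the core bookkeeping without the curses code:
the script keeps, per event, the cumulative total and the change
since the previous read. The following is a minimal Python 3 sketch
of that idea, not the script itself (which targets Python 2); the
function name update_values is hypothetical.

```python
def update_values(previous, snapshot):
    """Return {key: (total, delta)} from the last result and new totals.

    previous maps key -> (total, delta) as returned by the last call
    (pass an empty dict on the first call); snapshot maps key -> the
    current cumulative counter value.
    """
    updated = {}
    for key, total in snapshot.items():
        # Events not seen before start from zero, so the first delta
        # equals the total.
        old_total = previous.get(key, (0, 0))[0]
        updated[key] = (total, total - old_total)
    return updated
```

Dividing the delta by the polling interval gives the per-second rate
shown in the "Current" column of the UI.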

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
---
 tools/Makefile              |   6 +-
 tools/kvm/kvm_stat/Makefile |   5 +
 tools/kvm/kvm_stat/kvm_stat | 825 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 835 insertions(+), 1 deletion(-)
 create mode 100644 tools/kvm/kvm_stat/Makefile
 create mode 100755 tools/kvm/kvm_stat/kvm_stat

diff --git a/tools/Makefile b/tools/Makefile
index 60c7e6c..0619446 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -16,6 +16,7 @@ help:
 	@echo '  gpio                   - GPIO tools'
 	@echo '  hv                     - tools used when in Hyper-V clients'
 	@echo '  iio                    - IIO tools'
+	@echo '  kvm_stat               - top-like utility for displaying kvm statistics'
 	@echo '  lguest                 - a minimal 32-bit x86 hypervisor'
 	@echo '  net                    - misc networking tools'
 	@echo '  perf                   - Linux performance measurement and analysis tool'
@@ -110,10 +111,13 @@ tmon_install:
 freefall_install:
 	$(call descend,laptop/$(@:_install=),install)
 
+kvm_stat_install:
+	$(call descend,kvm/$(@:_install=),install)
+
 install: acpi_install cgroup_install cpupower_install hv_install firewire_install lguest_install \
 		perf_install selftests_install turbostat_install usb_install \
 		virtio_install vm_install net_install x86_energy_perf_policy_install \
-		tmon_install freefall_install objtool_install
+		tmon_install freefall_install objtool_install kvm_stat_install
 
 acpi_clean:
 	$(call descend,power/acpi,clean)
diff --git a/tools/kvm/kvm_stat/Makefile b/tools/kvm/kvm_stat/Makefile
new file mode 100644
index 0000000..f445f1e
--- /dev/null
+++ b/tools/kvm/kvm_stat/Makefile
@@ -0,0 +1,5 @@
+BINDIR=usr/bin
+
+install:
+	- mkdir -p $(INSTALL_ROOT)/$(BINDIR)
+	- install -m 755 -p "kvm_stat" "$(INSTALL_ROOT)/$(BINDIR)/$(TARGET)"
diff --git a/tools/kvm/kvm_stat/kvm_stat b/tools/kvm/kvm_stat/kvm_stat
new file mode 100755
index 0000000..769d884
--- /dev/null
+++ b/tools/kvm/kvm_stat/kvm_stat
@@ -0,0 +1,825 @@
+#!/usr/bin/python
+#
+# top-like utility for displaying kvm statistics
+#
+# Copyright 2006-2008 Qumranet Technologies
+# Copyright 2008-2011 Red Hat, Inc.
+#
+# Authors:
+#  Avi Kivity <avi@redhat.com>
+#
+# This work is licensed under the terms of the GNU GPL, version 2.  See
+# the COPYING file in the top-level directory.
+
+import curses
+import sys
+import os
+import time
+import optparse
+import ctypes
+import fcntl
+import resource
+import struct
+import re
+from collections import defaultdict
+from time import sleep
+
+VMX_EXIT_REASONS = {
+    'EXCEPTION_NMI':        0,
+    'EXTERNAL_INTERRUPT':   1,
+    'TRIPLE_FAULT':         2,
+    'PENDING_INTERRUPT':    7,
+    'NMI_WINDOW':           8,
+    'TASK_SWITCH':          9,
+    'CPUID':                10,
+    'HLT':                  12,
+    'INVLPG':               14,
+    'RDPMC':                15,
+    'RDTSC':                16,
+    'VMCALL':               18,
+    'VMCLEAR':              19,
+    'VMLAUNCH':             20,
+    'VMPTRLD':              21,
+    'VMPTRST':              22,
+    'VMREAD':               23,
+    'VMRESUME':             24,
+    'VMWRITE':              25,
+    'VMOFF':                26,
+    'VMON':                 27,
+    'CR_ACCESS':            28,
+    'DR_ACCESS':            29,
+    'IO_INSTRUCTION':       30,
+    'MSR_READ':             31,
+    'MSR_WRITE':            32,
+    'INVALID_STATE':        33,
+    'MWAIT_INSTRUCTION':    36,
+    'MONITOR_INSTRUCTION':  39,
+    'PAUSE_INSTRUCTION':    40,
+    'MCE_DURING_VMENTRY':   41,
+    'TPR_BELOW_THRESHOLD':  43,
+    'APIC_ACCESS':          44,
+    'EPT_VIOLATION':        48,
+    'EPT_MISCONFIG':        49,
+    'WBINVD':               54,
+    'XSETBV':               55,
+    'APIC_WRITE':           56,
+    'INVPCID':              58,
+}
+
+SVM_EXIT_REASONS = {
+    'READ_CR0':       0x000,
+    'READ_CR3':       0x003,
+    'READ_CR4':       0x004,
+    'READ_CR8':       0x008,
+    'WRITE_CR0':      0x010,
+    'WRITE_CR3':      0x013,
+    'WRITE_CR4':      0x014,
+    'WRITE_CR8':      0x018,
+    'READ_DR0':       0x020,
+    'READ_DR1':       0x021,
+    'READ_DR2':       0x022,
+    'READ_DR3':       0x023,
+    'READ_DR4':       0x024,
+    'READ_DR5':       0x025,
+    'READ_DR6':       0x026,
+    'READ_DR7':       0x027,
+    'WRITE_DR0':      0x030,
+    'WRITE_DR1':      0x031,
+    'WRITE_DR2':      0x032,
+    'WRITE_DR3':      0x033,
+    'WRITE_DR4':      0x034,
+    'WRITE_DR5':      0x035,
+    'WRITE_DR6':      0x036,
+    'WRITE_DR7':      0x037,
+    'EXCP_BASE':      0x040,
+    'INTR':           0x060,
+    'NMI':            0x061,
+    'SMI':            0x062,
+    'INIT':           0x063,
+    'VINTR':          0x064,
+    'CR0_SEL_WRITE':  0x065,
+    'IDTR_READ':      0x066,
+    'GDTR_READ':      0x067,
+    'LDTR_READ':      0x068,
+    'TR_READ':        0x069,
+    'IDTR_WRITE':     0x06a,
+    'GDTR_WRITE':     0x06b,
+    'LDTR_WRITE':     0x06c,
+    'TR_WRITE':       0x06d,
+    'RDTSC':          0x06e,
+    'RDPMC':          0x06f,
+    'PUSHF':          0x070,
+    'POPF':           0x071,
+    'CPUID':          0x072,
+    'RSM':            0x073,
+    'IRET':           0x074,
+    'SWINT':          0x075,
+    'INVD':           0x076,
+    'PAUSE':          0x077,
+    'HLT':            0x078,
+    'INVLPG':         0x079,
+    'INVLPGA':        0x07a,
+    'IOIO':           0x07b,
+    'MSR':            0x07c,
+    'TASK_SWITCH':    0x07d,
+    'FERR_FREEZE':    0x07e,
+    'SHUTDOWN':       0x07f,
+    'VMRUN':          0x080,
+    'VMMCALL':        0x081,
+    'VMLOAD':         0x082,
+    'VMSAVE':         0x083,
+    'STGI':           0x084,
+    'CLGI':           0x085,
+    'SKINIT':         0x086,
+    'RDTSCP':         0x087,
+    'ICEBP':          0x088,
+    'WBINVD':         0x089,
+    'MONITOR':        0x08a,
+    'MWAIT':          0x08b,
+    'MWAIT_COND':     0x08c,
+    'XSETBV':         0x08d,
+    'NPF':            0x400,
+}
+
+# EC definition of HSR (from arch/arm64/include/asm/kvm_arm.h)
+AARCH64_EXIT_REASONS = {
+    'UNKNOWN':      0x00,
+    'WFI':          0x01,
+    'CP15_32':      0x03,
+    'CP15_64':      0x04,
+    'CP14_MR':      0x05,
+    'CP14_LS':      0x06,
+    'FP_ASIMD':     0x07,
+    'CP10_ID':      0x08,
+    'CP14_64':      0x0C,
+    'ILL_ISS':      0x0E,
+    'SVC32':        0x11,
+    'HVC32':        0x12,
+    'SMC32':        0x13,
+    'SVC64':        0x15,
+    'HVC64':        0x16,
+    'SMC64':        0x17,
+    'SYS64':        0x18,
+    'IABT':         0x20,
+    'IABT_HYP':     0x21,
+    'PC_ALIGN':     0x22,
+    'DABT':         0x24,
+    'DABT_HYP':     0x25,
+    'SP_ALIGN':     0x26,
+    'FP_EXC32':     0x28,
+    'FP_EXC64':     0x2C,
+    'SERROR':       0x2F,
+    'BREAKPT':      0x30,
+    'BREAKPT_HYP':  0x31,
+    'SOFTSTP':      0x32,
+    'SOFTSTP_HYP':  0x33,
+    'WATCHPT':      0x34,
+    'WATCHPT_HYP':  0x35,
+    'BKPT32':       0x38,
+    'VECTOR32':     0x3A,
+    'BRK64':        0x3C,
+}
+
+# From include/uapi/linux/kvm.h, KVM_EXIT_xxx
+USERSPACE_EXIT_REASONS = {
+    'UNKNOWN':          0,
+    'EXCEPTION':        1,
+    'IO':               2,
+    'HYPERCALL':        3,
+    'DEBUG':            4,
+    'HLT':              5,
+    'MMIO':             6,
+    'IRQ_WINDOW_OPEN':  7,
+    'SHUTDOWN':         8,
+    'FAIL_ENTRY':       9,
+    'INTR':             10,
+    'SET_TPR':          11,
+    'TPR_ACCESS':       12,
+    'S390_SIEIC':       13,
+    'S390_RESET':       14,
+    'DCR':              15,
+    'NMI':              16,
+    'INTERNAL_ERROR':   17,
+    'OSI':              18,
+    'PAPR_HCALL':       19,
+    'S390_UCONTROL':    20,
+    'WATCHDOG':         21,
+    'S390_TSCH':        22,
+    'EPR':              23,
+    'SYSTEM_EVENT':     24,
+}
+
+IOCTL_NUMBERS = {
+    'SET_FILTER':  0x40082406,
+    'ENABLE':      0x00002400,
+    'DISABLE':     0x00002401,
+    'RESET':       0x00002403,
+}
+
+class Arch(object):
+    """Class that encapsulates global architecture specific data like
+    syscall and ioctl numbers.
+
+    """
+    @staticmethod
+    def get_arch():
+        machine = os.uname()[4]
+
+        if machine.startswith('ppc'):
+            return ArchPPC()
+        elif machine.startswith('aarch64'):
+            return ArchA64()
+        elif machine.startswith('s390'):
+            return ArchS390()
+        else:
+            # X86_64
+            for line in open('/proc/cpuinfo'):
+                if not line.startswith('flags'):
+                    continue
+
+                flags = line.split()
+                if 'vmx' in flags:
+                    return ArchX86(VMX_EXIT_REASONS)
+                if 'svm' in flags:
+                    return ArchX86(SVM_EXIT_REASONS)
+                return
+
+class ArchX86(Arch):
+    def __init__(self, exit_reasons):
+        self.sc_perf_evt_open = 298
+        self.ioctl_numbers = IOCTL_NUMBERS
+        self.exit_reasons = exit_reasons
+
+class ArchPPC(Arch):
+    def __init__(self):
+        self.sc_perf_evt_open = 319
+        self.ioctl_numbers = IOCTL_NUMBERS
+        self.ioctl_numbers['ENABLE'] = 0x20002400
+        self.ioctl_numbers['DISABLE'] = 0x20002401
+
+        # PPC comes in 32 and 64 bit and some generated ioctl
+        # numbers depend on the wordsize.
+        char_ptr_size = ctypes.sizeof(ctypes.c_char_p)
+        self.ioctl_numbers['SET_FILTER'] = 0x80002406 | char_ptr_size << 16
+
+class ArchA64(Arch):
+    def __init__(self):
+        self.sc_perf_evt_open = 241
+        self.ioctl_numbers = IOCTL_NUMBERS
+        self.exit_reasons = AARCH64_EXIT_REASONS
+
+class ArchS390(Arch):
+    def __init__(self):
+        self.sc_perf_evt_open = 331
+        self.ioctl_numbers = IOCTL_NUMBERS
+        self.exit_reasons = None
+
+ARCH = Arch.get_arch()
+
+
+def walkdir(path):
+    """Returns os.walk() data for specified directory.
+
+    As it is only a wrapper it returns the same 3-tuple of (dirpath,
+    dirnames, filenames).
+    """
+    return next(os.walk(path))
+
+
+def parse_int_list(list_string):
+    """Returns an int list from a string of comma separated integers and
+    integer ranges."""
+    integers = []
+    members = list_string.split(',')
+
+    for member in members:
+        if '-' not in member:
+            integers.append(int(member))
+        else:
+            int_range = member.split('-')
+            integers.extend(range(int(int_range[0]),
+                                  int(int_range[1]) + 1))
+
+    return integers
+
+
+def get_online_cpus():
+    with open('/sys/devices/system/cpu/online') as cpu_list:
+        cpu_string = cpu_list.readline()
+        return parse_int_list(cpu_string)
+
+
+def get_filters():
+    filters = {}
+    filters['kvm_userspace_exit'] = ('reason', USERSPACE_EXIT_REASONS)
+    if ARCH.exit_reasons:
+        filters['kvm_exit'] = ('exit_reason', ARCH.exit_reasons)
+    return filters
+
+libc = ctypes.CDLL('libc.so.6', use_errno=True)
+syscall = libc.syscall
+
+class perf_event_attr(ctypes.Structure):
+    _fields_ = [('type', ctypes.c_uint32),
+                ('size', ctypes.c_uint32),
+                ('config', ctypes.c_uint64),
+                ('sample_freq', ctypes.c_uint64),
+                ('sample_type', ctypes.c_uint64),
+                ('read_format', ctypes.c_uint64),
+                ('flags', ctypes.c_uint64),
+                ('wakeup_events', ctypes.c_uint32),
+                ('bp_type', ctypes.c_uint32),
+                ('bp_addr', ctypes.c_uint64),
+                ('bp_len', ctypes.c_uint64),
+                ]
+
+    def __init__(self):
+        super(self.__class__, self).__init__()
+        self.type = PERF_TYPE_TRACEPOINT
+        self.size = ctypes.sizeof(self)
+        self.read_format = PERF_FORMAT_GROUP
+
+def perf_event_open(attr, pid, cpu, group_fd, flags):
+    return syscall(ARCH.sc_perf_evt_open, ctypes.pointer(attr),
+                   ctypes.c_int(pid), ctypes.c_int(cpu),
+                   ctypes.c_int(group_fd), ctypes.c_long(flags))
+
+PERF_TYPE_TRACEPOINT = 2
+PERF_FORMAT_GROUP = 1 << 3
+
+PATH_DEBUGFS_TRACING = '/sys/kernel/debug/tracing'
+PATH_DEBUGFS_KVM = '/sys/kernel/debug/kvm'
+
+class Group(object):
+    def __init__(self):
+        self.events = []
+
+    def add_event(self, event):
+        self.events.append(event)
+
+    def read(self):
+        length = 8 * (1 + len(self.events))
+        read_format = 'xxxxxxxx' + 'Q' * len(self.events)
+        return dict(zip([event.name for event in self.events],
+                        struct.unpack(read_format,
+                                      os.read(self.events[0].fd, length))))
+
+class Event(object):
+    def __init__(self, name, group, trace_cpu, trace_point, trace_filter,
+                 trace_set='kvm'):
+        self.name = name
+        self.fd = None
+        self.setup_event(group, trace_cpu, trace_point, trace_filter,
+                         trace_set)
+
+    def setup_event_attribute(self, trace_set, trace_point):
+        id_path = os.path.join(PATH_DEBUGFS_TRACING, 'events', trace_set,
+                               trace_point, 'id')
+
+        event_attr = perf_event_attr()
+        event_attr.config = int(open(id_path).read())
+        return event_attr
+
+    def setup_event(self, group, trace_cpu, trace_point, trace_filter,
+                    trace_set):
+        event_attr = self.setup_event_attribute(trace_set, trace_point)
+
+        group_leader = -1
+        if group.events:
+            group_leader = group.events[0].fd
+
+        fd = perf_event_open(event_attr, -1, trace_cpu,
+                             group_leader, 0)
+        if fd == -1:
+            err = ctypes.get_errno()
+            raise OSError(err, os.strerror(err),
+                          'while calling sys_perf_event_open().')
+
+        if trace_filter:
+            fcntl.ioctl(fd, ARCH.ioctl_numbers['SET_FILTER'],
+                        trace_filter)
+
+        self.fd = fd
+
+    def enable(self):
+        fcntl.ioctl(self.fd, ARCH.ioctl_numbers['ENABLE'], 0)
+
+    def disable(self):
+        fcntl.ioctl(self.fd, ARCH.ioctl_numbers['DISABLE'], 0)
+
+    def reset(self):
+        fcntl.ioctl(self.fd, ARCH.ioctl_numbers['RESET'], 0)
+
+class TracepointProvider(object):
+    def __init__(self):
+        self.group_leaders = []
+        self.filters = get_filters()
+        self._fields = self.get_available_fields()
+        self.setup_traces()
+        self.fields = self._fields
+
+    def get_available_fields(self):
+        path = os.path.join(PATH_DEBUGFS_TRACING, 'events', 'kvm')
+        fields = walkdir(path)[1]
+        extra = []
+        for field in fields:
+            if field in self.filters:
+                filter_name_, filter_dicts = self.filters[field]
+                for name in filter_dicts:
+                    extra.append(field + '(' + name + ')')
+        fields += extra
+        return fields
+
+    def setup_traces(self):
+        cpus = get_online_cpus()
+
+        # The constant is needed as a buffer for python libs, std
+        # streams and other files that the script opens.
+        newlim = len(cpus) * len(self._fields) + 50
+        try:
+            softlim_, hardlim = resource.getrlimit(resource.RLIMIT_NOFILE)
+
+            if hardlim < newlim:
+                # Now we need CAP_SYS_RESOURCE, to increase the hard limit.
+                resource.setrlimit(resource.RLIMIT_NOFILE, (newlim, newlim))
+            else:
+                # Raising the soft limit is sufficient.
+                resource.setrlimit(resource.RLIMIT_NOFILE, (newlim, hardlim))
+
+        except ValueError:
+            sys.exit("NOFILE rlimit could not be raised to {0}".format(newlim))
+
+        for cpu in cpus:
+            group = Group()
+            for name in self._fields:
+                tracepoint = name
+                tracefilter = None
+                match = re.match(r'(.*)\((.*)\)', name)
+                if match:
+                    tracepoint, sub = match.groups()
+                    tracefilter = ('%s==%d\0' %
+                                   (self.filters[tracepoint][0],
+                                    self.filters[tracepoint][1][sub]))
+
+                group.add_event(Event(name=name,
+                                      group=group,
+                                      trace_cpu=cpu,
+                                      trace_point=tracepoint,
+                                      trace_filter=tracefilter))
+            self.group_leaders.append(group)
+
+    def available_fields(self):
+        return self.get_available_fields()
+
+    @property
+    def fields(self):
+        return self._fields
+
+    @fields.setter
+    def fields(self, fields):
+        self._fields = fields
+        for group in self.group_leaders:
+            for index, event in enumerate(group.events):
+                if event.name in fields:
+                    event.reset()
+                    event.enable()
+                else:
+                    # Do not disable the group leader.
+                    # It would disable all of its events.
+                    if index != 0:
+                        event.disable()
+
+    def read(self):
+        ret = defaultdict(int)
+        for group in self.group_leaders:
+            for name, val in group.read().iteritems():
+                if name in self._fields:
+                    ret[name] += val
+        return ret
+
+class DebugfsProvider(object):
+    def __init__(self):
+        self._fields = self.get_available_fields()
+
+    def get_available_fields(self):
+        return walkdir(PATH_DEBUGFS_KVM)[2]
+
+    @property
+    def fields(self):
+        return self._fields
+
+    @fields.setter
+    def fields(self, fields):
+        self._fields = fields
+
+    def read(self):
+        def val(key):
+            return int(file(PATH_DEBUGFS_KVM + '/' + key).read())
+        return dict([(key, val(key)) for key in self._fields])
+
+class Stats(object):
+    def __init__(self, providers, fields=None):
+        self.providers = providers
+        self._fields_filter = fields
+        self.values = {}
+        self.update_provider_filters()
+
+    def update_provider_filters(self):
+        def wanted(key):
+            if not self._fields_filter:
+                return True
+            return re.match(self._fields_filter, key) is not None
+
+        # As we reset the counters when updating the fields we can
+        # also clear the cache of old values.
+        self.values = {}
+        for provider in self.providers:
+            provider_fields = [key for key in provider.get_available_fields()
+                               if wanted(key)]
+            provider.fields = provider_fields
+
+    @property
+    def fields_filter(self):
+        return self._fields_filter
+
+    @fields_filter.setter
+    def fields_filter(self, fields_filter):
+        self._fields_filter = fields_filter
+        self.update_provider_filters()
+
+    def get(self):
+        for provider in self.providers:
+            new = provider.read()
+            for key in provider.fields:
+                oldval = self.values.get(key, (0, 0))
+                newval = new.get(key, 0)
+                newdelta = None
+                if oldval is not None:
+                    newdelta = newval - oldval[0]
+                self.values[key] = (newval, newdelta)
+        return self.values
+
+LABEL_WIDTH = 40
+NUMBER_WIDTH = 10
+
+class Tui(object):
+    def __init__(self, stats):
+        self.stats = stats
+        self.screen = None
+        self.drilldown = False
+        self.update_drilldown()
+
+    def __enter__(self):
+        """Initialises curses for later use.  Based on curses.wrapper
+           implementation from the Python standard library."""
+        self.screen = curses.initscr()
+        curses.noecho()
+        curses.cbreak()
+
+        # The try/except works around a minor bit of
+        # over-conscientiousness in the curses module, the error
+        # return from C start_color() is ignorable.
+        try:
+            curses.start_color()
+        except:
+            pass
+
+        curses.use_default_colors()
+        return self
+
+    def __exit__(self, *exception):
+        """Resets the terminal to its normal state.  Based on curses.wrapper
+           implementation from the Python standard library."""
+        if self.screen:
+            self.screen.keypad(0)
+            curses.echo()
+            curses.nocbreak()
+            curses.endwin()
+
+    def update_drilldown(self):
+        if not self.stats.fields_filter:
+            self.stats.fields_filter = r'^[^\(]*$'
+
+        elif self.stats.fields_filter == r'^[^\(]*$':
+            self.stats.fields_filter = None
+
+    def refresh(self, sleeptime):
+        self.screen.erase()
+        self.screen.addstr(0, 0, 'kvm statistics - summary', curses.A_BOLD)
+        self.screen.addstr(2, 1, 'Event')
+        self.screen.addstr(2, 1 + LABEL_WIDTH + NUMBER_WIDTH -
+                           len('Total'), 'Total')
+        self.screen.addstr(2, 1 + LABEL_WIDTH + NUMBER_WIDTH + 8 -
+                           len('Current'), 'Current')
+        row = 3
+        stats = self.stats.get()
+        def sortkey(x):
+            if stats[x][1]:
+                return (-stats[x][1], -stats[x][0])
+            else:
+                return (0, -stats[x][0])
+        for key in sorted(stats.keys(), key=sortkey):
+
+            if row >= self.screen.getmaxyx()[0]:
+                break
+            values = stats[key]
+            if not values[0] and not values[1]:
+                break
+            col = 1
+            self.screen.addstr(row, col, key)
+            col += LABEL_WIDTH
+            self.screen.addstr(row, col, '%10d' % (values[0],))
+            col += NUMBER_WIDTH
+            if values[1] is not None:
+                self.screen.addstr(row, col, '%8d' % (values[1] / sleeptime,))
+            row += 1
+        self.screen.refresh()
+
+    def show_filter_selection(self):
+        while True:
+            self.screen.erase()
+            self.screen.addstr(0, 0,
+                               "Show statistics for events matching a regex.",
+                               curses.A_BOLD)
+            self.screen.addstr(2, 0,
+                               "Current regex: {0}"
+                               .format(self.stats.fields_filter))
+            self.screen.addstr(3, 0, "New regex: ")
+            curses.echo()
+            regex = self.screen.getstr()
+            curses.noecho()
+            if len(regex) == 0:
+                return
+            try:
+                re.compile(regex)
+                self.stats.fields_filter = regex
+                return
+            except re.error:
+                continue
+
+    def show_stats(self):
+        sleeptime = 0.25
+        while True:
+            self.refresh(sleeptime)
+            curses.halfdelay(int(sleeptime * 10))
+            sleeptime = 3
+            try:
+                char = self.screen.getkey()
+                if char == 'x':
+                    self.drilldown = not self.drilldown
+                    self.update_drilldown()
+                if char == 'q':
+                    break
+                if char == 'f':
+                    self.show_filter_selection()
+            except KeyboardInterrupt:
+                break
+            except curses.error:
+                continue
+
+def batch(stats):
+    s = stats.get()
+    time.sleep(1)
+    s = stats.get()
+    for key in sorted(s.keys()):
+        values = s[key]
+        print '%-42s%10d%10d' % (key, values[0], values[1])
+
+def log(stats):
+    keys = sorted(stats.get().iterkeys())
+    def banner():
+        for k in keys:
+            print '%s' % k,
+        print
+    def statline():
+        s = stats.get()
+        for k in keys:
+            print ' %9d' % s[k][1],
+        print
+    line = 0
+    banner_repeat = 20
+    while True:
+        time.sleep(1)
+        if line % banner_repeat == 0:
+            banner()
+        statline()
+        line += 1
+
+def get_options():
+    description_text = """
+This script displays various statistics about VMs running under KVM.
+The statistics are gathered from the KVM debugfs entries and / or the
+currently available perf traces.
+
+The monitoring takes additional cpu cycles and might affect the VM's
+performance.
+
+Requirements:
+- Access to:
+    /sys/kernel/debug/kvm
+    /sys/kernel/debug/tracing/events/*
+    /proc/pid/task
+- /proc/sys/kernel/perf_event_paranoid < 1 if user has no
+  CAP_SYS_ADMIN and perf events are used.
+- CAP_SYS_RESOURCE if the hard limit is not high enough to allow
+  the large number of files that are possibly opened.
+"""
+
+    class PlainHelpFormatter(optparse.IndentedHelpFormatter):
+        def format_description(self, description):
+            if description:
+                return description + "\n"
+            else:
+                return ""
+
+    optparser = optparse.OptionParser(description=description_text,
+                                      formatter=PlainHelpFormatter())
+    optparser.add_option('-1', '--once', '--batch',
+                         action='store_true',
+                         default=False,
+                         dest='once',
+                         help='run in batch mode for one second',
+                         )
+    optparser.add_option('-l', '--log',
+                         action='store_true',
+                         default=False,
+                         dest='log',
+                         help='run in logging mode (like vmstat)',
+                         )
+    optparser.add_option('-t', '--tracepoints',
+                         action='store_true',
+                         default=False,
+                         dest='tracepoints',
+                         help='retrieve statistics from tracepoints',
+                         )
+    optparser.add_option('-d', '--debugfs',
+                         action='store_true',
+                         default=False,
+                         dest='debugfs',
+                         help='retrieve statistics from debugfs',
+                         )
+    optparser.add_option('-f', '--fields',
+                         action='store',
+                         default=None,
+                         dest='fields',
+                         help='fields to display (regex)',
+                         )
+    (options, _) = optparser.parse_args(sys.argv)
+    return options
+
+def get_providers(options):
+    providers = []
+
+    if options.tracepoints:
+        providers.append(TracepointProvider())
+    if options.debugfs:
+        providers.append(DebugfsProvider())
+    if len(providers) == 0:
+        providers.append(TracepointProvider())
+
+    return providers
+
+def check_access(options):
+    if not os.path.exists('/sys/kernel/debug'):
+        sys.stderr.write('Please enable CONFIG_DEBUG_FS in your kernel.\n')
+        sys.exit(1)
+
+    if not os.path.exists(PATH_DEBUGFS_KVM):
+        sys.stderr.write("Please make sure that debugfs is mounted and "
+                         "readable by the current user:\n"
+                         "('mount -t debugfs debugfs /sys/kernel/debug')\n"
+                         "Also ensure that the kvm modules are loaded.\n")
+        sys.exit(1)
+
+    if not os.path.exists(PATH_DEBUGFS_TRACING) and (options.tracepoints
+                                                     or not options.debugfs):
+        sys.stderr.write("Please enable CONFIG_TRACING in your kernel "
+                         "when using the option -t (default).\n"
+                         "If it is enabled, make {0} readable by the "
+                         "current user.\n"
+                         .format(PATH_DEBUGFS_TRACING))
+        if options.tracepoints:
+            sys.exit(1)
+
+        sys.stderr.write("Falling back to debugfs statistics!\n")
+        options.debugfs = True
+        sleep(5)
+
+    return options
+
+def main():
+    options = get_options()
+    options = check_access(options)
+    providers = get_providers(options)
+    stats = Stats(providers, fields=options.fields)
+
+    if options.log:
+        log(stats)
+    elif not options.once:
+        with Tui(stats) as tui:
+            tui.show_stats()
+    else:
+        batch(stats)
+
+if __name__ == "__main__":
+    main()
-- 
2.3.0



* [PATCH v2 2/5] MAINTAINERS: Add kvm tools
  2016-05-18 11:26 [PATCH v2 0/5] Improve KVM per VM monitoring Janosch Frank
  2016-05-18 11:26 ` [PATCH v2 1/5] tools: Add kvm_stat vm monitor script Janosch Frank
@ 2016-05-18 11:26 ` Janosch Frank
  2016-05-18 11:26 ` [PATCH v2 3/5] KVM: Create debugfs dir and stat files for each VM Janosch Frank
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Janosch Frank @ 2016-05-18 11:26 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, dan.carpenter, frankja

The new kvm subdirectory in tools contains kvm related scripts.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9c567a4..ab7f516 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6267,6 +6267,7 @@ F:	arch/*/include/asm/kvm*
 F:	include/linux/kvm*
 F:	include/uapi/linux/kvm*
 F:	virt/kvm/
+F:	tools/kvm/
 
 KERNEL VIRTUAL MACHINE (KVM) FOR AMD-V
 M:	Joerg Roedel <joro@8bytes.org>
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 3/5] KVM: Create debugfs dir and stat files for each VM
  2016-05-18 11:26 [PATCH v2 0/5] Improve KVM per VM monitoring Janosch Frank
  2016-05-18 11:26 ` [PATCH v2 1/5] tools: Add kvm_stat vm monitor script Janosch Frank
  2016-05-18 11:26 ` [PATCH v2 2/5] MAINTAINERS: Add kvm tools Janosch Frank
@ 2016-05-18 11:26 ` Janosch Frank
  2016-05-18 11:26 ` [PATCH v2 4/5] tools: kvm_stat: Introduce pid monitoring Janosch Frank
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Janosch Frank @ 2016-05-18 11:26 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, dan.carpenter, frankja

This patch adds a kvm debugfs subdirectory for each VM, which is named
after its pid and file descriptor. The directories contain the same
kind of files that are already in the kvm debugfs directory, but the
data exported through them is now VM specific.

This makes the debugfs kvm data a convenient alternative to the
tracepoints, which already provide per VM data. The debugfs data is
easy to read and has low overhead.
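
For illustration, a minimal Python sketch of how userspace could read
the new per VM files. It assumes that debugfs is mounted at
/sys/kernel/debug and that the directories are named "<pid>-<fd>" as
described above. read_per_vm_stats() and VM_DIR_RE are hypothetical
names and not part of this patch.

```python
import os
import re

# Directory naming taken from the patch description: "<pid>-<fd>".
VM_DIR_RE = re.compile(r'^(\d+)-(\d+)$')


def read_per_vm_stats(root='/sys/kernel/debug/kvm'):
    """Return {(pid, fd): {stat name: value}} for each per VM directory."""
    result = {}
    for entry in sorted(os.listdir(root)):
        match = VM_DIR_RE.match(entry)
        vm_dir = os.path.join(root, entry)
        # Skip the global stat files and anything not named <pid>-<fd>.
        if not match or not os.path.isdir(vm_dir):
            continue
        stats = {}
        for name in os.listdir(vm_dir):
            try:
                with open(os.path.join(vm_dir, name)) as stat_file:
                    stats[name] = int(stat_file.read())
            except (IOError, ValueError):
                # The VM may be destroyed between listdir() and open().
                continue
        result[(int(match.group(1)), int(match.group(2)))] = stats
    return result
```

Keying the result by (pid, fd) keeps VMs apart even when one process
creates more than one VM.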

CC: Dan Carpenter <dan.carpenter@oracle.com> [includes fixes by Dan Carpenter]
Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
---
 include/linux/kvm_host.h |   7 ++
 virt/kvm/kvm_main.c      | 187 ++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 184 insertions(+), 10 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a862176..08b8128 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -408,6 +408,8 @@ struct kvm {
 #endif
 	long tlbs_dirty;
 	struct list_head devices;
+	struct dentry *debugfs_dentry;
+	struct kvm_stat_data **debugfs_stat_data;
 };
 
 #define kvm_err(fmt, ...) \
@@ -985,6 +987,11 @@ enum kvm_stat_kind {
 	KVM_STAT_VCPU,
 };
 
+struct kvm_stat_data {
+	int offset;
+	struct kvm *kvm;
+};
+
 struct kvm_stats_debugfs_item {
 	const char *name;
 	int offset;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index fbd7698..5df6864 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -63,6 +63,9 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/kvm.h>
 
+/* Worst case buffer size needed for holding an integer. */
+#define ITOA_MAX_LEN 12
+
 MODULE_AUTHOR("Qumranet");
 MODULE_LICENSE("GPL");
 
@@ -100,6 +103,9 @@ static __read_mostly struct preempt_ops kvm_preempt_ops;
 struct dentry *kvm_debugfs_dir;
 EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
 
+static int kvm_debugfs_num_entries;
+static const struct file_operations *stat_fops_per_vm[];
+
 static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
 			   unsigned long arg);
 #ifdef CONFIG_KVM_COMPAT
@@ -542,6 +548,58 @@ static void kvm_free_memslots(struct kvm *kvm, struct kvm_memslots *slots)
 	kvfree(slots);
 }
 
+static void kvm_destroy_vm_debugfs(struct kvm *kvm)
+{
+	int i;
+
+	if (!kvm->debugfs_dentry)
+		return;
+
+	debugfs_remove_recursive(kvm->debugfs_dentry);
+
+	for (i = 0; i < kvm_debugfs_num_entries; i++)
+		kfree(kvm->debugfs_stat_data[i]);
+	kfree(kvm->debugfs_stat_data);
+}
+
+static int kvm_create_vm_debugfs(struct kvm *kvm, int fd)
+{
+	char dir_name[ITOA_MAX_LEN * 2];
+	struct kvm_stat_data *stat_data;
+	struct kvm_stats_debugfs_item *p;
+
+	if (!debugfs_initialized())
+		return 0;
+
+	snprintf(dir_name, sizeof(dir_name), "%d-%d", task_pid_nr(current), fd);
+	kvm->debugfs_dentry = debugfs_create_dir(dir_name,
+						 kvm_debugfs_dir);
+	if (!kvm->debugfs_dentry)
+		return -ENOMEM;
+
+	kvm->debugfs_stat_data = kcalloc(kvm_debugfs_num_entries,
+					 sizeof(*kvm->debugfs_stat_data),
+					 GFP_KERNEL);
+	if (!kvm->debugfs_stat_data)
+		return -ENOMEM;
+
+	for (p = debugfs_entries; p->name; p++) {
+		stat_data = kzalloc(sizeof(*stat_data), GFP_KERNEL);
+		if (!stat_data)
+			return -ENOMEM;
+
+		stat_data->kvm = kvm;
+		stat_data->offset = p->offset;
+		kvm->debugfs_stat_data[p - debugfs_entries] = stat_data;
+		if (!debugfs_create_file(p->name, 0444,
+					 kvm->debugfs_dentry,
+					 stat_data,
+					 stat_fops_per_vm[p->kind]))
+			return -ENOMEM;
+	}
+	return 0;
+}
+
 static struct kvm *kvm_create_vm(unsigned long type)
 {
 	int r, i;
@@ -647,6 +705,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
 	int i;
 	struct mm_struct *mm = kvm->mm;
 
+	kvm_destroy_vm_debugfs(kvm);
 	kvm_arch_sync_events(kvm);
 	spin_lock(&kvm_lock);
 	list_del(&kvm->vm_list);
@@ -2989,8 +3048,15 @@ static int kvm_dev_ioctl_create_vm(unsigned long type)
 	}
 #endif
 	r = anon_inode_getfd("kvm-vm", &kvm_vm_fops, kvm, O_RDWR | O_CLOEXEC);
-	if (r < 0)
+	if (r < 0) {
 		kvm_put_kvm(kvm);
+		return r;
+	}
+
+	if (kvm_create_vm_debugfs(kvm, r) < 0) {
+		kvm_put_kvm(kvm);
+		return -ENOMEM;
+	}
 
 	return r;
 }
@@ -3415,15 +3481,114 @@ static struct notifier_block kvm_cpu_notifier = {
 	.notifier_call = kvm_cpu_hotplug,
 };
 
+static int kvm_debugfs_open(struct inode *inode, struct file *file,
+			   int (*get)(void *, u64 *), int (*set)(void *, u64),
+			   const char *fmt)
+{
+	struct kvm_stat_data *stat_data = (struct kvm_stat_data *)
+					  inode->i_private;
+
+	/* The debugfs files are a reference to the kvm struct which
+	 * is still valid when kvm_destroy_vm is called.
+	 * To avoid the race between open and the removal of the debugfs
+	 * directory we test against the users count.
+	 */
+	if (!atomic_add_unless(&stat_data->kvm->users_count, 1, 0))
+		return -ENOENT;
+
+	if (simple_attr_open(inode, file, get, set, fmt)) {
+		kvm_put_kvm(stat_data->kvm);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int kvm_debugfs_release(struct inode *inode, struct file *file)
+{
+	struct kvm_stat_data *stat_data = (struct kvm_stat_data *)
+					  inode->i_private;
+
+	simple_attr_release(inode, file);
+	kvm_put_kvm(stat_data->kvm);
+
+	return 0;
+}
+
+static int vm_stat_get_per_vm(void *data, u64 *val)
+{
+	struct kvm_stat_data *stat_data = (struct kvm_stat_data *)data;
+
+	*val = *(u32 *)((void *)stat_data->kvm + stat_data->offset);
+
+	return 0;
+}
+
+static int vm_stat_get_per_vm_open(struct inode *inode, struct file *file)
+{
+	__simple_attr_check_format("%llu\n", 0ull);
+	return kvm_debugfs_open(inode, file, vm_stat_get_per_vm,
+				NULL, "%llu\n");
+}
+
+static const struct file_operations vm_stat_get_per_vm_fops = {
+	.owner   = THIS_MODULE,
+	.open    = vm_stat_get_per_vm_open,
+	.release = kvm_debugfs_release,
+	.read    = simple_attr_read,
+	.write   = simple_attr_write,
+	.llseek  = generic_file_llseek,
+};
+
+static int vcpu_stat_get_per_vm(void *data, u64 *val)
+{
+	int i;
+	struct kvm_stat_data *stat_data = (struct kvm_stat_data *)data;
+	struct kvm_vcpu *vcpu;
+
+	*val = 0;
+
+	kvm_for_each_vcpu(i, vcpu, stat_data->kvm)
+		*val += *(u32 *)((void *)vcpu + stat_data->offset);
+
+	return 0;
+}
+
+static int vcpu_stat_get_per_vm_open(struct inode *inode, struct file *file)
+{
+	__simple_attr_check_format("%llu\n", 0ull);
+	return kvm_debugfs_open(inode, file, vcpu_stat_get_per_vm,
+				 NULL, "%llu\n");
+}
+
+static const struct file_operations vcpu_stat_get_per_vm_fops = {
+	.owner   = THIS_MODULE,
+	.open    = vcpu_stat_get_per_vm_open,
+	.release = kvm_debugfs_release,
+	.read    = simple_attr_read,
+	.write   = simple_attr_write,
+	.llseek  = generic_file_llseek,
+};
+
+static const struct file_operations *stat_fops_per_vm[] = {
+	[KVM_STAT_VCPU] = &vcpu_stat_get_per_vm_fops,
+	[KVM_STAT_VM]   = &vm_stat_get_per_vm_fops,
+};
+
 static int vm_stat_get(void *_offset, u64 *val)
 {
 	unsigned offset = (long)_offset;
 	struct kvm *kvm;
+	struct kvm_stat_data stat_tmp = {.offset = offset};
+	u64 tmp_val;
 
 	*val = 0;
 	spin_lock(&kvm_lock);
-	list_for_each_entry(kvm, &vm_list, vm_list)
-		*val += *(u32 *)((void *)kvm + offset);
+	list_for_each_entry(kvm, &vm_list, vm_list) {
+		stat_tmp.kvm = kvm;
+		vm_stat_get_per_vm((void *)&stat_tmp, &tmp_val);
+		*val += tmp_val;
+	}
 	spin_unlock(&kvm_lock);
 	return 0;
 }
@@ -3434,15 +3599,16 @@ static int vcpu_stat_get(void *_offset, u64 *val)
 {
 	unsigned offset = (long)_offset;
 	struct kvm *kvm;
-	struct kvm_vcpu *vcpu;
-	int i;
+	struct kvm_stat_data stat_tmp = {.offset = offset};
+	u64 tmp_val;
 
 	*val = 0;
 	spin_lock(&kvm_lock);
-	list_for_each_entry(kvm, &vm_list, vm_list)
-		kvm_for_each_vcpu(i, vcpu, kvm)
-			*val += *(u32 *)((void *)vcpu + offset);
-
+	list_for_each_entry(kvm, &vm_list, vm_list) {
+		stat_tmp.kvm = kvm;
+		vcpu_stat_get_per_vm((void *)&stat_tmp, &tmp_val);
+		*val += tmp_val;
+	}
 	spin_unlock(&kvm_lock);
 	return 0;
 }
@@ -3463,7 +3629,8 @@ static int kvm_init_debug(void)
 	if (kvm_debugfs_dir == NULL)
 		goto out;
 
-	for (p = debugfs_entries; p->name; ++p) {
+	kvm_debugfs_num_entries = 0;
+	for (p = debugfs_entries; p->name; ++p, kvm_debugfs_num_entries++) {
 		if (!debugfs_create_file(p->name, 0444, kvm_debugfs_dir,
 					 (void *)(long)p->offset,
 					 stat_fops[p->kind]))
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 4/5] tools: kvm_stat: Introduce pid monitoring
  2016-05-18 11:26 [PATCH v2 0/5] Improve KVM per VM monitoring Janosch Frank
                   ` (2 preceding siblings ...)
  2016-05-18 11:26 ` [PATCH v2 3/5] KVM: Create debugfs dir and stat files for each VM Janosch Frank
@ 2016-05-18 11:26 ` Janosch Frank
  2016-05-18 11:26 ` [PATCH v2 5/5] tools: kvm_stat: Add documentation Janosch Frank
  2016-05-23 14:07 ` [PATCH v2 0/5] Improve KVM per VM monitoring Paolo Bonzini
  5 siblings, 0 replies; 11+ messages in thread
From: Janosch Frank @ 2016-05-18 11:26 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, dan.carpenter, frankja

Having stats for single VMs can help to pin down the problem of a VM
without the need to run other tools like perf.

The tracepoints already allowed pid level monitoring, but kvm_stat
didn't support it until now. This patch also adds support for the
newly implemented debugfs per VM monitoring.
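
As a rough sketch of the tracepoint side: perf_event_open(2) either
measures one thread on any CPU (pid > 0, cpu == -1) or all threads on
one CPU (pid == -1, cpu >= 0). The helper below only computes the
argument pairs for the two modes. perf_targets() and its proc_root
parameter are hypothetical and exist only for this illustration.

```python
import os


def perf_targets(pid, online_cpus, proc_root='/proc'):
    """Return the (pid, cpu) pairs to open, one per event group."""
    if pid > 0:
        # qemu starts a thread for each vcpu, so every thread of the
        # monitored process needs its own event group.
        task_dir = os.path.join(proc_root, str(pid), 'task')
        tids = sorted(int(tid) for tid in os.listdir(task_dir))
        return [(tid, -1) for tid in tids]
    # No pid filter: one event group per online cpu, all processes.
    return [(-1, cpu) for cpu in online_cpus]
```

With pid 0 the helper falls back to the system wide, per cpu
behaviour that kvm_stat had before this patch.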

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
---
 tools/kvm/kvm_stat/kvm_stat | 183 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 163 insertions(+), 20 deletions(-)

diff --git a/tools/kvm/kvm_stat/kvm_stat b/tools/kvm/kvm_stat/kvm_stat
index 769d884..a4643f5 100755
--- a/tools/kvm/kvm_stat/kvm_stat
+++ b/tools/kvm/kvm_stat/kvm_stat
@@ -365,12 +365,16 @@ class Group(object):
                                       os.read(self.events[0].fd, length))))
 
 class Event(object):
-    def __init__(self, name, group, trace_cpu, trace_point, trace_filter,
-                 trace_set='kvm'):
+    def __init__(self, name, group, trace_cpu, trace_pid, trace_point,
+                 trace_filter, trace_set='kvm'):
         self.name = name
         self.fd = None
-        self.setup_event(group, trace_cpu, trace_point, trace_filter,
-                         trace_set)
+        self.setup_event(group, trace_cpu, trace_pid, trace_point,
+                         trace_filter, trace_set)
+
+    def __del__(self):
+        if self.fd:
+            os.close(self.fd)
 
     def setup_event_attribute(self, trace_set, trace_point):
         id_path = os.path.join(PATH_DEBUGFS_TRACING, 'events', trace_set,
@@ -380,16 +384,16 @@ class Event(object):
         event_attr.config = int(open(id_path).read())
         return event_attr
 
-    def setup_event(self, group, trace_cpu, trace_point, trace_filter,
-                    trace_set):
+    def setup_event(self, group, trace_cpu, trace_pid, trace_point,
+                    trace_filter, trace_set):
         event_attr = self.setup_event_attribute(trace_set, trace_point)
 
         group_leader = -1
         if group.events:
             group_leader = group.events[0].fd
 
-        fd = perf_event_open(event_attr, -1, trace_cpu,
-                             group_leader, 0)
+        fd = perf_event_open(event_attr, trace_pid,
+                             trace_cpu, group_leader, 0)
         if fd == -1:
             err = ctypes.get_errno()
             raise OSError(err, os.strerror(err),
@@ -415,8 +419,7 @@ class TracepointProvider(object):
         self.group_leaders = []
         self.filters = get_filters()
         self._fields = self.get_available_fields()
-        self.setup_traces()
-        self.fields = self._fields
+        self._pid = 0
 
     def get_available_fields(self):
         path = os.path.join(PATH_DEBUGFS_TRACING, 'events', 'kvm')
@@ -431,11 +434,17 @@ class TracepointProvider(object):
         return fields
 
     def setup_traces(self):
-        cpus = get_online_cpus()
+        if self._pid > 0:
+            # Fetch list of all threads of the monitored pid, as qemu
+            # starts a thread for each vcpu.
+            path = os.path.join('/proc', str(self._pid), 'task')
+            groupids = walkdir(path)[1]
+        else:
+            groupids = get_online_cpus()
 
         # The constant is needed as a buffer for python libs, std
         # streams and other files that the script opens.
-        newlim = len(cpus) * len(self._fields) + 50
+        newlim = len(groupids) * len(self._fields) + 50
         try:
             softlim_, hardlim = resource.getrlimit(resource.RLIMIT_NOFILE)
 
@@ -449,7 +458,7 @@ class TracepointProvider(object):
         except ValueError:
             sys.exit("NOFILE rlimit could not be raised to {0}".format(newlim))
 
-        for cpu in cpus:
+        for groupid in groupids:
             group = Group()
             for name in self._fields:
                 tracepoint = name
@@ -461,11 +470,22 @@ class TracepointProvider(object):
                                    (self.filters[tracepoint][0],
                                     self.filters[tracepoint][1][sub]))
 
+                # From perf_event_open(2):
+                # pid > 0 and cpu == -1
+                # This measures the specified process/thread on any CPU.
+                #
+                # pid == -1 and cpu >= 0
+                # This measures all processes/threads on the specified CPU.
+                trace_cpu = groupid if self._pid == 0 else -1
+                trace_pid = int(groupid) if self._pid != 0 else -1
+
                 group.add_event(Event(name=name,
                                       group=group,
-                                      trace_cpu=cpu,
+                                      trace_cpu=trace_cpu,
+                                      trace_pid=trace_pid,
                                       trace_point=tracepoint,
                                       trace_filter=tracefilter))
+
             self.group_leaders.append(group)
 
     def available_fields(self):
@@ -489,6 +509,17 @@ class TracepointProvider(object):
                     if index != 0:
                         event.disable()
 
+    @property
+    def pid(self):
+        return self._pid
+
+    @pid.setter
+    def pid(self, pid):
+        self._pid = pid
+        self.group_leaders = []
+        self.setup_traces()
+        self.fields = self._fields
+
     def read(self):
         ret = defaultdict(int)
         for group in self.group_leaders:
@@ -500,6 +531,8 @@ class TracepointProvider(object):
 class DebugfsProvider(object):
     def __init__(self):
         self._fields = self.get_available_fields()
+        self._pid = 0
+        self.do_read = True
 
     def get_available_fields(self):
         return walkdir(PATH_DEBUGFS_KVM)[2]
@@ -512,16 +545,57 @@ class DebugfsProvider(object):
     def fields(self, fields):
         self._fields = fields
 
+    @property
+    def pid(self):
+        return self._pid
+
+    @pid.setter
+    def pid(self, pid):
+        if pid != 0:
+            self._pid = pid
+
+            vms = walkdir(PATH_DEBUGFS_KVM)[1]
+            if len(vms) == 0:
+                self.do_read = False
+
+            self.paths = filter(lambda x: "{}-".format(pid) in x, vms)
+
+        else:
+            self.paths = ['']
+            self.do_read = True
+
     def read(self):
-        def val(key):
-            return int(file(PATH_DEBUGFS_KVM + '/' + key).read())
-        return dict([(key, val(key)) for key in self._fields])
+        """Returns a dict with format:'file name / field -> current value'."""
+        results = {}
+
+        # If no debugfs filtering support is available, then don't read.
+        if not self.do_read:
+            return results
+
+        for path in self.paths:
+            for field in self._fields:
+                results[field] = results.get(field, 0) \
+                                 + self.read_field(field, path)
+
+        return results
+
+    def read_field(self, field, path):
+        """Returns the value of a single field from a specific VM."""
+        try:
+            return int(open(os.path.join(PATH_DEBUGFS_KVM,
+                                         path,
+                                         field))
+                       .read())
+        except IOError:
+            return 0
 
 class Stats(object):
-    def __init__(self, providers, fields=None):
+    def __init__(self, providers, pid, fields=None):
         self.providers = providers
+        self._pid_filter = pid
         self._fields_filter = fields
         self.values = {}
+        self.update_provider_pid()
         self.update_provider_filters()
 
     def update_provider_filters(self):
@@ -538,6 +612,10 @@ class Stats(object):
                                if wanted(key)]
             provider.fields = provider_fields
 
+    def update_provider_pid(self):
+        for provider in self.providers:
+            provider.pid = self._pid_filter
+
     @property
     def fields_filter(self):
         return self._fields_filter
@@ -547,6 +625,16 @@ class Stats(object):
         self._fields_filter = fields_filter
         self.update_provider_filters()
 
+    @property
+    def pid_filter(self):
+        return self._pid_filter
+
+    @pid_filter.setter
+    def pid_filter(self, pid):
+        self._pid_filter = pid
+        self.values = {}
+        self.update_provider_pid()
+
     def get(self):
         for provider in self.providers:
             new = provider.read()
@@ -603,9 +691,17 @@ class Tui(object):
         elif self.stats.fields_filter == r'^[^\(]*$':
             self.stats.fields_filter = None
 
+    def update_pid(self, pid):
+        self.stats.pid_filter = pid
+
     def refresh(self, sleeptime):
         self.screen.erase()
-        self.screen.addstr(0, 0, 'kvm statistics - summary', curses.A_BOLD)
+        if self.stats.pid_filter > 0:
+            self.screen.addstr(0, 0, 'kvm statistics - pid {0}'
+                               .format(self.stats.pid_filter),
+                               curses.A_BOLD)
+        else:
+            self.screen.addstr(0, 0, 'kvm statistics - summary', curses.A_BOLD)
         self.screen.addstr(2, 1, 'Event')
         self.screen.addstr(2, 1 + LABEL_WIDTH + NUMBER_WIDTH -
                            len('Total'), 'Total')
@@ -657,6 +753,37 @@ class Tui(object):
             except re.error:
                 continue
 
+    def show_vm_selection(self):
+        while True:
+            self.screen.erase()
+            self.screen.addstr(0, 0,
+                               'Show statistics for specific pid.',
+                               curses.A_BOLD)
+            self.screen.addstr(1, 0,
+                               'This might limit the shown data to the trace '
+                               'statistics.')
+
+            curses.echo()
+            self.screen.addstr(3, 0, "Pid [0 or pid]: ")
+            pid = self.screen.getstr()
+            curses.noecho()
+
+            try:
+                pid = int(pid)
+
+                if pid == 0:
+                    self.update_pid(pid)
+                    break
+                else:
+                    if not os.path.isdir(os.path.join('/proc/', str(pid))):
+                        continue
+                    else:
+                        self.update_pid(pid)
+                        break
+
+            except ValueError:
+                continue
+
     def show_stats(self):
         sleeptime = 0.25
         while True:
@@ -672,6 +799,8 @@ class Tui(object):
                     break
                 if char == 'f':
                     self.show_filter_selection()
+                if char == 'p':
+                    self.show_vm_selection()
             except KeyboardInterrupt:
                 break
             except curses.error:
@@ -764,6 +893,13 @@ Requirements:
                          dest='fields',
                          help='fields to display (regex)',
                          )
+    optparser.add_option('-p', '--pid',
+                        action='store',
+                        default=0,
+                        type=int,
+                        dest='pid',
+                        help='restrict statistics to pid',
+                        )
     (options, _) = optparser.parse_args(sys.argv)
     return options
 
@@ -810,8 +946,15 @@ def check_access(options):
 def main():
     options = get_options()
     options = check_access(options)
+
+    if (options.pid > 0 and
+        not os.path.isdir(os.path.join('/proc/',
+                                       str(options.pid)))):
+        sys.stderr.write('Did you use a (unsupported) tid instead of a pid?\n')
+        sys.exit('Specified pid does not exist.')
+
     providers = get_providers(options)
-    stats = Stats(providers, fields=options.fields)
+    stats = Stats(providers, options.pid, fields=options.fields)
 
     if options.log:
         log(stats)
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 5/5] tools: kvm_stat: Add documentation
  2016-05-18 11:26 [PATCH v2 0/5] Improve KVM per VM monitoring Janosch Frank
                   ` (3 preceding siblings ...)
  2016-05-18 11:26 ` [PATCH v2 4/5] tools: kvm_stat: Introduce pid monitoring Janosch Frank
@ 2016-05-18 11:26 ` Janosch Frank
  2016-05-23 14:07 ` [PATCH v2 0/5] Improve KVM per VM monitoring Paolo Bonzini
  5 siblings, 0 replies; 11+ messages in thread
From: Janosch Frank @ 2016-05-18 11:26 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, dan.carpenter, frankja

A lot of the code works with perf events, for which only sparse
documentation was available until 2012. With that information now at
hand, we can document what the code does.
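
One example is the group read format that the new Group.read()
docstring describes: a u64 event count followed by one u64 value per
event. A small sketch of decoding such a buffer, assuming
read_format is PERF_FORMAT_GROUP only, as the script sets it.
decode_group_read() is a hypothetical helper and not part of the
script.

```python
import struct


def decode_group_read(buf, names):
    """Decode a PERF_FORMAT_GROUP read buffer into {name: value}."""
    # First u64: the number of events in the group.
    nr, = struct.unpack_from('=Q', buf, 0)
    if nr != len(names):
        raise ValueError('expected {0} events, got {1}'
                         .format(len(names), nr))
    # Then one u64 counter value per event, in group order.
    values = struct.unpack_from('=' + 'Q' * nr, buf, 8)
    return dict(zip(names, values))
```

Unlike Group.read(), which skips the first eight bytes, the sketch
checks the count so that a mismatch between the buffer and the
expected events is noticed.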

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
---
 tools/kvm/kvm_stat/kvm_stat | 161 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 159 insertions(+), 2 deletions(-)

diff --git a/tools/kvm/kvm_stat/kvm_stat b/tools/kvm/kvm_stat/kvm_stat
index a4643f5..817f521 100755
--- a/tools/kvm/kvm_stat/kvm_stat
+++ b/tools/kvm/kvm_stat/kvm_stat
@@ -10,6 +10,15 @@
 #
 # This work is licensed under the terms of the GNU GPL, version 2.  See
 # the COPYING file in the top-level directory.
+"""The kvm_stat module outputs statistics about running KVM VMs
+
+Three different ways of output formatting are available:
+- as a top-like text ui
+- in a key -> value format
+- in an all keys, all values format
+
+The data is sampled from the KVM's debugfs entries and its perf events.
+"""
 
 import curses
 import sys
@@ -217,8 +226,10 @@ IOCTL_NUMBERS = {
 }
 
 class Arch(object):
-    """Class that encapsulates global architecture specific data like
-    syscall and ioctl numbers.
+    """Encapsulates global architecture specific data.
+
+    Contains the performance event open syscall and ioctl numbers, as
+    well as the VM exit reasons for the architecture it runs on.
 
     """
     @staticmethod
@@ -304,12 +315,22 @@ def parse_int_list(list_string):
 
 
 def get_online_cpus():
+    """Returns a list of cpu id integers."""
     with open('/sys/devices/system/cpu/online') as cpu_list:
         cpu_string = cpu_list.readline()
         return parse_int_list(cpu_string)
 
 
 def get_filters():
+    """Returns a dict of trace events, their filter ids and
+    the values that can be filtered.
+
+    Trace events can be filtered for special values by setting a
+    filter string via an ioctl. The string normally has the format
+    identifier==value. For each filter a new event will be created, to
+    be able to distinguish the events.
+
+    """
     filters = {}
     filters['kvm_userspace_exit'] = ('reason', USERSPACE_EXIT_REASONS)
     if ARCH.exit_reasons:
@@ -320,6 +341,14 @@ libc = ctypes.CDLL('libc.so.6', use_errno=True)
 syscall = libc.syscall
 
 class perf_event_attr(ctypes.Structure):
+    """Struct that holds the necessary data to set up a trace event.
+
+    For an extensive explanation see perf_event_open(2) and
+    include/uapi/linux/perf_event.h, struct perf_event_attr
+
+    All fields that are not initialized in the constructor are 0.
+
+    """
     _fields_ = [('type', ctypes.c_uint32),
                 ('size', ctypes.c_uint32),
                 ('config', ctypes.c_uint64),
@@ -340,6 +369,20 @@ class perf_event_attr(ctypes.Structure):
         self.read_format = PERF_FORMAT_GROUP
 
 def perf_event_open(attr, pid, cpu, group_fd, flags):
+    """Wrapper for the sys_perf_event_open() syscall.
+
+    Used to set up performance events, returns a file descriptor or -1
+    on error.
+
+    Attributes are:
+    - syscall number
+    - struct perf_event_attr *
+    - pid or -1 to monitor all pids
+    - cpu number or -1 to monitor all cpus
+    - The file descriptor of the group leader or -1 to create a group.
+    - flags
+
+    """
     return syscall(ARCH.sc_perf_evt_open, ctypes.pointer(attr),
                    ctypes.c_int(pid), ctypes.c_int(cpu),
                    ctypes.c_int(group_fd), ctypes.c_long(flags))
@@ -351,6 +394,8 @@ PATH_DEBUGFS_TRACING = '/sys/kernel/debug/tracing'
 PATH_DEBUGFS_KVM = '/sys/kernel/debug/kvm'
 
 class Group(object):
+    """Represents a perf event group."""
+
     def __init__(self):
         self.events = []
 
@@ -358,6 +403,22 @@ class Group(object):
         self.events.append(event)
 
     def read(self):
+        """Returns a dict with 'event name: value' for all events in the
+        group.
+
+        Values are read by reading from the file descriptor of the
+        event that is the group leader. See perf_event_open(2) for
+        details.
+
+        Read format for the used event configuration is:
+        struct read_format {
+            u64 nr; /* The number of events */
+            struct {
+                u64 value; /* The value of the event */
+            } values[nr];
+        };
+
+        """
         length = 8 * (1 + len(self.events))
         read_format = 'xxxxxxxx' + 'Q' * len(self.events)
         return dict(zip([event.name for event in self.events],
@@ -365,6 +426,7 @@ class Group(object):
                                       os.read(self.events[0].fd, length))))
 
 class Event(object):
+    """Represents a performance event and manages its life cycle."""
     def __init__(self, name, group, trace_cpu, trace_pid, trace_point,
                  trace_filter, trace_set='kvm'):
         self.name = name
@@ -373,10 +435,19 @@ class Event(object):
                          trace_filter, trace_set)
 
     def __del__(self):
+        """Closes the event's file descriptor.
+
+        As no python file object was created for the file descriptor,
+        python will not reference count the descriptor and will not
+        close it itself automatically, so we do it.
+
+        """
         if self.fd:
             os.close(self.fd)
 
     def setup_event_attribute(self, trace_set, trace_point):
+        """Returns an initialized ctype perf_event_attr struct."""
+
         id_path = os.path.join(PATH_DEBUGFS_TRACING, 'events', trace_set,
                                trace_point, 'id')
 
@@ -386,9 +457,19 @@ class Event(object):
 
     def setup_event(self, group, trace_cpu, trace_pid, trace_point,
                     trace_filter, trace_set):
+        """Sets up the perf event in Linux.
+
+        Issues the syscall to register the event in the kernel and
+        then sets the optional filter.
+
+        """
+
         event_attr = self.setup_event_attribute(trace_set, trace_point)
 
+        # First event will be group leader.
         group_leader = -1
+
+        # All others have to pass the leader's descriptor instead.
         if group.events:
             group_leader = group.events[0].fd
 
@@ -406,15 +487,33 @@ class Event(object):
         self.fd = fd
 
     def enable(self):
+        """Enables the trace event in the kernel.
+
+        Enabling the group leader makes reading counters from it and the
+        events under it possible.
+
+        """
         fcntl.ioctl(self.fd, ARCH.ioctl_numbers['ENABLE'], 0)
 
     def disable(self):
+        """Disables the trace event in the kernel.
+
+        Disabling the group leader makes reading all counters under it
+        impossible.
+
+        """
         fcntl.ioctl(self.fd, ARCH.ioctl_numbers['DISABLE'], 0)
 
     def reset(self):
+        """Resets the count of the trace event in the kernel."""
         fcntl.ioctl(self.fd, ARCH.ioctl_numbers['RESET'], 0)
 
 class TracepointProvider(object):
+    """Data provider for the stats class.
+
+    Manages the events/groups from which it acquires its data.
+
+    """
     def __init__(self):
         self.group_leaders = []
         self.filters = get_filters()
@@ -422,6 +521,20 @@ class TracepointProvider(object):
         self._pid = 0
 
     def get_available_fields(self):
+        """Returns a list of available events of format 'event name(filter
+        name)'.
+
+        All available events have directories under
+        /sys/kernel/debug/tracing/events/ which export information
+        about the specific event. Therefore, listing the dirs gives us
+        a list of all available events.
+
+        Some events like the vm exit reasons can be filtered for
+        specific values. To take account for that, the routine below
+        creates special fields with the following format:
+        event name(filter name)
+
+        """
         path = os.path.join(PATH_DEBUGFS_TRACING, 'events', 'kvm')
         fields = walkdir(path)[1]
         extra = []
@@ -434,6 +547,8 @@ class TracepointProvider(object):
         return fields
 
     def setup_traces(self):
+        """Creates all event and group objects needed to be able to retrieve
+        data."""
         if self._pid > 0:
             # Fetch list of all threads of the monitored pid, as qemu
             # starts a thread for each vcpu.
@@ -497,6 +612,7 @@ class TracepointProvider(object):
 
     @fields.setter
     def fields(self, fields):
+        """Enables/disables the (un)wanted events"""
         self._fields = fields
         for group in self.group_leaders:
             for index, event in enumerate(group.events):
@@ -515,12 +631,16 @@ class TracepointProvider(object):
 
     @pid.setter
     def pid(self, pid):
+        """Changes the monitored pid by setting new traces."""
         self._pid = pid
+        # The garbage collector will get rid of all Event/Group
+        # objects and open files after removing the references.
         self.group_leaders = []
         self.setup_traces()
         self.fields = self._fields
 
     def read(self):
+        """Returns 'event name: current value' for all enabled events."""
         ret = defaultdict(int)
         for group in self.group_leaders:
             for name, val in group.read().iteritems():
@@ -529,12 +649,19 @@ class TracepointProvider(object):
         return ret
 
 class DebugfsProvider(object):
+    """Provides data from the files that KVM creates in the kvm debugfs
+    folder."""
     def __init__(self):
         self._fields = self.get_available_fields()
         self._pid = 0
         self.do_read = True
 
     def get_available_fields(self):
+        """Returns a list of available fields.
+
+        The fields are all available KVM debugfs files.
+
+        """
         return walkdir(PATH_DEBUGFS_KVM)[2]
 
     @property
@@ -590,6 +717,12 @@ class DebugfsProvider(object):
             return 0
 
 class Stats(object):
+    """Manages the data providers and the data they provide.
+
+    It is used to set filters on the provider's data and collect all
+    provider data.
+
+    """
     def __init__(self, providers, pid, fields=None):
         self.providers = providers
         self._pid_filter = pid
@@ -599,6 +732,7 @@ class Stats(object):
         self.update_provider_filters()
 
     def update_provider_filters(self):
+        """Propagates fields filters to providers."""
         def wanted(key):
             if not self._fields_filter:
                 return True
@@ -613,6 +747,7 @@ class Stats(object):
             provider.fields = provider_fields
 
     def update_provider_pid(self):
+        """Propagates pid filters to providers."""
         for provider in self.providers:
             provider.pid = self._pid_filter
 
@@ -636,6 +771,8 @@ class Stats(object):
         self.update_provider_pid()
 
     def get(self):
+        """Returns a dict with field -> (value, delta to last value) of all
+        provider data."""
         for provider in self.providers:
             new = provider.read()
             for key in provider.fields:
@@ -651,6 +788,7 @@ LABEL_WIDTH = 40
 NUMBER_WIDTH = 10
 
 class Tui(object):
+    """Instruments curses to draw a nice text ui."""
     def __init__(self, stats):
         self.stats = stats
         self.screen = None
@@ -685,6 +823,7 @@ class Tui(object):
             curses.endwin()
 
     def update_drilldown(self):
+        """Sets or removes a filter that hides fields with parentheses."""
         if not self.stats.fields_filter:
             self.stats.fields_filter = r'^[^\(]*$'
 
@@ -692,9 +831,11 @@ class Tui(object):
             self.stats.fields_filter = None
 
     def update_pid(self, pid):
+        """Propagates pid selection to stats object."""
         self.stats.pid_filter = pid
 
     def refresh(self, sleeptime):
+        """Refreshes on-screen data."""
         self.screen.erase()
         if self.stats.pid_filter > 0:
             self.screen.addstr(0, 0, 'kvm statistics - pid {0}'
@@ -732,6 +873,11 @@ class Tui(object):
         self.screen.refresh()
 
     def show_filter_selection(self):
+        """Draws filter selection mask.
+
+        Asks for a valid regex and sets the fields filter accordingly.
+
+        """
         while True:
             self.screen.erase()
             self.screen.addstr(0, 0,
@@ -754,6 +900,11 @@ class Tui(object):
                 continue
 
     def show_vm_selection(self):
+        """Draws PID selection mask.
+
+        Asks for a pid until a valid pid or 0 has been entered.
+
+        """
         while True:
             self.screen.erase()
             self.screen.addstr(0, 0,
@@ -785,6 +936,7 @@ class Tui(object):
                 continue
 
     def show_stats(self):
+        """Refreshes the screen and processes user input."""
         sleeptime = 0.25
         while True:
             self.refresh(sleeptime)
@@ -807,6 +959,7 @@ class Tui(object):
                 continue
 
 def batch(stats):
+    """Prints statistics in a key, value format."""
     s = stats.get()
     time.sleep(1)
     s = stats.get()
@@ -815,6 +968,7 @@ def batch(stats):
         print '%-42s%10d%10d' % (key, values[0], values[1])
 
 def log(stats):
+    """Prints statistics as reiterating key block, multiple value blocks."""
     keys = sorted(stats.get().iterkeys())
     def banner():
         for k in keys:
@@ -835,6 +989,7 @@ def log(stats):
         line += 1
 
 def get_options():
+    """Returns processed program arguments."""
     description_text = """
 This script displays various statistics about VMs running under KVM.
 The statistics are gathered from the KVM debugfs entries and / or the
@@ -904,6 +1059,7 @@ Requirements:
     return options
 
 def get_providers(options):
+    """Returns a list of data providers depending on the passed options."""
     providers = []
 
     if options.tracepoints:
@@ -916,6 +1072,7 @@ def get_providers(options):
     return providers
 
 def check_access(options):
+    """Exits if the current user can't access all needed directories."""
     if not os.path.exists('/sys/kernel/debug'):
         sys.stderr.write('Please enable CONFIG_DEBUG_FS in your kernel.')
         sys.exit(1)
-- 
2.3.0
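
A minimal sketch of the drilldown filter documented in update_drilldown()
above: the regex r'^[^\(]*$' matches only field names that contain no
opening parenthesis, so the special "event name(filter name)" fields are
hidden while it is set. The helper visible_fields() and the sample field
names are hypothetical and only serve the illustration; applying the
filter with re.match is an assumption, since the hunk showing wanted() is
truncated. Written for Python 3, whereas the script in this patch targets
Python 2.

```python
import re

# Same pattern that update_drilldown() installs as the fields filter.
NO_PARENS_FILTER = r'^[^\(]*$'


def visible_fields(fields, fields_filter):
    """Hypothetical helper: return the fields that pass the regex filter.

    An empty or missing filter lets every field through.
    """
    if not fields_filter:
        return list(fields)
    return [name for name in fields if re.match(fields_filter, name)]


# Hypothetical field names; the text in parentheses stands for a filter value.
fields = ['kvm_exit', 'kvm_exit(EXAMPLE_REASON)', 'kvm_entry']

print(visible_fields(fields, NO_PARENS_FILTER))  # -> ['kvm_exit', 'kvm_entry']
print(visible_fields(fields, None))              # all three fields
```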



* Re: [PATCH v2 0/5] Improve KVM per VM monitoring
  2016-05-18 11:26 [PATCH v2 0/5] Improve KVM per VM monitoring Janosch Frank
                   ` (4 preceding siblings ...)
  2016-05-18 11:26 ` [PATCH v2 5/5] tools: kvm_stat: Add documentation Janosch Frank
@ 2016-05-23 14:07 ` Paolo Bonzini
  2016-05-24  8:17   ` Janosch Frank
  5 siblings, 1 reply; 11+ messages in thread
From: Paolo Bonzini @ 2016-05-23 14:07 UTC (permalink / raw)
  To: Janosch Frank, kvm; +Cc: dan.carpenter



On 18/05/2016 13:26, Janosch Frank wrote:
> This patchset introduces KVM per VM exit statistics monitoring via
> debugfs, as well as moves a tool to display VM statistics from qemu to
> tools/.
> 
> The new debugfs per VM statistics are an alternative to the already
> available VM tracepoints. They are easier to read and have low
> overhead.
> 
> The kvm_stat python script is moved to the kernel, as we can make sure
> here that the right version of the script is used with the right
> kernel version. This is not given for qemu, as it supports a wide
> range of linux kernel versions.

The patches look good, but you are not moving over the documentation.

I started looking at asciidoc, but perf's documentation Makefiles seem
to be overkill for our purposes.  (My prototype conversion of the docs
to asciidoc after my signature).

Perhaps pod2man could be an alternative (using QEMU's texi2pod to
build the kvm_stat.pod input)?

Paolo


kvm_stat(1)
===========

NAME
----
kvm_stat - Report KVM kernel module event counters

SYNOPSIS
--------
[verse]
'kvm_stat' [OPTION]...

DESCRIPTION
-----------
kvm_stat prints counts of KVM kernel module trace events.  These events signify
state transitions such as guest mode entry and exit.

This tool is useful for observing guest behavior from the host perspective.
Often conclusions about performance or buggy behavior can be drawn from the
output.

The set of KVM kernel module trace events may be specific to the kernel version
or architecture.  It is best to check the KVM kernel module source code for the
meaning of events.

Note that trace events are counted globally across all running guests.

OPTIONS
-------
-1::
--once::
--batch::
	run in batch mode for one second

-l::
--log::
	run in logging mode (like vmstat)

-t::
--tracepoints::
	retrieve statistics from tracepoints

-d::
--debugfs::
	retrieve statistics from debugfs

-f<fields>::
--fields=<fields>::
	fields to display (regex)

-h::
--help::
	show help message

SEE ALSO
--------
perf(1), trace-cmd(1)

AUTHOR
------
Stefan Hajnoczi <stefanha@redhat.com>
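
For illustration, the OPTIONS section above maps onto a command line
parser roughly as follows. This is a sketch using Python 3's argparse
under the assumption that each switch is a simple flag; it is not the
script's actual option handling, and build_parser() and the dest names
are hypothetical. The -h/--help option is provided by argparse itself.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options listed in the man page."""
    parser = argparse.ArgumentParser(
        prog='kvm_stat',
        description='Report KVM kernel module event counters')
    parser.add_argument('-1', '--once', '--batch', action='store_true',
                        dest='once',
                        help='run in batch mode for one second')
    parser.add_argument('-l', '--log', action='store_true', dest='log',
                        help='run in logging mode (like vmstat)')
    parser.add_argument('-t', '--tracepoints', action='store_true',
                        dest='tracepoints',
                        help='retrieve statistics from tracepoints')
    parser.add_argument('-d', '--debugfs', action='store_true',
                        dest='debugfs',
                        help='retrieve statistics from debugfs')
    parser.add_argument('-f', '--fields', dest='fields', default=None,
                        help='fields to display (regex)')
    return parser


# Example invocation: batch mode, debugfs only, restricted to one field.
options = build_parser().parse_args(['-1', '-d', '--fields=kvm_exit'])
print(options.once, options.debugfs, options.fields)
```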



* Re: [PATCH v2 0/5] Improve KVM per VM monitoring
  2016-05-23 14:07 ` [PATCH v2 0/5] Improve KVM per VM monitoring Paolo Bonzini
@ 2016-05-24  8:17   ` Janosch Frank
  2016-05-24  8:50     ` Paolo Bonzini
  0 siblings, 1 reply; 11+ messages in thread
From: Janosch Frank @ 2016-05-24  8:17 UTC (permalink / raw)
  To: Paolo Bonzini, kvm; +Cc: dan.carpenter, frankja

On 05/23/2016 04:07 PM, Paolo Bonzini wrote:
> On 18/05/2016 13:26, Janosch Frank wrote:
>> This patchset introduces KVM per VM exit statistics monitoring via
>> debugfs, as well as moves a tool to display VM statistics from qemu to
>> tools/.
>>
>> The new debugfs per VM statistics are an alternative to the already
>> available VM tracepoints. They are easier to read and have low
>> overhead.
>>
>> The kvm_stat python script is moved to the kernel, as we can make sure
>> here that the right version of the script is used with the right
>> kernel version. This is not given for qemu, as it supports a wide
>> range of linux kernel versions.
> 
> The patches look good, but you are not moving over the documentation.
> 
> I started looking at asciidoc, but perf's documentation Makefiles seem
> to be overkill for our purposes.  (My prototype conversion of the docs
> to asciidoc after my signature).
> 
> Perhaps pod2man could be an alternative (using QEMU's texi2pod to
> build the kvm_stat.pod input)?

The script already outputs a help text, which could be extended to the
man's text. I.e. I left it out on purpose.

Anyway, creating the manpage from asciidoc sources is a matter of having
the right packages and simply calling:
a2x --doctype manpage --format manpage file.txt

Works flawlessly with your example and is much more readable than the
texi source. I would give it a try and add it to the first patch if you
do not have any concerns?
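
As a rough sketch of the same step outside of make, the a2x call above
can be assembled and checked from Python 3. The helper names
a2x_command() and a2x_available() are hypothetical; only the a2x
arguments come from the command line quoted above. The sketch builds the
command and reports whether a2x is installed, but does not run it.

```python
import shutil


def a2x_command(source, a2x='a2x'):
    """Hypothetical helper: argv list for turning an asciidoc file into a man page."""
    return [a2x, '--doctype', 'manpage', '--format', 'manpage', source]


def a2x_available(a2x='a2x'):
    """Hypothetical helper: True if the a2x executable is found in PATH."""
    return shutil.which(a2x) is not None


cmd = a2x_command('kvm_stat.txt')
print(' '.join(cmd))
if not a2x_available():
    print('You need to install asciidoc for man pages')
```

Passing cmd to subprocess.check_call() would then perform the actual
conversion.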

The makefile would come down to (feel free to improve, I rarely write
makefiles):

include ../../scripts/Makefile.include
include ../../scripts/utilities.mak
BINDIR=usr/bin
MANDIR=usr/share/man/man1
ASCIIDOC=asciidoc
asciidoc_path := $(call get-executable,$(ASCIIDOC))

check-asciidoc:
ifeq ($(asciidoc_path),)
	$(error "You need to install asciidoc for man pages")
endif

man: check-asciidoc
	a2x --doctype manpage --format manpage kvm_stat.txt

install-man: man
	install -d -m 755 $(INSTALL_ROOT)/$(MANDIR)
	install -m 644 kvm_stat.1 $(INSTALL_ROOT)/$(MANDIR)

install:
	mkdir -p $(INSTALL_ROOT)/$(BINDIR)
	install -m 755 -p "kvm_stat" "$(INSTALL_ROOT)/$(BINDIR)/$(TARGET)"

> 
> Paolo
> 
> 
> kvm_stat(1)
> ===========
> 
> NAME
> ----
> kvm_stat - Report KVM kernel module event counters
> 
> SYNOPSIS
> --------
> [verse]
> 'kvm_stat' [OPTION]...
> 
> DESCRIPTION
> -----------
> kvm_stat prints counts of KVM kernel module trace events.  These events signify
> state transitions such as guest mode entry and exit.
> 
> This tool is useful for observing guest behavior from the host perspective.
> Often conclusions about performance or buggy behavior can be drawn from the
> output.
> 
> The set of KVM kernel module trace events may be specific to the kernel version
> or architecture.  It is best to check the KVM kernel module source code for the
> meaning of events.
> 
> Note that trace events are counted globally across all running guests.
Let's not forget this for the pid monitoring patch:

Events can be fetched globally for all guests or for single guests only.

> 
> OPTIONS
> -------
> -1::
> --once::
> --batch::
> 	run in batch mode for one second
> 
> -l::
> --log::
> 	run in logging mode (like vmstat)
> 
> -t::
> --tracepoints::
> 	retrieve statistics from tracepoints
> 
> -d::
> --debugfs::
> 	retrieve statistics from debugfs
Let's not forget this for the pid monitoring patch:

-p<pid>::
--pid=<pid>::
	restrict statistics to pid

> 
> -f<fields>::
> --fields=<fields>::
> 	fields to display (regex)
> 
> -h::
> --help::
> 
>   show help message
> 
> SEE ALSO
> --------
> perf(1), trace-cmd(1)
> 
> AUTHOR
> ------
> Stefan Hajnoczi <stefanha@redhat.com>
> 



* Re: [PATCH v2 0/5] Improve KVM per VM monitoring
  2016-05-24  8:17   ` Janosch Frank
@ 2016-05-24  8:50     ` Paolo Bonzini
  2016-05-24  8:57       ` Janosch Frank
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Bonzini @ 2016-05-24  8:50 UTC (permalink / raw)
  To: Janosch Frank, kvm; +Cc: dan.carpenter



On 24/05/2016 10:17, Janosch Frank wrote:
> The script already outputs a help text, which could be extended to the
> man's text. I.e. I left it out on purpose.
> 
> Anyway, creating the manpage from asciidoc sources is a matter of having
> the right packages and simply calling:
> a2x --doctype manpage --format manpage file.txt
> 
> Works flawlessly with your example and is much more readable than the
> texi source. I would give it a try and add it to the first patch if you
> do not have any concerns?

I will include the following as a separate patch:

---------------- 8< ---------------
From ccceb628a51e52a4b6384e6ef1cc9d88daf00a62 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Tue, 24 May 2016 10:41:15 +0200
Subject: [PATCH] tools: Add kvm_stat man page

Converted from the Texinfo source in QEMU to asciidoc.  The a2x
incantation was provided by Frank Janosch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tools/kvm/kvm_stat/Makefile     | 40 ++++++++++++++++++++++++--
 tools/kvm/kvm_stat/kvm_stat.txt | 62 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 100 insertions(+), 2 deletions(-)
 create mode 100644 tools/kvm/kvm_stat/kvm_stat.txt

diff --git a/tools/kvm/kvm_stat/Makefile b/tools/kvm/kvm_stat/Makefile
index c639b8d30688..5b1cba57e3b3 100644
--- a/tools/kvm/kvm_stat/Makefile
+++ b/tools/kvm/kvm_stat/Makefile
@@ -1,5 +1,41 @@
+include ../../scripts/Makefile.include
+include ../../scripts/utilities.mak
 BINDIR=usr/bin
+MANDIR=usr/share/man
+MAN1DIR=$(MANDIR)/man1
 
-install:
-	mkdir -p $(INSTALL_ROOT)/$(BINDIR)
+MAN1=kvm_stat.1
+
+A2X=a2x
+a2x_path := $(call get-executable,$(A2X))
+
+all: man
+
+ifneq ($(findstring $(MAKEFLAGS),s),s)
+  ifneq ($(V),1)
+     QUIET_A2X = @echo '  A2X     '$@;
+  endif
+endif
+
+%.1: %.txt
+ifeq ($(a2x_path),)
+	$(error "You need to install asciidoc for man pages")
+else
+	$(QUIET_A2X)$(A2X) --doctype manpage --format manpage $<
+endif
+
+clean:
+	rm -f $(MAN1)
+
+man: $(MAN1)
+
+install-man: man
+	install -d -m 755 $(INSTALL_ROOT)/$(MAN1DIR)
+	install -m 644 kvm_stat.1 $(INSTALL_ROOT)/$(MAN1DIR)
+
+install-tools:
+	install -d -m 755 $(INSTALL_ROOT)/$(BINDIR)
 	install -m 755 -p "kvm_stat" "$(INSTALL_ROOT)/$(BINDIR)/$(TARGET)"
+
+install: install-tools install-man
+.PHONY: all clean man install-tools install-man install
diff --git a/tools/kvm/kvm_stat/kvm_stat.txt b/tools/kvm/kvm_stat/kvm_stat.txt
new file mode 100644
index 000000000000..039dee80ddcb
--- /dev/null
+++ b/tools/kvm/kvm_stat/kvm_stat.txt
@@ -0,0 +1,62 @@
+kvm_stat(1)
+===========
+
+NAME
+----
+kvm_stat - Report KVM kernel module event counters
+
+SYNOPSIS
+--------
+[verse]
+'kvm_stat' [OPTION]...
+
+DESCRIPTION
+-----------
+kvm_stat prints counts of KVM kernel module trace events.  These events signify
+state transitions such as guest mode entry and exit.
+
+This tool is useful for observing guest behavior from the host perspective.
+Often conclusions about performance or buggy behavior can be drawn from the
+output.
+
+The set of KVM kernel module trace events may be specific to the kernel version
+or architecture.  It is best to check the KVM kernel module source code for the
+meaning of events.
+
+Note that trace events are counted globally across all running guests.
+
+OPTIONS
+-------
+-1::
+--once::
+--batch::
+	run in batch mode for one second
+
+-l::
+--log::
+	run in logging mode (like vmstat)
+
+-t::
+--tracepoints::
+	retrieve statistics from tracepoints
+
+-d::
+--debugfs::
+	retrieve statistics from debugfs
+
+-f<fields>::
+--fields=<fields>::
+	fields to display (regex)
+
+-h::
+--help::
+
+  show help message
+
+SEE ALSO
+--------
+'perf'(1), 'trace-cmd'(1)
+
+AUTHOR
+------
+Stefan Hajnoczi <stefanha@redhat.com>
-- 
1.8.3.1



* Re: [PATCH v2 0/5] Improve KVM per VM monitoring
  2016-05-24  8:50     ` Paolo Bonzini
@ 2016-05-24  8:57       ` Janosch Frank
  2016-05-24 10:09         ` Paolo Bonzini
  0 siblings, 1 reply; 11+ messages in thread
From: Janosch Frank @ 2016-05-24  8:57 UTC (permalink / raw)
  To: Paolo Bonzini, kvm; +Cc: dan.carpenter, frankja

On 05/24/2016 10:50 AM, Paolo Bonzini wrote:
> On 24/05/2016 10:17, Janosch Frank wrote:
>> The script already outputs a help text, which could be extended to the
>> man's text. I.e. I left it out on purpose.
>>
>> Anyway, creating the manpage from asciidoc sources is a matter of having
>> the right packages and simply calling:
>> a2x --doctype manpage --format manpage file.txt
>>
>> Works flawlessly with your example and is much more readable than the
>> texi source. I would give it a try and add it to the first patch if you
>> do not have any concerns?
> 
> I will include the following as a separate patch:

Great, thanks!
Don't forget the manpage changes for the pid monitoring, i.e. the
-p/--pid argument and the removal of the note.

> 
> ---------------- 8< ---------------
> From ccceb628a51e52a4b6384e6ef1cc9d88daf00a62 Mon Sep 17 00:00:00 2001
> From: Paolo Bonzini <pbonzini@redhat.com>
> Date: Tue, 24 May 2016 10:41:15 +0200
> Subject: [PATCH] tools: Add kvm_stat man page
> 
> Converted from the Texinfo source in QEMU to asciidoc.  The a2x
> incantation was provided by Frank Janosch.

s/Frank Janosch/Janosch Frank/
Janosch is my first name.




* Re: [PATCH v2 0/5] Improve KVM per VM monitoring
  2016-05-24  8:57       ` Janosch Frank
@ 2016-05-24 10:09         ` Paolo Bonzini
  0 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2016-05-24 10:09 UTC (permalink / raw)
  To: Janosch Frank, kvm; +Cc: dan.carpenter



On 24/05/2016 10:57, Janosch Frank wrote:
>> Converted from the Texinfo source in QEMU to asciidoc.  The a2x
>> incantation was provided by Frank Janosch.
> 
> s/Frank Janosch/Janosch Frank/
> Janosch is my first name.

Oops, sorry.  I must not be the first to make this mistake. :)

Paolo

