From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753709AbcBZHhY (ORCPT ); Fri, 26 Feb 2016 02:37:24 -0500
Received: from mail-wm0-f67.google.com ([74.125.82.67]:36559 "EHLO
	mail-wm0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750869AbcBZHhX (ORCPT ); Fri, 26 Feb 2016 02:37:23 -0500
Date: Fri, 26 Feb 2016 08:37:18 +0100
From: Ingo Molnar
To: Marty McFadden
Cc: ak@linux.intel.com, andriy.shevchenko@linux.intel.com, bp@alien8.de,
	bp@suse.de, brgerst@gmail.com, dan.j.williams@intel.com,
	dyoung@redhat.com, hpa@zytor.com, linux@horizon.com,
	linux-kernel@vger.kernel.org, luto@kernel.org, mingo@redhat.com,
	pavel@ucw.cz, tglx@linutronix.de, viro@zeniv.linux.org.uk,
	x86@kernel.org, yu.c.chen@intel.com, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Jiri Olsa
Subject: Re: [PATCH 0/4] MSR: MSR: MSR Whitelist and Batch Introduction
Message-ID: <20160226073717.GA3884@gmail.com>
References: <1456444979-224547-1-git-send-email-mcfadden8@llnl.gov>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1456444979-224547-1-git-send-email-mcfadden8@llnl.gov>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org


* Marty McFadden wrote:

>
> This patch addresses the following two problems:
>   1. The current msr module grants all-or-nothing access to MSRs,
>      thus making user-level runtime performance adjustments
>      problematic, particularly for power-constrained HPC systems.
>
>   2. The current msr module requires a separate system call and the
>      acquisition of the preemption lock for each individual MSR access.
>      This overhead degrades performance of runtime tools that would
>      ideally sample multiple MSRs at high frequencies.

No, we really don't want to touch the old MSR code - it's a very opaque API with
various deep limitations.
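
For context on the per-access overhead quoted above: the existing msr character
device (/dev/cpu/N/msr) is read with one pread() per register, with the file
offset selecting the MSR address. A minimal sketch of that pattern follows. The
helper names msr_open and msr_read_many are hypothetical and not part of any
kernel or library API; the register addresses are the architectural ones for
TSC, MPERF and APERF.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Architectural MSR addresses (IA32_TIME_STAMP_COUNTER, IA32_MPERF, IA32_APERF). */
#define MSR_IA32_TSC   0x10
#define MSR_IA32_MPERF 0xE7
#define MSR_IA32_APERF 0xE8

/* Hypothetical helper: open the msr device node for one CPU, or return -1. */
int msr_open(int cpu)
{
    char path[64];

    snprintf(path, sizeof path, "/dev/cpu/%d/msr", cpu);
    return open(path, O_RDONLY);
}

/*
 * Hypothetical helper: read n MSRs through an already opened fd.
 * Every register costs its own pread() system call; the offset is the
 * MSR address and the transfer size is 8 bytes.
 * Returns the number of system calls issued (n), or -1 on the first failure.
 */
int msr_read_many(int fd, const uint32_t *regs, size_t n, uint64_t *out)
{
    assert(regs != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        ssize_t got = pread(fd, &out[i], sizeof out[i], (off_t)regs[i]);

        if (got != (ssize_t)sizeof out[i])
            return -1;
    }
    return (int)n;
}
```

Opening /dev/cpu/0/msr normally requires root and a loaded msr module. Each
loop iteration is a separate system call, which is the cost the patch
description complains about.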

What I'd like to see instead is to use a modern system monitoring interface - and
in fact that already happened in the last kernel release, we added the perf MSR
access methods via:

  commit b7b7c7821d932ba18ef6c8eafc8536066b4c2ef4
  Author: Andy Lutomirski
  Date:   Mon Jul 20 11:49:06 2015 -0400

    perf/x86: Add an MSR PMU driver

    This patch adds an MSR PMU to support free running MSR counters. Such
    as time and freq related counters includes TSC, IA32_APERF, IA32_MPERF
    and IA32_PPERF, but also SMI_COUNT.

    The events are exposed in sysfs for use by perf stat and other tools.
    The files are under /sys/devices/msr/events/

see arch/x86/cpu/perf/msr.c, or arch/x86/events/msr.c in the latest perf tree:

  git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf/core

For example with the perf ABIs 'batch access' of a group of MSRs is easy: a group
of events can be read or sampled at once. It can be done in a system-wide, per
task or per task hierarchy fashion, with cgroup management as well - it's a
modern API.

Right now the MSR PMU code is only at its first version, with only these few MSRs
exposed:

  enum perf_msr_id {
        PERF_MSR_TSC                    = 0,
        PERF_MSR_APERF                  = 1,
        PERF_MSR_MPERF                  = 2,
        PERF_MSR_PPERF                  = 3,
        PERF_MSR_SMI                    = 4,

        PERF_MSR_EVENT_MAX,
  };

but that can (and should) be expanded and more features can be added.

Thanks,

	Ingo
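
The 'batch access' described above can be sketched with perf_event_open(2):
open the MSR events as one group with PERF_FORMAT_GROUP, then fetch all of them
with a single read() on the group leader. This is an illustration, not code
from the perf tree. It assumes that the msr PMU publishes its type id at
/sys/bus/event_source/devices/msr/type and that attr.config takes the
perf_msr_id values from the enum above; all function names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Event ids copied from the quoted enum; using them as attr.config is an assumption. */
enum { MSR_EV_TSC = 0, MSR_EV_APERF = 1, MSR_EV_MPERF = 2 };

#define MAX_GROUP 8

/* Read the dynamic PMU type id for "msr" from sysfs (assumed path); -1 if absent. */
int msr_pmu_type(void)
{
    FILE *f = fopen("/sys/bus/event_source/devices/msr/type", "r");
    int type = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &type) != 1)
        type = -1;
    fclose(f);
    return type;
}

/*
 * Open n events as one group counting on the calling thread (pid 0, any CPU).
 * fds[0] becomes the group leader. Returns 0 on success, -1 on failure.
 */
int open_msr_group(int pmu_type, const uint64_t *configs, size_t n, int *fds)
{
    if (pmu_type < 0 || n == 0 || n > MAX_GROUP)
        return -1;

    for (size_t i = 0; i < n; i++) {
        struct perf_event_attr attr;
        int group_fd = (i == 0) ? -1 : fds[0];

        memset(&attr, 0, sizeof attr);
        attr.size = sizeof attr;
        attr.type = (uint32_t)pmu_type;
        attr.config = configs[i];
        attr.read_format = PERF_FORMAT_GROUP;

        fds[i] = (int)syscall(SYS_perf_event_open, &attr, (pid_t)0, -1,
                              group_fd, 0UL);
        if (fds[i] < 0) {
            while (i > 0)           /* undo the events opened so far */
                close(fds[--i]);
            return -1;
        }
    }
    return 0;
}

/*
 * Decode a PERF_FORMAT_GROUP read buffer: word 0 holds the event count,
 * followed by one 64-bit value per event. Returns the count, or -1 if the
 * buffer is too short or the caller's array is too small.
 */
int parse_group_read(const uint64_t *buf, size_t words, uint64_t *values, size_t max)
{
    assert(buf != NULL && values != NULL);

    if (words < 1)
        return -1;
    uint64_t nr = buf[0];
    if (nr > max || nr + 1 > words)
        return -1;
    for (uint64_t i = 0; i < nr; i++)
        values[i] = buf[1 + i];
    return (int)nr;
}

/* One read() on the leader returns every counter in the group. */
int read_msr_group(int leader_fd, uint64_t *values, size_t max)
{
    uint64_t buf[1 + MAX_GROUP];
    ssize_t got = read(leader_fd, buf, sizeof buf);

    if (got < (ssize_t)sizeof(uint64_t))
        return -1;
    return parse_group_read(buf, (size_t)got / sizeof(uint64_t), values, max);
}
```

A caller would pass, for example, { MSR_EV_APERF, MSR_EV_MPERF } to
open_msr_group() and then call read_msr_group(fds[0], ...) once per sampling
interval, so both counters come back from a single system call. Whether an
unprivileged process may open these events depends on the system's
perf_event_paranoid setting.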