Date: Mon, 6 Feb 2017 13:22:31 +0100
From: Ingo Molnar
To: Borislav Petkov
Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Robert Richter, Vince Weaver, lkml
Subject: Re: [RFC PATCH] perf/stat: Add --disable-hwdt
Message-ID: <20170206122231.GA9404@gmail.com>
References: <20170206121506.mtknwusus4djp2sx@pd.tnic>
In-Reply-To: <20170206121506.mtknwusus4djp2sx@pd.tnic>
X-Mailing-List: linux-kernel@vger.kernel.org

* Borislav Petkov wrote:

> Hi guys,
>
> so I've been tracing recently on an AMD F15h which has those funky counter
> constraints and am seeing this:
>
> # ./perf stat sleep 1
>
>  Performance counter stats for 'sleep 1':
>
>           0.749208      task-clock (msec)         #    0.001 CPUs utilized
>                  1      context-switches          #    0.001 M/sec
>                  0      cpu-migrations            #    0.000 K/sec
>                 54      page-faults               #    0.072 M/sec
>          1,122,815      cycles                    #    1.499 GHz
>            286,740      stalled-cycles-frontend   #   25.54% frontend cycles idle
>      <not counted>      stalled-cycles-backend      (0.00%)
>      ^^^^^^^^^^^^
>      <not counted>      instructions                (0.00%)
>      ^^^^^^^^^^^^
>      <not counted>      branches                    (0.00%)
>      <not counted>      branch-misses               (0.00%)
>
>        1.001550070 seconds time elapsed
>
>
> The problem is that the HW watchdog thing is already taking up a
> counter so when perf stat uses the default counters and when we reach
> stalled-cycles-backend, we run out of counters for the remaining events.
>
> So how about something like this:
>
> # ./perf stat --disable-hwdt sleep 1
>
>  Performance counter stats for 'sleep 1':
>
>           0.782552      task-clock (msec)         #    0.001 CPUs utilized
>                  1      context-switches          #    0.001 M/sec
>                  0      cpu-migrations            #    0.000 K/sec
>                 55      page-faults               #    0.070 M/sec
>          1,163,246      cycles                    #    1.486 GHz
>            293,598      stalled-cycles-frontend   #   25.24% frontend cycles idle
>            400,017      stalled-cycles-backend    #   34.39% backend cycles idle
>            676,505      instructions              #    0.58  insn per cycle
>                                                   #    0.59  stalled cycles per insn
>            133,822      branches                  #  171.007 M/sec
>              7,319      branch-misses             #    5.47% of all branches
>
>        1.001660058 seconds time elapsed
>
> We did explore other opportunities on IRC like sharing counters or
> making the HW WDT thing a 'soft' counter but all those are nasty and
> probably not really worth the trouble of touching perf core just so that
> this works.
>
> Besides, future generations don't have those constraints anymore so it
> is only F15h.
>
> Below is a silly patch as a syntactic sugar helper for perf stat. This
> is just an RFC anyway, I'll do it properly with fopen() if you're ok
> with the approach.

Looks sensible, and I'd in fact make this the new default behavior (if root runs
perf stat) - i.e. add a flag to re-enable it, for the rare case where we want to
debug a hard deadlock while running perf stat ...

Thanks,

	Ingo
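
For reference, a minimal sketch of what the fopen()-based variant mentioned in
the quoted mail could look like. This is not the RFC patch itself. It assumes
the "HW watchdog" is the NMI watchdog toggled through
/proc/sys/kernel/nmi_watchdog, and the helper names (sysctl_read_int,
sysctl_write_int, hwdt_disable, hwdt_restore) are hypothetical. The old value
is saved so it can be put back once the measurement is done.

```c
#include <stdio.h>

/* Assumption: the watchdog in question is controlled by this sysctl file. */
#define NMI_WATCHDOG_PATH "/proc/sys/kernel/nmi_watchdog"

/* Read one integer from a sysctl-style file. Returns 0 on success, -1 on error. */
static int sysctl_read_int(const char *path, int *val)
{
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", val) == 1)
		ret = 0;
	fclose(f);
	return ret;
}

/* Write one integer to a sysctl-style file. Returns 0 on success, -1 on error. */
static int sysctl_write_int(const char *path, int val)
{
	FILE *f = fopen(path, "w");
	int ret = -1;

	if (!f)
		return -1;
	if (fprintf(f, "%d\n", val) > 0)
		ret = 0;
	/* The write only reaches the kernel on flush, so check fclose() too. */
	if (fclose(f) != 0)
		ret = -1;
	return ret;
}

/* Hypothetical helper: remember the current setting, then write 0 (off). */
static int hwdt_disable(const char *path, int *saved)
{
	if (sysctl_read_int(path, saved) < 0)
		return -1;
	return sysctl_write_int(path, 0);
}

/* Hypothetical helper: put back whatever hwdt_disable() saved. */
static int hwdt_restore(const char *path, int saved)
{
	return sysctl_write_int(path, saved);
}
```

A caller would run hwdt_disable(NMI_WATCHDOG_PATH, &saved) before opening the
counters and hwdt_restore(NMI_WATCHDOG_PATH, saved) on exit. Writing the sysctl
needs root, which matches the "if root runs perf stat" condition above.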