Date: Tue, 10 Mar 2015 13:53:51 +0100
From: Peter Zijlstra
To: Mark Rutland
Cc: Suzuki Poulose, Will Deacon, "linux@arm.linux.org.uk", "acme@kernel.org", "linux-kernel@vger.kernel.org", "linux-arm-kernel@lists.infradead.org", Punit Agrawal, Pawel Moll
Subject: Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs
Message-ID: <20150310125351.GD2896@worktop.programming.kicks-ass.net>
References: <1425905192-10509-1-git-send-email-suzuki.poulose@arm.com> <1425905192-10509-2-git-send-email-suzuki.poulose@arm.com> <20150310112723.GY2896@worktop.programming.kicks-ass.net> <20150310120521.GD28168@leverpostej>
In-Reply-To: <20150310120521.GD28168@leverpostej>

On Tue, Mar 10, 2015 at 12:05:21PM +0000, Mark Rutland wrote:
> On Tue, Mar 10, 2015 at 11:27:23AM +0000, Peter Zijlstra wrote:
> > On Mon, Mar 09, 2015 at 12:46:30PM +0000, Suzuki K. Poulose wrote:
> > > From: "Suzuki K. Poulose"
> > >
> > > Don't allow grouping hardware events from different PMUs
> > > (eg. CCI + CPU).
> >
> > Uhm, how does this work? If we have multiple hardware PMUs we'll stop
> > scheduling events after the first failed event schedule. This can leave
> > one of the PMUs severely under utilized.
>
> The problem here is group validation at pmu::event_init() time, not
> scheduling.

Maybe make that a little more explicit.

> We don't allow grouping across disparate HW PMUs because we can't
> provide group semantics anyway. Scheduling is not a problem in this case
> (unlike the big.LITTLE case I have a patch for [1]).

Right, I remember that; I was wondering if this was related.

> We have a CPU PMU and an "uncore" CCI PMU. You can't create task-bound
> events for the CCI, but you can create CPU-bound events for the CCI on
> the nominal CPU the CCI is monitored from.

Indeed, ok.

> The context check you added in c3c87e770458aa00 "perf: Tighten (and fix)
> the grouping condition" implicitly rejects groups that have CPU and CCI
> events (each event::ctx will be the relevant pmu::pmu_cpu_context and
> will differ), and this is sane -- you can't provide group semantics
> across disparate HW PMUs.

Agreed.

> Unfortunately that happens after we've done the
> event->pmu->event_init(event) dance on each event, and in our event_init
> function we try to verify the group is sane. In our verification we
> ignore SW events, but assume that all !SW events are for the CPU PMU.
> If you add a CPU event to a CCI group, that's not the case, and we use
> container_of on an unsuitable object, dereference garbage, invoke the
> eschaton and so on.

Indeed, on x86 we explicitly ignore everything not an x86_pmu event.

> It would be nicer if we could prevent this in the core so we're not
> reliant on every PMU driver doing the same verification. My initial
> thought was that seemed like unnecessary duplication of the ctx checking
> above, but if we're going to end up shoving it into several drivers
> anyway perhaps it's the lesser evil.

Again, agreed, that would be better and less error prone. But I'm not
entirely sure how to go about doing it :/

I'll have to go think about that; and conferences are not the best place
for that.

Suggestions on that are welcome of course ;)
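
For illustration only, here is a minimal, self-contained sketch of the driver-side check discussed above: compare event->pmu against the driver's own PMU before container_of() is applied, so a foreign hardware event is rejected instead of being misread as a CPU PMU event. This is not the code from the patch. The struct layouts, struct cpu_pmu, MAX_SIBLINGS and the is_software flag are simplified stand-ins made up for this sketch, and the counter accounting is deliberately naive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; NOT the real definitions. */
struct pmu {
	const char *name;
};

#define MAX_SIBLINGS 8	/* hypothetical bound, for this sketch only */

struct perf_event {
	struct pmu *pmu;			/* PMU this event belongs to */
	bool is_software;			/* stand-in for a software-event test */
	struct perf_event *group_leader;	/* points to itself for a leader */
	struct perf_event *siblings[MAX_SIBLINGS];
	size_t nr_siblings;
};

/* Hypothetical driver-private PMU that embeds the generic struct pmu. */
struct cpu_pmu {
	int num_counters;
	struct pmu pmu;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Returns false if @event can never be part of a group on @pmu.
 * Otherwise counts hardware events in *hw_count and returns whether
 * the group still fits in the available counters.
 */
static bool validate_event(struct pmu *pmu, struct perf_event *event,
			   int *hw_count)
{
	struct cpu_pmu *cpu_pmu;

	if (event->is_software)
		return true;	/* software events are ignored */

	/* Reject foreign hardware events before touching private data. */
	if (event->pmu != pmu)
		return false;

	/* Safe now: event->pmu is known to be embedded in a struct cpu_pmu. */
	cpu_pmu = container_of(event->pmu, struct cpu_pmu, pmu);
	(*hw_count)++;
	return *hw_count <= cpu_pmu->num_counters;
}

/*
 * Validate the leader, its existing siblings and the new event together.
 * Assumed to be called only from the CPU PMU's own event_init path, so
 * event->pmu is always the CPU PMU here.
 */
static bool validate_group(struct perf_event *event)
{
	struct perf_event *leader = event->group_leader;
	int hw_count = 0;
	size_t i;

	assert(leader != NULL);

	if (!validate_event(event->pmu, leader, &hw_count))
		return false;
	for (i = 0; i < leader->nr_siblings; i++) {
		if (!validate_event(event->pmu, leader->siblings[i], &hw_count))
			return false;
	}
	/* A new group leader has already been checked above. */
	if (event != leader && !validate_event(event->pmu, event, &hw_count))
		return false;
	return true;
}
```

With this shape, adding a CPU event to a group led by a CCI event makes validate_group() return false at the leader, before any container_of() on the CCI's struct pmu. Doing the equivalent comparison once in the core would avoid repeating it in every driver, which is the open question in this thread.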