Date: Wed, 27 Nov 2013 12:43:54 +0100
From: Peter Zijlstra
To: christophe leroy
Cc: Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, "linux-kernel@vger.kernel.org"
Subject: Re: perf events: how to implement TLB misses as SW event ?
Message-ID: <20131127114354.GZ10022@twins.programming.kicks-ass.net>
References: <52924132.2080209@c-s.fr>
In-Reply-To: <52924132.2080209@c-s.fr>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Nov 24, 2013 at 07:10:58PM +0100, christophe leroy wrote:
> Today in the perfevents subsystem it looks like DTLB/ITLB misses are
> implemented as HW counter only.
> On some processors, like PowerPC 8xx, there is no counter for that. However
> DTLB/ITLB misses are handled as exceptions via software, so we have an
> opportunity to implement a SW counter for that.
> What's the easiest/best way to implement it ?

The very easiest way would be to place a trace_event in your tlb miss
handler.

_HOWEVER_ I suspect you're quite limited in what you can do from an
actual tlb miss handler -- seeing as generating another miss from a
miss-handler might be a bad thing. So I suspect that'll not actually
work.

You'll need to educate me a bit on what you can (and can not) do from a
tlb miss handler on your platforms.

I'm assuming you have a limited fixed memory map to work from while in
the tlb fault handler -- you need something like that for such a handler
to run.

So one possible option would be to keep a free-running counter in
whatever fixed mapped memory region you have, and read from that. This
would allow you to implement a non-sampling software pmu for these
events.

If you have a means of raising an interrupt, or some other means of
causing a routine to run in a 'normal' context, you could implement a
more complete counter.

For example, you could decrement your fixed map counter on each tlb
miss, and when it hits 0, write some state -- e.g. the instruction
pointer of the instruction that caused the tlb miss -- into a related
field in the fixed map and raise the interrupt.

You can then use this mechanism from either your hardware pmu driver or
a separate pmu implementation as a 'regular' fixed purpose counter.
Since it's all software, you can even implement whatever filters your
hardware counters have to match their capabilities.
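
To make the counting scheme above a bit more concrete, here is a rough
user-space sketch in plain C. It is not kernel code and not tied to any
real platform: the struct layout, the tlb_miss_counter_* names and the
irq_pending flag (standing in for "raise the interrupt") are all made up
for illustration. A real miss handler would presumably do the equivalent
with a handful of loads and stores to the fixed mapped region.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state kept in the fixed mapped region (names are illustrative). */
struct tlb_miss_counter {
	uint64_t total;     /* free-running miss count, never reset */
	uint64_t period;    /* sample period; 0 means count only, never sample */
	uint64_t left;      /* misses remaining until the next sample */
	uint64_t sample_ip; /* IP of the miss that exhausted the period */
	int irq_pending;    /* stand-in for raising an interrupt */
};

/* Normal context: start sampling every 'period' misses. */
static void tlb_miss_counter_arm(struct tlb_miss_counter *c, uint64_t period)
{
	c->period = period;
	c->left = period;
	c->irq_pending = 0;
}

/* Miss handler path: only touches the fixed mapped state. */
static void tlb_miss_counter_hit(struct tlb_miss_counter *c, uint64_t ip)
{
	c->total++;
	if (c->period == 0)
		return;			/* non-sampling mode */
	if (--c->left == 0) {
		c->sample_ip = ip;	/* record which instruction missed */
		c->irq_pending = 1;	/* ask for a routine in normal context */
		c->left = c->period;	/* start the next period */
	}
}

/*
 * Normal context, run in response to the interrupt: consume the pending
 * sample. Returns 1 and stores the IP if a sample was pending, else 0.
 */
static int tlb_miss_counter_service(struct tlb_miss_counter *c, uint64_t *ip)
{
	assert(c && ip);
	if (!c->irq_pending)
		return 0;
	*ip = c->sample_ip;
	c->irq_pending = 0;
	return 1;
}
```

With period left at 0 only 'total' advances, which is the non-sampling
case; after tlb_miss_counter_arm(&c, N) every Nth miss leaves its IP in
sample_ip and sets the flag. This sketch ignores concurrency and what
happens if a second period expires before the first sample is consumed;
a real implementation would have to decide both.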