From: Chris Wilson
Subject: Re: INSTDONE instrumentation (patch in progress)
Date: Sun, 31 Oct 2010 08:10:46 +0000
To: Peter Clifton, "intel-gfx@lists.freedesktop.org"
In-Reply-To: <1288491846.2886.12.camel@pcjc2lap>
List-Id: intel-gfx@lists.freedesktop.org

On Sun, 31 Oct 2010 02:24:06 +0000, Peter Clifton wrote:
> Hi guys,
>
> I thought I'd attach this, as it is now gone 2AM and I doubt I'm going
> to finish it "tonight". I was hoping to elicit some initial review to
> suggest whether the design was sane or not.

Been here, done something similar and ripped it out. What we want to do is
integrate an additional set of sampling with perf. The last time I looked,
it required a few extra lines to perf-core to allow devices to register
their own counters, and then you get the advantage of a reasonable
interface (plus the integration with CPU profiling and timecharts etc).

> I'd originally imagined tying the profiling lifetime to the execution /
> completion of individual batch-buffers, but for now I'd like to get it
> partly working like this, and perhaps develop some user-space program to
> view the results and see if they make sense.

You can use the current trace points to get timings for submit + complete
+ retire. What's missing here is being able to mark individual batch
buffers for profiling.
I think adding a new TIMING_FENCE ioctl (it could just be a fence ;-) that
captures various stats at the point of submission and completion and then
fires off an event (to be read on the /dev/dri/card0 fd) would be the more
flexible solution.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
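
To make the TIMING_FENCE idea a little more concrete, here is a rough
userspace-side sketch of how such events might be consumed once they have
been read from the DRM fd. Everything in it is an assumption for
illustration only: no TIMING_FENCE ioctl or event exists, and the event
number, the struct timing_fence_event layout and the helper names are all
hypothetical. The only thing borrowed from existing practice is the
type/length header that DRM events carry, which lets a reader skip events
it does not understand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical event number; not a real DRM event type. */
#define TIMING_FENCE_EVENT 0x80000001u

/* Local stand-in with the same type/length layout as a DRM event header. */
struct event_header {
    uint32_t type;    /* event type */
    uint32_t length;  /* total event size in bytes, header included */
};

/* Hypothetical payload: stats captured at submission and completion. */
struct timing_fence_event {
    struct event_header base;
    uint64_t user_data;    /* cookie identifying the marked batch buffer */
    uint64_t submit_ns;    /* timestamp taken at submission */
    uint64_t complete_ns;  /* timestamp taken at completion */
};

/* Time between submission and completion of the fenced batch. */
static uint64_t timing_fence_elapsed_ns(const struct timing_fence_event *ev)
{
    assert(ev != NULL);
    return ev->complete_ns - ev->submit_ns;
}

/*
 * Walk a buffer as it might be returned by read() on the DRM fd and copy
 * out up to max_out timing fence events, skipping any other event types.
 * Returns the number of events stored, or -1 if the buffer is malformed
 * (bad length field or a truncated trailing event).
 */
static int parse_timing_events(const uint8_t *buf, size_t len,
                               struct timing_fence_event *out, int max_out)
{
    size_t off = 0;
    int count = 0;

    while (off + sizeof(struct event_header) <= len) {
        struct event_header hdr;

        /* memcpy avoids unaligned access into the raw byte buffer. */
        memcpy(&hdr, buf + off, sizeof(hdr));
        if (hdr.length < sizeof(hdr) || hdr.length > len - off)
            return -1;

        if (hdr.type == TIMING_FENCE_EVENT &&
            hdr.length >= sizeof(struct timing_fence_event) &&
            count < max_out) {
            memcpy(&out[count], buf + off, sizeof(out[count]));
            count++;
        }
        off += hdr.length;
    }
    return off == len ? count : -1;
}
```

A profiler would call parse_timing_events() on each chunk read from the
fd and match user_data back to the batch buffer it marked. Carrying the
timestamps in the event itself means the numbers arrive already tied to
one batch, rather than having to be correlated from separate trace points.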