public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
To: "peterz@infradead.org" <peterz@infradead.org>
Cc: "Intel-gfx@lists.freedesktop.org" <Intel-gfx@lists.freedesktop.org>
Subject: Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
Date: Wed, 23 Aug 2017 17:51:38 +0000	[thread overview]
Message-ID: <1503481771.5383.31.camel@intel.com> (raw)
In-Reply-To: <20170822181755.GL32112@worktop.programming.kicks-ass.net>

On Tue, 2017-08-22 at 20:17 +0200, Peter Zijlstra wrote:
> On Sat, Aug 12, 2017 at 02:15:13AM +0000, Rogozhkin, Dmitry V wrote:
> > $ perf stat -e instructions,i915/rcs0-busy/ workload.sh
> > <... wrokload.sh output...>
> > 
> > Performance counter stats for 'workload.sh':
> >      1,204,616,268      instructions
> >                  0      i915/rcs0-busy/
> > 
> >        1.869169153 seconds time elapsed
> > 
> > As you can see instructions event works pretty well, i915/rcs0-busy/
> > doesn't.
> > 
> > I afraid that our current understanding of how PMU should work is not
> > fully correct.
> 
> Can we start off by explaining to me how this i915 stuff works. Because
> all I have is ~750 lines of patch without comments. Which sort of leaves
> me confused.

i915 PMU tries to expose a number of SW metrics to characterize i915
performance: i915 interrupt counts, time (counted in microseconds) GPU
spent in specific states like RC6, time  (counted in microseconds) GPU
engines were busy executing some tasks or were idle, etc. As for now
these metrics are i915 global and can't be attached to some particular
task, however, we may consider such metrics in the future. I think that
global PMU is the nice first step. I had a talk with Andi Kleen who
pointed that this is similar to how uncore PMU behaves and suggested I
should use 'perf stat -e <event> -a -C0' and probably adjust the code to
register i915 PMU with the perf_invalid_context. Which I did in the
patch which probably should be squashed to the patch from Tvrtko (I did
not want to intersect with his job). This change can be found here:
https://patchwork.freedesktop.org/patch/171953/. This patch makes 'perf
stat -e i915/rcs0-busy/' to error out and supports 'perf stat -e
i915/rcs0-busy/ -a -C0'. I still think I miss something since 'perf stat
-e i915/rcs0-busy/ -a' will give metrics multiplied by number of active
CPUs, but that's look closer to what is needed.

Anyhow, returning to the metrics i915 exposes. Some metrics are just
exposure of some counters supported already inside i915 PMU which do not
require any special sampling: at any given moment you can request the
counter value (these are interrupts counts, i915 power consumption).
Other metrics are similar to the ever-existing which I just described,
but they require activation for i915 to start to count them - this is
done on the event initialization (these are engine busy stats). Finally,
there is a third group which require sampling counting: they are needed
to be initialized and i915 pmu starts an internal timer to count these
values (these are some engines characteristics referenced in the code as
QUEUED, SEMA, WAIT).

I hope this clarifies. As well I hope I correctly covered high level
description of i915 PMU. I hope Chris will correct me if I was wrong
somewhere. Mind that another author of i915 PMU Tvrtko is for one more
week or so on vacation.


> 
> The above command tries to add an event 'i915/rcs0-busy/' to a task. How
> are i915 resource associated to any one particular task?

Currently in no way, they are global.

> 
> Is there a unique i915 resource for each task? If not, I don't see how
> per-task event can ever work as expected.

This depends on what you mean under "expected"? I see 2 variants:
1. Either this command line should return correct metric values
2. This command line should error out
Right now i915 PMU produces global metric, thus, I think 2nd variant is
true. Which I think is achievable if PMU is registered with
perf_invalid_context.

> 
> > I think so, because the way PMU entry points init(),
> > add(), del(), start(), stop(), read() are implemented do not correlate
> > with how many times they are called. I have counted them and here is the
> > result:
> > init()=19, add()=44310, del()=43900, start()=44534, stop()=0, read()=0
> > 
> > Which means that we are regularly attempt to start/stop timer and/or
> > busy stats calculations. Another thing which pay attention is that
> > read() was not called at all. How perf supposes to get counter value?
> 
> Both stop() and del() are supposed to update event->count. Only if we do
> sys_read() while the event is active (something perf-stat never does
> IIRC) will it issue pmu::read() to get an up-to-date number.
> 
> > Yet another thing, where we are supposed to initialize our internal
> > staff: numbers above are from single run and even init is called
> > multiple times? Where we are supposed to de-init our staff: each time on
> > del() - this hardly makes sense?
> 
> init happens in pmu::event_init(), that can set an optional
> event->destroy() function for de-init.
> 
> init() is called once for each event created, the above creates an
> inherited per-task event (I think, I lost track of what perf tool does)
> and 19 seems to suggest you did some 18 fork()/clone() calls after that,
> resulting in your 1 parent event with 18 children.

This behavior changed significantly once I have registered i915 PMU with
the perf_invalid_context notifying perf subsystem that out PMU produces
global metrics. Right now what I have is:
1. If I call "perf stat -e <event> -a -C0", then event is initialized
once
2. If I call "perf stat -e <event> -a", then event is initialized as
many times as there are logical CPUs. On my system there were 8.

And there is another problem associated with the "perf stat -e <event>
-a": since 8 events are initialized, I get metric values multiplied by
8. How I am supposed to fix that? I suspect that this is somehow related
to cpumask attribute exposed by some other PMUs, for example uncore.
They are doing something like:

static cpumask_t uncore_cpu_mask;

static DEVICE_ATTR(cpumask, S_IRUGO, uncore_get_attr_cpumask, NULL);

static struct attribute *uncore_pmu_attrs[] = {
	&dev_attr_cpumask.attr,
	NULL,
};

static struct attribute_group uncore_pmu_attr_group = {
	.attrs = uncore_pmu_attrs,
};

Should i915 follow this way as well?

> 
> > I should note that if perf will be issued with -I 10 option, then read()
> > is being called: init_c()=265, add_c()=132726, del_c()=131482,
> > start_c()=133412, stop()=0, read()=71. However, i915 counter is still 0.
> > I have tried to print counter values from within read() and these values
> > are non 0. Actually read() returns sequence of <non_zero>, 0, 0, 0, ...,
> > <no_zero> because with our add(), del() code we regularly start/stop our
> > counter and execution in read() follows different branches.
> > 
> > Thus, I think that right now we do not implement PMU correctly and do
> > not meet perf expectations from the PMU. Unfortunately, right now I have
> > no idea what are these expectations.
> 
> Please as to clarify how i915 works, I have no idea where to go.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2017-08-23 17:51 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-02 12:32 [RFC v2 00/10] i915 PMU and engine busy stats Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 01/10] drm/i915: Convert intel_rc6_residency_us to ns Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 02/10] drm/i915: Add intel_energy_uJ Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 03/10] drm/i915: Extract intel_get_cagf Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 04/10] drm/i915: Expose a PMU interface for perf queries Tvrtko Ursulin
2017-08-02 13:00   ` Tvrtko Ursulin
2017-08-12  2:15     ` Rogozhkin, Dmitry V
2017-08-22 18:17       ` Peter Zijlstra
2017-08-23 17:51         ` Rogozhkin, Dmitry V [this message]
2017-08-23 18:01           ` Peter Zijlstra
2017-08-23 18:40             ` Rogozhkin, Dmitry V
2017-08-23 18:04           ` Peter Zijlstra
2017-08-23 18:38             ` Rogozhkin, Dmitry V
2017-08-23 19:28               ` Peter Zijlstra
2017-08-23 18:05           ` Peter Zijlstra
2017-08-23 18:22           ` Peter Zijlstra
2017-08-23 19:00             ` Rogozhkin, Dmitry V
2017-08-23 19:33               ` Peter Zijlstra
2017-08-28 22:57             ` Rogozhkin, Dmitry V
2017-08-29  9:30               ` Peter Zijlstra
2017-08-29 19:16                 ` Rogozhkin, Dmitry V
2017-08-29 19:21                   ` Peter Zijlstra
2017-09-13 23:05                     ` Rogozhkin, Dmitry V
2017-09-20 20:14                       ` Rogozhkin, Dmitry V
2017-10-02 21:28                         ` Rogozhkin, Dmitry V
2017-08-02 12:32 ` [RFC 05/10] drm/i915/pmu: Suspend sampling when GPU is idle Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 06/10] drm/i915: Wrap context schedule notification Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 07/10] drm/i915: Engine busy time tracking Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 08/10] drm/i915: Export engine busy stats in debugfs Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 09/10] drm/i915/pmu: Wire up engine busy stats to PMU Tvrtko Ursulin
2017-08-02 12:39 ` [RFC 10/10] drm/i915: Gate engine stats collection with a static key Tvrtko Ursulin
2017-08-02 12:56 ` ✗ Fi.CI.BAT: warning for i915 PMU and engine busy stats (rev2) Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1503481771.5383.31.camel@intel.com \
    --to=dmitry.v.rogozhkin@intel.com \
    --cc=Intel-gfx@lists.freedesktop.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox