From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>,
Tvrtko Ursulin <tursulin@ursulin.net>,
igt-dev@lists.freedesktop.org
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Subject: Re: [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement
Date: Thu, 1 Feb 2018 16:58:21 +0000
Message-ID: <dd89c8b3-2a01-fad3-54aa-4b229245bf76@linux.intel.com>
In-Reply-To: <151750318740.28099.6565723462096485307@mail.alporthouse.com>

On 01/02/2018 16:39, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-01 16:26:58)
>>
>> On 01/02/2018 12:57, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-02-01 12:47:44)
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> In cases where we manually terminate the busy batch, we always want to
>>>> sample busyness while the batch is running, just before we will
>>>> terminate it, and not the other way around. This way we make the window
>>>> for unwanted idleness getting sampled smaller.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> ---
>>>> tests/perf_pmu.c | 28 +++++++++++++---------------
>>>> 1 file changed, 13 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
>>>> index 2f7d33414a53..bf16e5e8b1f9 100644
>>>> --- a/tests/perf_pmu.c
>>>> +++ b/tests/perf_pmu.c
>>>> @@ -146,10 +146,9 @@ single(int gem_fd, const struct intel_execution_engine2 *e, bool busy)
>>>> spin = NULL;
>>>>
>>>> slept = measured_usleep(batch_duration_ns / 1000);
>>>> - igt_spin_batch_end(spin);
>>>> -
>>>> val = pmu_read_single(fd);
>>>>
>>>> + igt_spin_batch_end(spin);
>>>
>>> But that's the wrong way round as we are measuring busyness, and the
>>> sampling should terminate as soon as the spin-batch ends, before we even
>>> read the PMU sample? For the timer sampler, it's lost in the noise.
>>>
>>> So the idea was to cancel the busyness asap so that the sampler stops
>>> updating before we even have to cross into the kernel for the PMU read.
>>
>> I don't follow the problem statement. This is how the code used to look
>> in many places:
>>
>> slept = measured_usleep(batch_duration_ns / 1000);
>> igt_spin_batch_end(spin);
>>
>> pmu_read_multi(fd[0], num_engines, val);
>>
>> The problem here is that there is a nondeterministic time gap, depending
>> on test execution speed, between requesting the batch to end and reading
>> the counter. This can add a random amount of idleness to the read value.
>
> But we are not measuring idleness, but busyness. The batch will end a

Yep, and that's why I removed some idleness from the busyness! :) Plus
I removed the time required to signal batch end from it as well.

> few tens of nanoseconds after the write hits memory. The desire is that
> time is zero so that the sleep exactly corresponds with the interval the
> batch was spinning (busy).
>
>> Another source of nondeterminism is how long it takes for the batch end
>> request to get picked up by the GPU. If that is slower in some cases, the
>> counter will drift from the expected value in the other direction,
>> overestimating busyness relative to sleep duration.
>>
>> Attempted improvement was simply to reverse the last two lines, so we
>> read the counter when we know it is busy (after the sleep), and then
>> request batch termination.
>>
>> This only leaves the scheduling delay between end of sleep and counter
>> read, which is smaller than end of sleep - batch end - counter read.
>>
>> These tests are not testing for edge conditions, just that the busy
>> engines are reported as busy, in various combinations, so that sounded
>> like a reasonable change.
>>
>> I hope I did not get confused here, it wouldn't be the first time in
>> these tests...
>
> Aiui, the PMU measures busyness, which is the duration of the spin-batch
> (or first enabling of the PMU event). The choice is between a context
> switch into the kernel to stop the counter, or a single write to memory
> to stop the batch.
>
> My bet is the write to memory will turn off the counter within 100ns,
> worst case including the interrupt processing. The PMU event stopping
> I estimate 100ns best case (including the kernel context switch, and
> don't ask how KPTI perturbs that switch as I don't know off hand how
> much worse it will get).

There is no PMU event stopping in the changed bits, so it is not being
measured either before or after.

> Now, the write to memory is asynchronous to the PMU event stopping. So
> if you write first, whichever is processed first (the kernel context
> switch + event stop, or the CS interrupt) causes the busy counter to
> cease. So best of both worlds?

I am just moving the two sample readouts closer together in time. The two
samples are the measure of how long we slept and the measure of PMU
busyness. There should be as little time as possible between the two
readouts. I don't see how ending the spinner, or anything else really,
belongs in this sandwich.
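
To illustrate the two orderings, here is a minimal sketch in plain C. It is
not the perf_pmu.c code: fake_spinner, fake_spin_start(), fake_spin_end(),
fake_busy_read(), measured_sleep_ns() and sample() are hypothetical names
made up for this example, and the "counter" is just a CLOCK_MONOTONIC delta
that freezes when the fake spinner is ended. It models neither the GPU nor
the real PMU, so it only shows what sits between the two readouts in each
ordering.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for a spin batch and its busyness counter. */
struct fake_spinner {
    uint64_t start_ns;  /* when the fake batch started spinning */
    uint64_t end_ns;    /* when it was ended; valid only if !running */
    bool running;
};

static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static void fake_spin_start(struct fake_spinner *s)
{
    s->start_ns = now_ns();
    s->end_ns = 0;
    s->running = true;
}

/* Ending the spinner freezes the fake counter at the current time. */
static void fake_spin_end(struct fake_spinner *s)
{
    if (s->running) {
        s->end_ns = now_ns();
        s->running = false;
    }
}

/* Busy time so far: still growing while running, frozen once ended. */
static uint64_t fake_busy_read(const struct fake_spinner *s)
{
    return (s->running ? now_ns() : s->end_ns) - s->start_ns;
}

/* Sleep for the requested time and return how long the sleep really took. */
static uint64_t measured_sleep_ns(uint64_t ns)
{
    struct timespec req = {
        .tv_sec = (time_t)(ns / 1000000000ull),
        .tv_nsec = (long)(ns % 1000000000ull),
    };
    uint64_t t0 = now_ns();

    while (nanosleep(&req, &req) == -1 && errno == EINTR)
        ;   /* interrupted: resume with the remaining time */

    return now_ns() - t0;
}

/*
 * Take both readouts (slept, busy) in one of the two orders discussed.
 * read_before_end == true is the order proposed in the patch.
 */
static void sample(bool read_before_end, uint64_t sleep_ns,
                   uint64_t *slept, uint64_t *busy)
{
    struct fake_spinner s;

    fake_spin_start(&s);
    *slept = measured_sleep_ns(sleep_ns);

    if (read_before_end) {
        *busy = fake_busy_read(&s);  /* second readout right after the first */
        fake_spin_end(&s);
    } else {
        fake_spin_end(&s);           /* spinner end sits between the readouts */
        *busy = fake_busy_read(&s);
    }
}
```

Calling sample(true, ...) and sample(false, ...) with the same sleep and
comparing busy - slept shows the extra time each ordering lets in. Since the
model has no GPU latency in it, it says nothing about which ordering is more
accurate on real hardware.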

Regards,
Tvrtko