From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
To: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: igt-dev@lists.freedesktop.org
Subject: Re: [igt-dev] [PATCH i-g-t 1/6] test/perf: Drop caches when closing perf stream
Date: Thu, 5 Mar 2020 01:29:50 +0200 [thread overview]
Message-ID: <8e2012f7-4e18-be2d-0226-458056fcd81b@intel.com> (raw)
In-Reply-To: <20200304230438.GA58678@orsosgc001.amr.corp.intel.com>
On 05/03/2020 01:04, Umesh Nerlige Ramappa wrote:
> On Thu, Mar 05, 2020 at 12:05:55AM +0200, Lionel Landwerlin wrote:
>> On 04/03/2020 19:51, Umesh Nerlige Ramappa wrote:
>>> On Wed, Mar 04, 2020 at 10:45:55AM +0200, Lionel Landwerlin wrote:
>>>> On 04/03/2020 00:57, Umesh Nerlige Ramappa wrote:
>>>>> On Tue, Mar 03, 2020 at 02:38:08PM -0800, Umesh Nerlige Ramappa
>>>>> wrote:
>>>>>> Running ./build/tests/perf will run all the perf subtests in a
>>>>>> sequence.
>>>>>> When running tests in a sequence, subsequent tests may not run
>>>>>> with a
>>>>>> clean slate. For resources that are lazily released, drop caches in
>>>>>> __perf_close.
>>>>>
>>>>> Hi Lionel, Chris,
>>>>>
>>>>> I notice an issue on TGL when running the entire suite of perf
>>>>> tests. In my setup, the polling test was failing with invalid
>>>>> reports being seen in the beginning of the OA buffer. This issue
>>>>> is seen more prominently with the newly added subtests which call
>>>>> perf_open and perf_close a couple of times (say
>>>>> blocking-with-interrupt).
>>>>>
>>>>> What I see in some runs is that the second test would result in a
>>>>> bunch of unlanded reports in the beginning of the OA buffer.
>>>>> Assuming that we are already waiting for the NOA config with a
>>>>> noa_wait bo, I tried to look into this further.
>>>>>
>>>>> free_oa_buffer is called to free the oa_buffer bo and this work is
>>>>> deferred by the driver. If a test is called before this free
>>>>> completes, we see the issue. Just to test out this theory, if I
>>>>> comment out the free_oa_buffer entirely, I see that the tests pass
>>>>> without any issues since new gtt memory is being allocated each time.
>>>>>
>>>>> I guess the deferred free and the new allocation of the OA buffer
>>>>> for subsequent test has something missing. Maybe TLBs not being
>>>>> dropped? I imagine the OA unit might write valid reports somewhere
>>>>> based on what it sees in the TLBs and cpu is looking for them
>>>>> elsewhere (until the free completes). Just a theory though. Let me
>>>>> know what you think.
>>>>>
>>>>> For now igt_drop_caches_set(DROP_FREED) is what is helping and
>>>>> hence this patch.
>>>>
>>>>
>>>> Hey Umesh,
>>>>
>>>>
>>>> I guess this could be fixed by this commit :
>>>>
>>>>
>>>> commit 4b4e973d5eb89244b67d3223b60f752d0479f253
>>>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Date: Mon Mar 2 08:57:57 2020 +0000
>>>>
>>>> drm/i915/perf: Reintroduce wait on OA configuration completion
>>>>
>>>> If you can give this commit a try or rebase on drm-tip it would be
>>>> great to confirm.
>>>
>>> I thought this commit was ensuring that the noa_wait is executed
>>> completely before we enable the OA buffer captures. That still does
>>> not explain why the issue goes away for me when I comment out
>>> free_oa_buffer.
>>
>>
>> If noa_wait is not waited upon, either from CPU or GPU, then we
>> enable OA while the configuration is not completely applied.
>>
>> Hence the invalid data at the beginning of the buffer.
>>
>>
>> Are you saying this commit didn't help?
>>
> No, I haven't tried it yet. I lost my reservation on the TGL machine
> :(, so waiting for another one.
>
> What I meant is that - not freeing the OA buffer (only for
> experimenting) results in a new gtt offset everytime we allocate the
> OA buffer. When I do this, I don't see any invalid OA reports. This is
> without the commit you mention above. If waiting for the NOA config to
> complete were indeed the issue, I should have seen it even with my
> experiment. Right?
I see, thanks that's a useful experiment.
Only thing I can think of would be HEAD/TAIL register writes that didn't
land before the OA unit was turned on.
I'll dig into the code.
Is this only on Gen12?
Thanks,
-Lionel
>
> Thanks,
> Umesh
>
>>
>> Thanks,
>>
>>
>> -Lionel
>>
>>
>>>
>>> Thanks,
>>> Umesh
>>>
>>>>
>>>> Otherwise we might need more digging to figure what's going on.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>>
>>>> -Lionel
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Umesh
>>>>>
>>>>>>
>>>>>> Signed-off-by: Umesh Nerlige Ramappa
>>>>>> <umesh.nerlige.ramappa@intel.com>
>>>>>> ---
>>>>>> tests/perf.c | 7 ++++++-
>>>>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/tests/perf.c b/tests/perf.c
>>>>>> index 5e818030..189c6aa1 100644
>>>>>> --- a/tests/perf.c
>>>>>> +++ b/tests/perf.c
>>>>>> @@ -244,6 +244,12 @@ __perf_close(int fd)
>>>>>> close(pm_fd);
>>>>>> pm_fd = -1;
>>>>>> }
>>>>>> +
>>>>>> + /* When running tests in a sequence, subsequent tests may
>>>>>> not run with a
>>>>>> + * clean slate. For resources that are lazily released,
>>>>>> cleanup here.
>>>>>> + */
>>>>>> + if (drm_fd >= 0 && !getgid() && !getuid())
>>>>>> + gem_quiescent_gpu(drm_fd);
>>>>>> }
>>>>>>
>>>>>> static int
>>>>>> @@ -3993,7 +3999,6 @@ test_rc6_disable(void)
>>>>>> igt_assert_eq(n_events_end - n_events_start, 0);
>>>>>>
>>>>>> __perf_close(stream_fd);
>>>>>> - gem_quiescent_gpu(drm_fd);
>>>>>>
>>>>>> n_events_start = rc6_residency_ms();
>>>>>> nanosleep(&(struct timespec){ .tv_sec = 1, .tv_nsec = 0 },
>>>>>> NULL);
>>>>>> --
>>>>>> 2.20.1
>>>>>>
>>>>>> _______________________________________________
>>>>>> igt-dev mailing list
>>>>>> igt-dev@lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/igt-dev
>>>>
>>>>
>>
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
next prev parent reply other threads:[~2020-03-04 23:29 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-03 22:38 [igt-dev] [PATCH 0/6] Add igt tests for perf OA interrupts Umesh Nerlige Ramappa
2020-03-03 22:38 ` [igt-dev] [PATCH i-g-t 1/6] test/perf: Drop caches when closing perf stream Umesh Nerlige Ramappa
2020-03-03 22:57 ` Umesh Nerlige Ramappa
2020-03-04 8:45 ` Lionel Landwerlin
2020-03-04 17:51 ` Umesh Nerlige Ramappa
2020-03-04 22:05 ` Lionel Landwerlin
2020-03-04 23:04 ` Umesh Nerlige Ramappa
2020-03-04 23:29 ` Lionel Landwerlin [this message]
2020-03-04 23:51 ` Umesh Nerlige Ramappa
2020-03-05 19:36 ` Umesh Nerlige Ramappa
2020-03-03 22:38 ` [igt-dev] [PATCH i-g-t 2/6] include/drm-uapi: Update i915_drm.h for perf OA APIs Umesh Nerlige Ramappa
2020-03-03 22:38 ` [igt-dev] [PATCH i-g-t 3/6] tests/perf: new tests for parameterized OA buffer polling Umesh Nerlige Ramappa
2020-03-03 22:38 ` [igt-dev] [PATCH i-g-t 4/6] tests/perf: new tests for OA interrupt Umesh Nerlige Ramappa
2020-03-03 22:38 ` [igt-dev] [PATCH i-g-t 5/6] tests/perf: add flush data ioctl test Umesh Nerlige Ramappa
2020-03-03 22:38 ` [igt-dev] [PATCH i-g-t 6/6] tools: Enable interrupt support in i915 perf recorder Umesh Nerlige Ramappa
2020-03-03 23:56 ` [igt-dev] ✓ Fi.CI.BAT: success for Add igt tests for perf OA interrupts Patchwork
2020-03-04 14:44 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8e2012f7-4e18-be2d-0226-458056fcd81b@intel.com \
--to=lionel.g.landwerlin@intel.com \
--cc=igt-dev@lists.freedesktop.org \
--cc=umesh.nerlige.ramappa@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox