Igt-dev Archive on lore.kernel.org
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: imre.deak@intel.com
Cc: igt-dev@lists.freedesktop.org, Lee Shawn C <shawn.c.lee@intel.com>
Subject: Re: [igt-dev] [PATCH] tests: read engine name again before restore timeout value
Date: Thu, 12 Oct 2023 13:11:03 +0100
Message-ID: <09ef7bc6-6ab5-9b05-ab6f-9a8eee083c0a@linux.intel.com>
In-Reply-To: <ZSfZb6QTiUlBAPqw@ideak-desk>


On 12/10/2023 12:33, Imre Deak wrote:
> On Thu, Oct 12, 2023 at 09:53:44AM +0100, Tvrtko Ursulin wrote:
>>
>> On 11/10/2023 09:42, Lee Shawn C wrote:
>>> We encounter an unexpected error on a Chromebook device while
>>> running this test. The tool restores the GPU engine's timeout
>>> value but opens an incorrect file name (XR24 below). This is
>>> a workaround patch to avoid the problem until we find the
>>> root cause.
>>>
>>> openat(AT_FDCWD, "/sys/dev/char/226:0", O_RDONLY) = 12
>>> openat(12, "dev", O_RDONLY)             = 13
>>> read(13, "226:0\n", 1023)               = 6
>>> close(13)                               = 0
>>> openat(12, "engine", O_RDONLY)          = 13
>>> close(12)                               = 0
>>> openat(13, "XR24", O_RDONLY)            = -1 ENOENT (No such file or directory)
>>>
>>> Signed-off-by: Lee Shawn C <shawn.c.lee@intel.com>
>>> Issue: https://gitlab.freedesktop.org/drm/igt-gpu-tools/-/issues/147
>>> ---
>>>    tests/intel/kms_busy.c | 10 ++++++++--
>>>    1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/tests/intel/kms_busy.c b/tests/intel/kms_busy.c
>>> index 5b620658fb18..119e6f1652ce 100644
>>> --- a/tests/intel/kms_busy.c
>>> +++ b/tests/intel/kms_busy.c
>>> @@ -414,9 +414,15 @@ static void gpu_engines_init_timeouts(int fd, int max_engines,
>>>    	}
>>>    }
>>> -static void gpu_engines_restore_timeouts(int fd, int num_engines, const struct gem_engine_properties *props)
>>> +static void gpu_engines_restore_timeouts(int fd, int num_engines, struct gem_engine_properties *props)
>>>    {
>>> -	int i;
>>> +	const struct intel_execution_engine2 *e;
>>> +	int i = 0;
>>> +
>>> +	for_each_physical_engine(fd, e) {
>>> +		props[i].engine = e;
>>> +		i++;
>>> +	}
>>>    	for (i = 0; i < num_engines; i++)
>>>    		gem_engine_properties_restore(fd, &props[i]);
>>
>> By the look of it the bug is in gpu_engines_init_timeouts(). This pointer
>> assignment:
>>
>> 	for_each_physical_engine(fd, e) {
>> 		igt_assert(*num_engines < max_engines);
>>
>> 		props[*num_engines].engine = e;
>>
>> ^^^ e is on stack, in scope of for_each_physical_engine, so by the time
>> gpu_engines_restore_timeouts() runs it can legitimately point to garbage,
>> like XR24 in your example.
>>
>> Your workaround works, although strictly I don't think the order of engines
>> is guaranteed. Which is also moot since the same preempt_timeout and
>> heartbeat_interval are used for all.
>>
>> Nevertheless, the proper fix would be to make a copy of each engine
>> and store a pointer to that. It might be overkill, but it is up for
>> discussion I guess.
>>
>> Fixes: 9e635a1c5029 ("tests/kms_busy: Ensure GPU reset when waiting for a
>> new FB during modeset")
>>
>> So I'll be cheeky and add Imre and Juha-Pekka too.
> 
> ugh, thanks for catching this.
> 
> Would it work to save the engine class/instance instead in
> gpu_engines_init_timeouts(), and look up the engines using these in
> gpu_engines_restore_timeouts() ?

Not sure exactly what you have in mind. Modify struct
gem_engine_properties to not store the pointer to the engine? But
e->name is what it needs to restore. Store class:instance and then, on
restore, iterate all engines again to find the matching class:instance
and use the name from that local copy? Hm yes, that would work.
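
To make that concrete, here is a rough sketch of the class:instance idea.
It is only an illustration under assumptions: struct engine, struct
saved_engine, the engines[] table and all three helpers are hypothetical
stand-ins, not the IGT API, and the fixed table takes the place of a real
for_each_physical_engine() walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct intel_execution_engine2 (illustrative fields only). */
struct engine {
	char name[16];
	int engine_class;
	int instance;
};

/* What gets saved: plain values, no pointer into iterator storage. */
struct saved_engine {
	int engine_class;
	int instance;
	int preempt_timeout;
	int heartbeat_interval;
};

/* Hypothetical fixed engine list standing in for a for_each_physical_engine() walk. */
static const struct engine engines[] = {
	{ "rcs0", 0, 0 },
	{ "bcs0", 1, 0 },
	{ "vcs0", 2, 0 },
	{ "vcs1", 2, 1 },
};

/* Init side: record class:instance and the values to restore later. */
static struct saved_engine save_engine(const struct engine *e,
				       int preempt_timeout,
				       int heartbeat_interval)
{
	struct saved_engine s = {
		.engine_class = e->engine_class,
		.instance = e->instance,
		.preempt_timeout = preempt_timeout,
		.heartbeat_interval = heartbeat_interval,
	};

	return s;
}

/* Restore side: walk the engines again and match on class:instance. */
static const struct engine *lookup_engine(int engine_class, int instance)
{
	for (size_t i = 0; i < sizeof(engines) / sizeof(engines[0]); i++) {
		if (engines[i].engine_class == engine_class &&
		    engines[i].instance == instance)
			return &engines[i];
	}

	return NULL; /* engine is not in the list */
}

/* Name to use on restore, taken from the fresh lookup rather than a stale pointer. */
static const char *saved_engine_name(const struct saved_engine *s)
{
	const struct engine *e = lookup_engine(s->engine_class, s->instance);

	assert(e != NULL);
	return e->name;
}
```

The cost of this variant is a second walk over the engines at restore
time, in exchange for never holding a pointer past the iteration.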

Also, on a deeper look gem_exec_capture appears to have the same bug.

find_first_available_engine
   for_each_ctx_engine
     configure_hangs
       props.engine = e;

And i915_hangman AFAICT. Unless I am super confused..

I tried running it under Valgrind but it is not detecting anything, which
I guess is because the stale memory is on the stack and not the heap.

Hm, maybe a more elegant option is to change the struct to:

  struct gem_engine_properties {
-	const struct intel_execution_engine2 *engine;
+	struct intel_execution_engine2 engine;
	int preempt_timeout;
	int heartbeat_interval;
  };

So instead of storing a pointer a copy is made, which will include a 
copy of the name. (Since it is embedded in struct intel_execution_engine2.)

Then places which record engines would just need to:

- saved_params[num_engines].engine = e;
+ saved_params[num_engines].engine = *e;

No further churn then, I think..
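
As a self-contained illustration of the copy-by-value idea, something
along these lines should behave as described. This is a sketch under
assumptions, not the IGT code: struct engine, struct engine_properties and
record_engines() are hypothetical stand-ins, the timeout values are
placeholders, and the loop-local tmp mimics the short-lived storage that e
points into during iteration. The embedded member is deliberately
non-const here so that it can be assigned after the array is declared.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for struct intel_execution_engine2; the name is embedded. */
struct engine {
	char name[16];
	int engine_class;
	int instance;
};

/* Hypothetical stand-in for struct gem_engine_properties with an embedded copy. */
struct engine_properties {
	struct engine engine; /* a copy, not a pointer; non-const so it is assignable */
	int preempt_timeout;
	int heartbeat_interval;
};

/*
 * Record up to three made-up engines. Each engine lives only in loop-local
 * storage, so anything kept past the loop has to be a copy.
 */
static int record_engines(struct engine_properties *props, int max_engines)
{
	int n = 0;

	for (int i = 0; i < 3; i++) {
		struct engine tmp = { .engine_class = 2, .instance = i };
		const struct engine *e = &tmp;

		snprintf(tmp.name, sizeof(tmp.name), "vcs%d", i);

		assert(n < max_engines);

		props[n].engine = *e;             /* struct copy, includes the name */
		props[n].preempt_timeout = 0;     /* placeholder value */
		props[n].heartbeat_interval = 0;  /* placeholder value */
		n++;
	}

	return n;
}
```

After record_engines() returns, props[i].engine.name is still valid
because the bytes were copied into the array. Code that previously read
the name through the pointer would read it from the embedded member
instead.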

Regards,

Tvrtko
