From: "Teres Alexis, Alan Previn" <alan.previn.teres.alexis@intel.com>
To: "Dong, Zhanjun" <zhanjun.dong@intel.com>,
"igt-dev@lists.freedesktop.org" <igt-dev@lists.freedesktop.org>
Subject: Re: [PATCH i-g-t v4 1/1] tests/intel/xe_exec_capture: Add xe_exec_capture test
Date: Wed, 13 Nov 2024 23:23:28 +0000
Message-ID: <97752c08c87631560b98dda96e560bbd14f14ae0.camel@intel.com>
In-Reply-To: <aa2b9b6f3286a4d150b4bc3f2cd18a6b87a9e327.camel@intel.com>
On Tue, 2024-11-12 at 23:48 -0800, Teres Alexis, Alan Previn wrote:
> On Tue, 2024-10-22 at 09:33 -0700, Zhanjun Dong wrote:
> > Test with GuC reset, check if devcoredump register dump is within the
> > + sync[0].flags &= ~DRM_XE_SYNC_FLAG_SIGNAL;
> > + sync[1].flags |= DRM_XE_SYNC_FLAG_SIGNAL;
> > + sync[1].handle = syncobjs[e];
> > +
> > + exec.exec_queue_id = exec_queues[e];
> > + exec.address = exec_addr;
> > + if (e != i)
> > + syncobj_reset(fd, &syncobjs[e], 1);
> > + xe_exec(fd, &exec);
> > + }
> alan: so this code is new to me - so to help me understand, can you explain
> if i got this right in terms of what above loop and below sync-waits are doing?:
> 1. so we send n_execs number of batches across n_exec_queues number of queues (on the same engine)
> 2. then, below, we wait on each one of those batches to start.
> 3. And finally we wait for the vm to unbind?
>
> However, i notice from the caller that you are only doing count batch of one and queue count of one.
> That said, i am kinda wondering that if the batch buffer is doing a spinner, and if this igt
> test could potentially be running alone without any other workloads, then how would a reset be
> triggered? i thought if we only have 1 workload running with nothing else being queued, then
> the GuC wont have a reason to preempt the work. I also thought we dont have heartbeat in xe...
> am i mistaken? how do we guarantee that an engine reset occurs?
>
> lets connect offline as i am a bit lost in some of these codes.
alan: as per the offline follow-up and after consulting others, we now know why this
batch does actually get reset despite being the only batch of the only queue of the
only engine running at the moment (here in this line of code) on the entire card:
it is the drm subsystem's scheduler that has a job timeout and will request the job to be
killed if it is not done within that time. We need to follow up on whether there is a way
to configure this timeout, because we don't want our test to be at the mercy of
different distros or customer systems that may have different timeouts, making our
test execution inconsistent. I assume the first set of syncobj-waits guarantees that
the task has started, so we could even use a very short timeout, like just one second,
since it is after all just a spinner (looping "start batch, store-dword", go again).
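To make the "configure the timeout" idea concrete, here is a minimal sketch of
saving, overriding and later restoring a timeout exposed as a decimal value in a
sysfs-style file. Everything here is an assumption for illustration: the helper
names (knob_read_u32, knob_write_u32, knob_override_u32) are hypothetical and not
part of IGT, and whether Xe exposes the scheduler job timeout at a path like the
one in the comment still needs to be confirmed as part of the follow-up.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helpers, not part of IGT. The knob path is an assumption;
 * on Xe it might be something like
 * /sys/class/drm/card0/device/tile0/gt0/engines/<class>/job_timeout_ms
 */

/* Read an unsigned decimal value from a sysfs-style file. Returns 0 or -errno. */
int knob_read_u32(const char *path, unsigned int *val)
{
	FILE *f;
	int ret = 0;

	assert(path && val);
	f = fopen(path, "r");
	if (!f)
		return -errno;
	if (fscanf(f, "%u", val) != 1)
		ret = -EINVAL;
	fclose(f);
	return ret;
}

/* Write an unsigned decimal value to a sysfs-style file. Returns 0 or -errno. */
int knob_write_u32(const char *path, unsigned int val)
{
	FILE *f;
	int ret = 0;

	assert(path);
	f = fopen(path, "w");
	if (!f)
		return -errno;
	if (fprintf(f, "%u\n", val) < 0)
		ret = -EIO;
	if (fclose(f) != 0 && !ret)
		ret = -errno;
	return ret;
}

/*
 * Save the current value into *saved, then write new_val.
 * The caller restores *saved with knob_write_u32() when the test ends.
 */
int knob_override_u32(const char *path, unsigned int new_val,
		      unsigned int *saved)
{
	int ret = knob_read_u32(path, saved);

	return ret ? ret : knob_write_u32(path, new_val);
}
```

In the test this would mean calling knob_override_u32(path, 1000, &saved) before
submitting the spinner and knob_write_u32(path, saved) on exit, so the reset
happens after roughly one second regardless of the system default. Any min/max
limits the kernel enforces on such a knob would also need checking.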
> > +
> > + for (i = 0; i < n_exec_queues && n_execs; i++)
> > + igt_assert(syncobj_wait(fd, &syncobjs[i], 1, INT64_MAX, 0,
> > + NULL));
> > + igt_assert(syncobj_wait(fd, &sync[0].handle, 1, INT64_MAX, 0, NULL));
> > +
> > + sync[0].flags |= DRM_XE_SYNC_FLAG_SIGNAL;
> > + xe_vm_unbind_async(fd, vm, 0, 0, addr, bo_size, sync, 1);
> > + igt_assert(syncobj_wait(fd, &sync[0].handle, 1, INT64_MAX, 0, NULL));
> > +
> > + syncobj_destroy(fd, sync[0].handle);
> > + for (i = 0; i < n_exec_queues; i++) {
> > + syncobj_destroy(fd, syncobjs[i]);
> > + xe_exec_queue_destroy(fd, exec_queues[i]);
> > + }
> > +
> > + munmap(data, bo_size);
> > + gem_close(fd, bo);
> > + xe_vm_destroy(fd, vm);
> > +}
> > +