From: "Teres Alexis, Alan Previn" <alan.previn.teres.alexis@intel.com>
To: "Dong, Zhanjun" <zhanjun.dong@intel.com>,
"igt-dev@lists.freedesktop.org" <igt-dev@lists.freedesktop.org>
Cc: "kamil.konieczny@linux.intel.com" <kamil.konieczny@linux.intel.com>
Subject: Re: [PATCH v9] tests/intel/xe_exec_capture: Add xe_exec_capture test
Date: Thu, 12 Dec 2024 19:54:58 +0000
Message-ID: <f48def4f702c6353e4cfd7012a7e606aaaea4762.camel@intel.com>
In-Reply-To: <c4b8b102d50c8cb81524c8d2b5335201c4bc4bb8.camel@intel.com>
Zhanjun, per offline chats with Kamil, it looks like we need to expand the
igt_fixture sections before and after the igt_subtest section: save the
per-engine timeouts in the initial fixture and restore them in the later
fixture, because the fixture section is not bypassed during an assert.
That's what I understood. That said, we will need another rev of this.
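
A minimal sketch of the save/restore pattern described above, written as
plain C so it can be compiled without IGT. It is not the patch's code: the
struct and the function names (engine_timeouts, fixture_setup,
fixture_teardown, subtest_body, run_with_fixtures) and NUM_ENGINES are
hypothetical stand-ins for the igt_fixture blocks, the igt_subtest body and
the per-engine job_timeout_ms sysfs entries. Only the 2000 ms value comes
from the patch. The point it illustrates is that the restore step runs
whether or not the subtest body fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_ENGINES 3            /* hypothetical engine count */
#define CAPTURE_JOB_TIMEOUT 2000 /* ms, value used by the patch */

/* Stand-in for the per-engine job_timeout_ms sysfs entries. */
struct engine_timeouts {
	uint64_t current_ms[NUM_ENGINES];
	uint64_t saved_ms[NUM_ENGINES];
};

/* Initial fixture: remember every engine's timeout, then shorten it. */
static void fixture_setup(struct engine_timeouts *t)
{
	for (size_t i = 0; i < NUM_ENGINES; i++) {
		t->saved_ms[i] = t->current_ms[i];
		t->current_ms[i] = CAPTURE_JOB_TIMEOUT;
	}
}

/* Closing fixture: put back whatever the initial fixture saved. */
static void fixture_teardown(struct engine_timeouts *t)
{
	for (size_t i = 0; i < NUM_ENGINES; i++)
		t->current_ms[i] = t->saved_ms[i];
}

/* Stand-in for a subtest body; returns 0 on pass, -1 on a failed check. */
static int subtest_body(const struct engine_timeouts *t, int force_failure)
{
	for (size_t i = 0; i < NUM_ENGINES; i++)
		if (t->current_ms[i] != CAPTURE_JOB_TIMEOUT)
			return -1;

	return force_failure ? -1 : 0;
}

/* Run setup, the subtest and teardown; teardown runs even on failure. */
static int run_with_fixtures(struct engine_timeouts *t, int force_failure)
{
	int ret;

	assert(t != NULL);

	fixture_setup(t);
	ret = subtest_body(t, force_failure);
	fixture_teardown(t);

	return ret;
}
```

Calling run_with_fixtures() with force_failure set to 1 returns -1 and still
leaves current_ms[] at the values it had before the call, which is the
behaviour the later fixture is meant to guarantee.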
On Wed, 2024-12-11 at 14:08 -0800, Teres Alexis, Alan Previn wrote:
> Just re-RB-ing after the recent change that sets the engine execution time manually before running the tests
> on each engine, in order to limit the execution time of this test:
>
> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
>
>
> On Fri, 2024-12-06 at 14:59 -0800, Dong, Zhanjun wrote:
> > Submit cmds to the GPU that result in a GuC engine reset and check that
> > devcoredump register dump is generated, by the GuC, and includes the
> > full register range.
> >
> > Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
> > Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
> > Cc: Kamil Konieczny <kamil.konieczny@linux.intel.com>
> > ---
> > Changes from prior revs:
> > v9:- Reduced job timeout to 2 seconds to speedup test
> > Add info print to show test is running on single/multiple GPU
> > v8:- Move change list below ---
>
>
> alan: I just reviewed the difference between the last two revs (diff of diff
> farther below).
> We hope this change addresses Kamil's concern by reducing the execution
> time dramatically. IIRC Zhanjun couldn't designate any subtest to declare pass
> or fail without ensuring multiple engines are executed on back to back, since the
> test needs to ensure that XE-KMD is catching the correct guc-error-dump for the
> exact batch on the exact engine we expect it to capture, amidst multiple back-to-back
> runs of different-batches-same-engine vs different-engines. (The test uses the ring
> buffer batch buffer address as a way to differentiate and determine this precisely.)
>
>
> 28a29
> > +#include "igt_sysfs.h"
> 37a39,40
> > +#define CAPTURE_JOB_TIMEOUT 2000
> > +#define JOB_TIMOUT_ENTRY "job_timeout_ms"
> 83a87,109
> > +static u64
> > +xe_sysfs_get_job_timeout_ms(int fd, struct drm_xe_engine_class_instance *eci)
> > +{
> > + int engine_fd = -1;
> > + u64 ret;
> > +
> > + engine_fd = xe_sysfs_engine_open(fd, eci->gt_id, eci->engine_class);
> > + ret = igt_sysfs_get_u64(engine_fd, JOB_TIMOUT_ENTRY);
> > + close(engine_fd);
> > +
> > + return ret;
> > +}
> > +
> > +static void xe_sysfs_set_job_timeout_ms(int fd, struct drm_xe_engine_class_instance *eci,
> > + u64 timeout)
> > +{
> > + int engine_fd = -1;
> > +
> > + engine_fd = xe_sysfs_engine_open(fd, eci->gt_id, eci->engine_class);
> > + igt_sysfs_set_u64(engine_fd, JOB_TIMOUT_ENTRY, timeout);
> > + close(engine_fd);
> > +}
> > +
>
> ...
>
> > xe_for_each_engine(fd, hwe) {
> > /*
> > * To test devcoredump register data, the test batch address is
> > * compared with the dump. Address bits 40 to 46 act as a
> > * context id, which starts at a random number and increases by 1
> > * per engine. This way, the address is unique for each
> > * engine and starts at a random number on each run.
> > */
> > const u64 addr = BASE_ADDRESS | ((u64)(engine_cid++ % CID_ADDRESS_MASK) <<
> > ADDRESS_SHIFT);
> 413a440
> > + u64 job_timeout = xe_sysfs_get_job_timeout_ms(fd, hwe);
> 417a445,447
> > + /* Reduce timeout value to speedup test */
> > + xe_sysfs_set_job_timeout_ms(fd, hwe, CAPTURE_JOB_TIMEOUT);
> > +
> 419a450,452
> > + /* Restore timeout value */
> > + xe_sysfs_set_job_timeout_ms(fd, hwe, job_timeout);
> > +
> 460a494,495
> > + igt_info("Running test on multiple GPU\n");
> > +
> 473a509
> > + igt_info("Running test on single GPU\n");
>
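
The address scheme described in the quoted comment (a context id in address
bits 40 to 46, bumped once per engine) can be sketched as below. This is an
illustration under stated assumptions, not the test's code: the quoted diff
does not show the definitions of BASE_ADDRESS, ADDRESS_SHIFT or
CID_ADDRESS_MASK, so the values here are guesses derived from the comment,
and the three helper functions are hypothetical names. The modulo by
CID_ADDRESS_MASK mirrors the expression in the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; the quoted diff does not show the real definitions. */
#define BASE_ADDRESS     0x1a0000ULL
#define ADDRESS_SHIFT    40
#define CID_ADDRESS_MASK 0x7FULL /* 7 bits: address bits 40..46 */

/* Build the batch address for one engine from a running context id. */
static uint64_t batch_addr_for_cid(uint64_t engine_cid)
{
	/* The base must not overlap the context id field. */
	assert((BASE_ADDRESS >> ADDRESS_SHIFT) == 0);

	return BASE_ADDRESS |
	       ((engine_cid % CID_ADDRESS_MASK) << ADDRESS_SHIFT);
}

/* Recover the context id field from an address found in a dump. */
static uint64_t cid_from_batch_addr(uint64_t addr)
{
	return (addr >> ADDRESS_SHIFT) & CID_ADDRESS_MASK;
}

/* Does an address read from a dump match the one submitted for this id? */
static int dump_matches_engine(uint64_t dumped_addr, uint64_t engine_cid)
{
	return dumped_addr == batch_addr_for_cid(engine_cid);
}
```

With these assumed values, consecutive context ids give distinct addresses,
so a dump captured for one engine's batch does not match the address
submitted for the next engine.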