From: Matthew Brost <matthew.brost@intel.com>
To: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Cc: <igt-dev@lists.freedesktop.org>, <stuart.summers@intel.com>,
<matthew.auld@intel.com>
Subject: Re: [PATCH i-g-t] tests/intel/xe_vm: Fix Sync Issue between Unbind and Hammer Thread
Date: Fri, 5 Apr 2024 21:56:37 +0000
Message-ID: <ZhBzlaOywBje1xeU@DUT025-TGLU.fm.intel.com>
In-Reply-To: <f845a357442bd285ffb26cd0288a97819cd9d470.1712350676.git.jagmeet.randhawa@intel.com>
On Fri, Apr 05, 2024 at 02:06:08PM -0700, Jagmeet Randhawa wrote:
> This patch addresses a critical synchronization issue
> between the "test_munmap_style_unbind" function and
> the "hammer_thread" function. Previously, "test_munmap_style_unbind"
> would proceed with its execution after launching
> "hammer_thread". However, the "hammer_thread" in its
> initial iteration encountered an error during the syncobj_wait()
> call halting its execution prematurely. So we never returned
> back to the "hammer_thread" from "test_munmap_style_unbind".
>
> We resolved this error by adding a syncobj_signal() call in our
> "hammer_thread" function, allowing "hammer_thread" to send the
> signal to "test_munmap_style_unbind" therefore ensuring the
> seamless operation of both threads and correct synchronization.
>
This explanation does make sense, see below.
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Stuart Summers <stuart.summers@intel.com>
> Signed-off-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
> ---
> VLK-54352 and VLK-55620
>
> tests/intel/xe_vm.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/tests/intel/xe_vm.c b/tests/intel/xe_vm.c
> index ecb2a783c..a25878cd8 100644
> --- a/tests/intel/xe_vm.c
> +++ b/tests/intel/xe_vm.c
> @@ -1153,6 +1153,7 @@ static void *hammer_thread(void *tdata)
> } else {
> exec.num_syncs = 1;
> err = __xe_exec(t->fd, &exec);
> + syncobj_signal(t->fd, &sync[0].handle, 1);
This doesn't look right.

This thread is doing execs as fast as possible, waiting on every 32nd
exec. The main thread (test_munmap_style_unbind) is modifying the VM's
bindings in a way that creates scheduling dependencies between the
threads. The KMD is designed to enforce these scheduling dependencies
while both threads run fully async. If syncobj_wait hangs, there is
likely a KMD or hardware issue here.

This code signals the syncobj from software on every 32nd exec, bypassing
the hardware / KMD signaling of the sync. This breaks the design of the
test and masks a likely KMD / hardware issue.
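
To illustrate the concern, here is a minimal standalone sketch. It is not
IGT code and makes no real ioctl calls. The names fake_syncobj, fake_exec,
and hammer, and the hang_at parameter, are hypothetical and exist only for
this illustration. The loop assumes, as in the quoted hunk, that only every
32nd exec carries a sync and is waited on. The "device" signals the sync
unless a hang is simulated, and the sw_signal flag plays the role of the
added syncobj_signal() call.

```c
#include <stdbool.h>

#define WAIT_INTERVAL 32	/* only every 32nd exec carries a sync */

/* Hypothetical stand-in for a syncobj: just a signaled flag. */
struct fake_syncobj {
	bool signaled;
};

/*
 * Hypothetical stand-in for the KMD / hardware completing exec i.
 * Execs at index hang_at or later never complete, so they never
 * signal the sync. A negative hang_at means the device never hangs.
 */
static void fake_exec(struct fake_syncobj *sync, int i, int hang_at)
{
	if (hang_at < 0 || i < hang_at)
		sync->signaled = true;
}

/*
 * Simplified hammer loop. Returns the index of the first exec whose
 * wait would fail (the sync was never signaled), or -1 if every wait
 * succeeded.
 */
static int hammer(int n_execs, int hang_at, bool sw_signal)
{
	struct fake_syncobj sync = { .signaled = false };
	int i;

	for (i = 0; i < n_execs; i++) {
		if (i % WAIT_INTERVAL)
			continue;	/* exec without a sync: nothing to wait on */

		fake_exec(&sync, i, hang_at);

		if (sw_signal)
			sync.signaled = true;	/* analogous to the added signal call */

		if (!sync.signaled)
			return i;	/* the wait would hang here */

		sync.signaled = false;	/* analogous to the reset after the wait */
	}

	return -1;
}
```

In this model, hammer(128, 40, false) returns 64: the first waited-on exec
after the simulated hang is the one that exposes it. hammer(128, 40, true)
returns -1, so the same hang goes unnoticed once software signals the sync
itself.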

Do the VLK failures occur on every engine instance / class?

Matt
> igt_assert(syncobj_wait(t->fd, &sync[0].handle, 1,
> INT64_MAX, 0, NULL));
> syncobj_reset(t->fd, &sync[0].handle, 1);
> --
> 2.25.1
>