On 9/3/2024 4:48 PM, Andrzej Hajda wrote:


On 29.08.2024 20:00, Nirmoy Das wrote:
Tests that cause page faults should wait for the exec queue to be
banned or to finish; otherwise, pending engine resets triggered by
ongoing page faults would cause subsequent tests to fail.

Not all execs will generate page faults, and in such cases reading the
ban property is not enough, but the signal should be either -EIO or 0,
so read that instead.

v2: specify timeout reason and iterate over exec_queues (Andrzej)
v3: increase timeout
v4: check for signal status to be -EIO/0.

Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Kamil Konieczny <kamil.konieczny@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1630
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
---
 tests/intel/xe_exec_fault_mode.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tests/intel/xe_exec_fault_mode.c b/tests/intel/xe_exec_fault_mode.c
index 1f1f1e50b..fa050d0dc 100644
--- a/tests/intel/xe_exec_fault_mode.c
+++ b/tests/intel/xe_exec_fault_mode.c
@@ -329,6 +329,17 @@ test_exec(int fd, struct drm_xe_engine_class_instance *eci,
 				igt_assert_eq(data[i].data, 0xc0ffee);
 	}
 
+	if ((flags & INVALID_FAULT)) {
+		for (i = 0; i < n_execs; i++) {
+			int ret;
+			int64_t timeout = NSEC_PER_SEC;
+
+			ret = __xe_wait_ufence(fd, &data[i].exec_sync, USER_FENCE_VALUE,
+					       exec_queues[i % n_exec_queues], &timeout);
+			igt_assert(ret == -EIO || ret ==0);

"ret ==0" - missing space.

I will fix it.


Btw, in theory we have a total wait time of n_execs * 1 second (128 sec???).


We will be in trouble if this ever happens for one store instruction :) I could add a 1 sec wait and then run the
loop, but I think that is not needed.

+		}
+	}
+

If I placed the change correctly in the code, it could be replaced by an if/else chain:
if ((flags & INVALID_FAULT)) {
    // your change
} else if (!(flags & INVALID_VA)) { ... }


That fits well, I will do that.

Up to you. Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>


Thanks,

Nirmoy

Regards Andrzej
 	for (i = 0; i < n_exec_queues; i++) {
 		xe_exec_queue_destroy(fd, exec_queues[i]);
 		if (bind_exec_queues[i])