Comments on the failures:

Within `xe_guc_sched_done_handler()` -> `handle_sched_done()` -> `deregister_exec_queue()`, this assert was triggered:

xe_gt_assert(guc_to_gt(guc), exec_queue_destroyed(q));

This means we have a bug in our exec queue life cycle. The code assumes `set_exec_queue_destroyed()` has already been called by the time we reach `deregister_exec_queue()`, but that was not the case here.
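
To illustrate the ordering I mean, here is a minimal standalone sketch. It is not the driver code: every `sketch_*` name and the flag value are made up for illustration, and the real state handling in xe is more involved. The deregister step returns false in the situation where the real code would hit the assert.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state bit; the real driver keeps its own set of flags. */
#define SKETCH_QUEUE_DESTROYED (1u << 0)

struct sketch_queue {
	uint32_t state;
};

/* Stand-in for marking the queue as destroyed. */
static void sketch_set_destroyed(struct sketch_queue *q)
{
	q->state |= SKETCH_QUEUE_DESTROYED;
}

/* Stand-in for the check used inside the assert. */
static bool sketch_is_destroyed(const struct sketch_queue *q)
{
	return (q->state & SKETCH_QUEUE_DESTROYED) != 0;
}

/*
 * Stand-in for the deregister step. It is only valid on a queue that was
 * already marked destroyed, so it returns false where the real code would
 * trip the assert.
 */
static bool sketch_deregister(const struct sketch_queue *q)
{
	return sketch_is_destroyed(q);
}
```

In the failing run we got the equivalent of calling `sketch_deregister()` on a queue for which `sketch_set_destroyed()` had not been called yet.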

Someone working on submission should take a look, but this is not related to the series under test, nor to SRIOV.


The jobs executed during the test were unable to finish within 5 seconds, which caused an engine reset. This is how this test normally works - in the "PASS" example it does the same thing.

However, for some reason, in the new run IGT considered the execution too long and failed the case.

This looks to me like a race, with both KMD and IGT having roughly the same timeout set. If KMD takes a bit longer in a specific run, the comparison on the IGT side - and therefore the test - will fail.
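
A rough sketch of the race I suspect, as standalone C. The 5 second job timeout is the value seen in this run; the IGT limit, the reset latency and the exact form of the comparison are assumptions on my side, and the function names are hypothetical, not taken from IGT or the KMD.

```c
#include <stdbool.h>

/* Job timeout seen in this run: 5 seconds. */
#define KMD_JOB_TIMEOUT_MS 5000

/*
 * Assumed model: the time the test observes is the KMD job timeout plus
 * however long the engine reset takes in that particular run.
 */
static int observed_ms(int reset_latency_ms)
{
	return KMD_JOB_TIMEOUT_MS + reset_latency_ms;
}

/* Assumed IGT-side check: the run must not take longer than the limit. */
static bool test_passes(int observed, int igt_limit_ms)
{
	return observed <= igt_limit_ms;
}
```

With the IGT limit equal to the KMD timeout, `test_passes(observed_ms(50), 5000)` is false even though the reset itself worked. Giving the IGT side some margin, e.g. `test_passes(observed_ms(50), 6000)`, makes the same run pass.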

Looks like a test issue to me. The test author should take a look. Regardless, this has nothing to do with the series under test, nor with SRIOV.


The machine did not return after RC6. It's very hard to tell anything more. Very likely a hardware issue, but who knows.

One thing is certain: this has nothing to do with the series under test, nor with SRIOV.


-Tomasz


On 08.10.2024 11:20, Patchwork wrote:
Series: drm/xe/vf: Post-migration recovery worker basis (rev4)
URL: https://patchwork.freedesktop.org/series/138935/
State: failure
Details: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-138935v4/index.html

CI Bug Log - changes from xe-2019-d6a4624817a46144a4dbc125c9371439edc82295_full -> xe-pw-138935v4_full

Summary

FAILURE

Serious unknown changes coming with xe-pw-138935v4_full absolutely need to be
verified manually.

If you think the reported changes have nothing to do with the changes
introduced in xe-pw-138935v4_full, please notify your bug team (I915-ci-infra@lists.freedesktop.org) to allow them
to document this new failure mode, which will reduce false positives in CI.

Participating hosts (4 -> 4)

No changes in participating hosts

Possible new issues

Here are the unknown changes that may have been introduced in xe-pw-138935v4_full:

IGT changes

Possible regressions

Suppressed

The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.

Known issues

Here are the changes found in xe-pw-138935v4_full that come from known issues:

IGT changes

Issues hit

Possible fixes

Warnings

{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).

Build changes

IGT_8054: 3f627b7fd48c6ab324ceaa80dd8cf0131292bf63 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
xe-2019-d6a4624817a46144a4dbc125c9371439edc82295: d6a4624817a46144a4dbc125c9371439edc82295
xe-pw-138935v4: 138935v4