On Mon, Oct 21, 2024 at 11:11:46PM +0200, Nirmoy Das wrote:
In case of parallel submissions multiple GuC id will point to the
same exec queue and on GT reset such exec queues will get restarted
multiple times which is not desirable.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2295
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
---
 drivers/gpu/drm/xe/xe_guc_submit.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 0b81972ff651..6aeb007eaf06 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -1784,8 +1784,13 @@ int xe_guc_submit_start(struct xe_guc *guc)
 
 	mutex_lock(&guc->submission_state.lock);
 	atomic_dec(&guc->submission_state.stopped);
-	xa_for_each(&guc->submission_state.exec_queue_lookup, index, q)
+	xa_for_each(&guc->submission_state.exec_queue_lookup, index, q) {
+		/* Skip restarting parallel queues */
+		if (exec_queue_enabled(q) && xe_exec_queue_is_parallel(q))
+			continue;

This doesn't look right as exec_queue_enabled can race here...
Ah right, just realized this happens async with the run_job
I think this should be...

if (q->guc->id != index)
	continue;
This looks much better. I will try it out and resend.
This way we only call guc_exec_queue_start once per parallel exec queue. Also, I think we need to add the same check to xe_guc_submit_stop.
I will resend with updated xe_guc_submit_stop()
thanks,
Nirmoy
Matt

+
 		guc_exec_queue_start(q);
+	}
 	mutex_unlock(&guc->submission_state.lock);
 
 	wake_up_all(&guc->ct.wq);
-- 
2.46.0
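
For readers following the thread, here is a minimal user-space sketch of why the suggested `q->guc->id != index` check visits each queue exactly once. It is a toy model, not driver code: `toy_queue`, `toy_register`, `toy_restart_all` and the fixed-size `lookup` array are hypothetical stand-ins for the exec queue, its registration and the `exec_queue_lookup` xarray. It also assumes a parallel queue occupies consecutive ids starting at its base id.

```c
#include <stddef.h>

#define MAX_IDS 8

/* Hypothetical stand-in for an exec queue; not the real xe structure. */
struct toy_queue {
	unsigned int base_id; /* plays the role of q->guc->id in the thread */
	unsigned int width;   /* number of ids owned; > 1 models a parallel queue */
	unsigned int starts;  /* how many times the queue has been restarted */
};

/* Toy lookup table: several ids may point at the same queue. */
static struct toy_queue *lookup[MAX_IDS];

/* Map every id in [base_id, base_id + width) to q (assumed layout). */
static int toy_register(struct toy_queue *q)
{
	if (q->width == 0 || q->base_id + q->width > MAX_IDS)
		return -1;

	for (unsigned int i = 0; i < q->width; i++)
		lookup[q->base_id + i] = q;

	return 0;
}

/*
 * Walk every id, as the xa_for_each() loop in the patch does.
 * With skip_secondary_ids set, only the entry whose index equals the
 * queue's base id triggers a restart, mirroring the check suggested
 * in the review.
 */
static void toy_restart_all(int skip_secondary_ids)
{
	for (unsigned int index = 0; index < MAX_IDS; index++) {
		struct toy_queue *q = lookup[index];

		if (!q)
			continue;
		if (skip_secondary_ids && q->base_id != index)
			continue;

		q->starts++; /* stands in for guc_exec_queue_start(q) */
	}
}
```

With a single-id queue at id 0 and a three-wide queue at ids 2..4, `toy_restart_all(0)` restarts the wide queue three times, while `toy_restart_all(1)` restarts each queue once. Unlike the `exec_queue_enabled()` test in the original patch, the check compares two values that do not change while the loop runs, so it does not depend on queue state that can be updated asynchronously.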