From: "Maíra Canal" <mcanal@igalia.com>
To: Melissa Wen <mwen@igalia.com>, Iago Toral <itoral@igalia.com>,
Jose Maria Casanova Crespo <jmcasanova@igalia.com>,
Krzysztof Kozlowski <krzk+dt@kernel.org>,
Conor Dooley <conor+dt@kernel.org>,
Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: "Phil Elwell" <phil@raspberrypi.com>,
dri-devel@lists.freedesktop.org, devicetree@vger.kernel.org,
kernel-dev@igalia.com, stable@vger.kernel.org,
"Maíra Canal" <mcanal@igalia.com>
Subject: [PATCH v3 1/7] drm/v3d: Don't run jobs that have errors flagged in its fence
Date: Tue, 11 Mar 2025 15:13:43 -0300
Message-ID: <20250311-v3d-gpu-reset-fixes-v3-1-64f7a4247ec0@igalia.com>
In-Reply-To: <20250311-v3d-gpu-reset-fixes-v3-0-64f7a4247ec0@igalia.com>
The V3D driver still relies on `drm_sched_increase_karma()` and
`drm_sched_resubmit_jobs()` for resubmissions when a timeout occurs.
The function `drm_sched_increase_karma()` marks the job as guilty, while
`drm_sched_resubmit_jobs()` sets an error (-ECANCELED) in the DMA fence of
that guilty job.
Because of this, we must check whether the job’s DMA fence has been
flagged with an error before executing the job. Otherwise, the same guilty
job may be resubmitted indefinitely, causing repeated GPU resets.
This patch adds a check for an error on the job's fence to prevent running
a guilty job that was previously flagged when the GPU timed out.
Note that the CPU and CACHE_CLEAN queues do not require this check, as
their jobs are executed synchronously once the DRM scheduler starts them.
Cc: stable@vger.kernel.org
Fixes: d223f98f0209 ("drm/v3d: Add support for compute shader dispatch.")
Fixes: 1584f16ca96e ("drm/v3d: Add support for submitting jobs to the TFU.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
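For readers without the scheduler internals at hand, the behaviour described
in the commit message can be modelled in plain userspace C. This is only a
rough sketch under stated assumptions: `mock_fence`, `mock_job`,
`mock_job_run()`, `mock_timeout_and_resubmit()` and `mock_demo()` are
hypothetical stand-ins invented for illustration, not the kernel's types or
the driver's functions. The `error` field plays the role of
`s_fence->finished.error`, and a NULL return models a skipped job.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a DMA fence: only the error field is modelled. */
struct mock_fence {
	int error;	/* 0, or a negative errno such as -ECANCELED */
};

/* Hypothetical stand-in for a scheduler job. */
struct mock_job {
	struct mock_fence finished;	/* models s_fence->finished */
	int runs;			/* times the job reached the "hardware" */
};

static struct mock_fence mock_hw_fence;

/* Models a run_job callback: NULL means the job was not executed. */
static struct mock_fence *mock_job_run(struct mock_job *job)
{
	/* The check this patch adds: skip jobs already flagged with an error. */
	if (job->finished.error)
		return NULL;

	job->runs++;
	return &mock_hw_fence;
}

/* Models the timeout path: flag the guilty job, then resubmit it. */
static struct mock_fence *mock_timeout_and_resubmit(struct mock_job *job)
{
	job->finished.error = -ECANCELED;
	return mock_job_run(job);
}

/* Runs a job once, then times it out twice; returns the execution count. */
int mock_demo(void)
{
	struct mock_job job = { .finished = { .error = 0 }, .runs = 0 };
	struct mock_fence *first = mock_job_run(&job);
	struct mock_fence *second = mock_timeout_and_resubmit(&job);
	struct mock_fence *third = mock_timeout_and_resubmit(&job);

	assert(first == &mock_hw_fence);
	assert(second == NULL);
	assert(third == NULL);
	return job.runs;
}
```

In this model the job reaches the hardware exactly once. If the early return
in `mock_job_run()` were dropped, each resubmission would run the guilty job
again, which corresponds to the repeated resets described above.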
drivers/gpu/drm/v3d/v3d_sched.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 80466ce8c7df669280e556c0793490b79e75d2c7..c2010ecdb08f4ba3b54f7783ed33901552d0eba1 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -327,11 +327,15 @@ v3d_tfu_job_run(struct drm_sched_job *sched_job)
struct drm_device *dev = &v3d->drm;
struct dma_fence *fence;
+ if (unlikely(job->base.base.s_fence->finished.error))
+ return NULL;
+
+ v3d->tfu_job = job;
+
fence = v3d_fence_create(v3d, V3D_TFU);
if (IS_ERR(fence))
return NULL;
- v3d->tfu_job = job;
if (job->base.irq_fence)
dma_fence_put(job->base.irq_fence);
job->base.irq_fence = dma_fence_get(fence);
@@ -369,6 +373,9 @@ v3d_csd_job_run(struct drm_sched_job *sched_job)
struct dma_fence *fence;
int i, csd_cfg0_reg;
+ if (unlikely(job->base.base.s_fence->finished.error))
+ return NULL;
+
v3d->csd_job = job;
v3d_invalidate_caches(v3d);
--
Git-154)
Thread overview: 13+ messages
2025-03-11 18:13 [PATCH v3 0/7] drm/v3d: Fix GPU reset issues on the Raspberry Pi 5 Maíra Canal
2025-03-11 18:13 ` Maíra Canal [this message]
2025-03-11 18:13 ` [PATCH v3 2/7] drm/v3d: Set job pointer to NULL when the job's fence has an error Maíra Canal
2025-03-11 18:13 ` [PATCH v3 3/7] drm/v3d: Associate a V3D tech revision to all supported devices Maíra Canal
2025-03-11 18:13 ` [PATCH v3 4/7] dt-bindings: gpu: v3d: Add per-compatible register restrictions Maíra Canal
2025-03-11 18:13 ` [PATCH v3 5/7] dt-bindings: gpu: v3d: Add SMS register to BCM2712 compatible Maíra Canal
2025-03-11 20:23 ` Rob Herring
2025-03-11 22:05 ` Maíra Canal
2025-03-12 9:06 ` Krzysztof Kozlowski
2025-03-12 17:47 ` Maíra Canal
2025-03-11 18:13 ` [PATCH v3 6/7] drm/v3d: Use V3D_SMS registers for power on/off and reset on V3D 7.x Maíra Canal
2025-03-11 18:13 ` [PATCH v3 7/7] dt-bindings: gpu: Add V3D driver maintainer as DT maintainer Maíra Canal
2025-03-12 9:34 ` [PATCH v3 0/7] drm/v3d: Fix GPU reset issues on the Raspberry Pi 5 Raag Jadav