From: Rob Clark <robdclark@gmail.com>
To: dri-devel@lists.freedesktop.org
Cc: freedreno@lists.freedesktop.org, linux-arm-msm@vger.kernel.org,
Connor Abbott <cwabbott0@gmail.com>,
Rob Clark <robdclark@chromium.org>,
Rob Clark <robdclark@gmail.com>,
Abhinav Kumar <quic_abhinavk@quicinc.com>,
Dmitry Baryshkov <lumag@kernel.org>, Sean Paul <sean@poorly.run>,
Marijn Suijten <marijn.suijten@somainline.org>,
David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
Konrad Dybcio <konradybcio@kernel.org>,
linux-kernel@vger.kernel.org (open list)
Subject: [PATCH v4 23/40] drm/msm: Mark VM as unusable on GPU hangs
Date: Wed, 14 May 2025 10:53:37 -0700
Message-ID: <20250514175527.42488-24-robdclark@gmail.com>
In-Reply-To: <20250514175527.42488-1-robdclark@gmail.com>
From: Rob Clark <robdclark@chromium.org>
If userspace has opted in to VM_BIND, then GPU hangs and VM_BIND errors
will mark the VM as unusable.  Once a VM is unusable,
msm_ioctl_gem_submit() rejects further submits against it with EPIPE.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/gpu/drm/msm/msm_gem.h | 17 +++++++++++++++++
drivers/gpu/drm/msm/msm_gem_submit.c | 3 +++
drivers/gpu/drm/msm/msm_gpu.c | 16 ++++++++++++++--
3 files changed, 34 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index da8f92911b7b..67f845213810 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -76,6 +76,23 @@ struct msm_gem_vm {
/** @managed: is this a kernel managed VM? */
bool managed;
+
+ /**
+ * @unusable: True if the VM has turned unusable because something
+ * bad happened during an asynchronous request.
+ *
+ * We don't try to recover from such failures, because this implies
+ * informing userspace about the specific operation that failed, and
+ * hoping the userspace driver can replay things from there. This all
+ * sounds very complicated for little gain.
+ *
+ * Instead, we should just flag the VM as unusable, and fail any
+ * further request targeting this VM.
+ *
+ * As an analogy, this would be mapped to a VK_ERROR_DEVICE_LOST
+ * situation, where the logical device needs to be re-created.
+ */
+ bool unusable;
};
#define to_msm_vm(x) container_of(x, struct msm_gem_vm, base)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 7a9bd20363dd..f282d691087f 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -676,6 +676,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
if (args->pad)
return -EINVAL;
+ if (to_msm_vm(ctx->vm)->unusable)
+ return UERR(EPIPE, dev, "context is unusable");
+
/* for now, we just have 3d pipe.. eventually this would need to
* be more clever to dispatch to appropriate gpu module:
*/
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 0314e15d04c2..6503ce655b10 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -386,8 +386,20 @@ static void recover_worker(struct kthread_work *work)
/* Increment the fault counts */
submit->queue->faults++;
- if (submit->vm)
- to_msm_vm(submit->vm)->faults++;
+ if (submit->vm) {
+ struct msm_gem_vm *vm = to_msm_vm(submit->vm);
+
+ vm->faults++;
+
+ /*
+ * If userspace has opted-in to VM_BIND (and therefore userspace
+ * management of the VM), faults mark the VM as unusable. This
+ * matches Vulkan expectations (Vulkan is the main target for
+ * VM_BIND).
+ */
+ if (!vm->managed)
+ vm->unusable = true;
+ }
get_comm_cmdline(submit, &comm, &cmd);
--
2.49.0