From: Jonathan Kim <jonathan.kim@amd.com>
To: <amd-gfx@lists.freedesktop.org>
Cc: Felix.Kuehling@amd.com
Subject: [PATCH 25/29] drm/amdkfd: add debug query event operation
Date: Mon, 31 Oct 2022 12:23:55 -0400 [thread overview]
Message-ID: <20221031162359.445805-25-jonathan.kim@amd.com> (raw)
In-Reply-To: <20221031162359.445805-1-jonathan.kim@amd.com>
Allow the debugger to a single query queue, device and process exception
in a FIFO manner.
The KFD should also return the GPU or Queue id of the exception.
The debugger also has the option of clearing exceptions after
being queried.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 6 +++
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 64 ++++++++++++++++++++++++
drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 5 ++
3 files changed, 75 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 200e11f02382..b918213a0087 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2946,6 +2946,12 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v
r = kfd_dbg_trap_set_flags(target, &args->set_flags.flags);
break;
case KFD_IOC_DBG_TRAP_QUERY_DEBUG_EVENT:
+ r = kfd_dbg_ev_query_debug_event(target,
+ &args->query_debug_event.queue_id,
+ &args->query_debug_event.gpu_id,
+ args->query_debug_event.exception_mask,
+ &args->query_debug_event.exception_mask);
+ break;
case KFD_IOC_DBG_TRAP_QUERY_EXCEPTION_INFO:
case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT:
case KFD_IOC_DBG_TRAP_GET_DEVICE_SNAPSHOT:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
index 1f4d3fa0278e..6985a53b83e9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
@@ -33,6 +33,70 @@
#define MAX_WATCH_ADDRESSES 4
static DEFINE_SPINLOCK(watch_points_lock);
+int kfd_dbg_ev_query_debug_event(struct kfd_process *process,
+ unsigned int *queue_id,
+ unsigned int *gpu_id,
+ uint64_t exception_clear_mask,
+ uint64_t *event_status)
+{
+ struct process_queue_manager *pqm;
+ struct process_queue_node *pqn;
+ int i;
+
+ if (!(process && process->debug_trap_enabled))
+ return -ENODATA;
+
+ mutex_lock(&process->event_mutex);
+ *event_status = 0;
+ *queue_id = 0;
+ *gpu_id = 0;
+
+ /* find and report queue events */
+ pqm = &process->pqm;
+ list_for_each_entry(pqn, &pqm->queues, process_queue_list) {
+ uint64_t tmp = process->exception_enable_mask;
+
+ if (!pqn->q)
+ continue;
+
+ tmp &= pqn->q->properties.exception_status;
+
+ if (!tmp)
+ continue;
+
+ *event_status = pqn->q->properties.exception_status;
+ *queue_id = pqn->q->properties.queue_id;
+ *gpu_id = pqn->q->device->id;
+ pqn->q->properties.exception_status &= ~exception_clear_mask;
+ goto out;
+ }
+
+ /* find and report device events */
+ for (i = 0; i < process->n_pdds; i++) {
+ struct kfd_process_device *pdd = process->pdds[i];
+ uint64_t tmp = process->exception_enable_mask
+ & pdd->exception_status;
+
+ if (!tmp)
+ continue;
+
+ *event_status = pdd->exception_status;
+ *gpu_id = pdd->dev->id;
+ pdd->exception_status &= ~exception_clear_mask;
+ goto out;
+ }
+
+ /* report process events */
+ if (process->exception_enable_mask & process->exception_status) {
+ *event_status = process->exception_status;
+ process->exception_status &= ~exception_clear_mask;
+ }
+
+out:
+ mutex_unlock(&process->event_mutex);
+ return *event_status ? 0 : -EAGAIN;
+}
+
void debug_event_write_work_handler(struct work_struct *work)
{
struct kfd_process *process;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
index 12b80b6c96d0..c64ffd3efc46 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
@@ -27,6 +27,11 @@
void kfd_dbg_trap_deactivate(struct kfd_process *target, bool unwind, int unwind_count);
int kfd_dbg_trap_activate(struct kfd_process *target);
+int kfd_dbg_ev_query_debug_event(struct kfd_process *process,
+ unsigned int *queue_id,
+ unsigned int *gpu_id,
+ uint64_t exception_clear_mask,
+ uint64_t *event_status);
bool kfd_set_dbg_ev_from_interrupt(struct kfd_dev *dev,
unsigned int pasid,
uint32_t doorbell_id,
--
2.25.1
next prev parent reply other threads:[~2022-10-31 16:25 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-31 16:23 [PATCH 01/29] drm/amdkfd: add debug and runtime enable interface Jonathan Kim
2022-10-31 16:23 ` [PATCH 02/29] drm/amdkfd: display debug capabilities Jonathan Kim
2022-11-22 23:08 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 03/29] drm/amdkfd: prepare per-process debug enable and disable Jonathan Kim
2022-11-22 23:31 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 04/29] drm/amdgpu: add kgd hw debug mode setting interface Jonathan Kim
2022-12-01 0:08 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 05/29] drm/amdgpu: setup hw debug registers on driver initialization Jonathan Kim
2022-11-22 23:38 ` Felix Kuehling
2022-11-23 20:53 ` Kim, Jonathan
2022-12-01 0:18 ` Felix Kuehling
2022-12-01 0:23 ` Felix Kuehling
2022-12-02 17:42 ` Kim, Jonathan
2022-10-31 16:23 ` [PATCH 06/29] drm/amdgpu: add gfx9 hw debug mode enable and disable calls Jonathan Kim
2022-11-22 23:50 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 07/29] drm/amdgpu: add gfx9.4.1 " Jonathan Kim
2022-11-22 23:59 ` Felix Kuehling
2022-11-24 14:58 ` Kim, Jonathan
2022-11-24 16:25 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 08/29] drm/amdgpu: add gfx10 " Jonathan Kim
2022-10-31 16:23 ` [PATCH 09/29] drm/amdgpu: add gfx9.4.2 " Jonathan Kim
2022-10-31 16:23 ` [PATCH 10/29] drm/amdgpu: add configurable grace period for unmap queues Jonathan Kim
2022-11-23 0:21 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 11/29] drm/amdkfd: prepare map process for single process debug devices Jonathan Kim
2022-10-31 16:23 ` [PATCH 12/29] drm/amdgpu: prepare map process for multi-process " Jonathan Kim
2022-10-31 16:23 ` [PATCH 13/29] drm/amdkfd: add per process hw trap enable and disable functions Jonathan Kim
2022-10-31 16:23 ` [PATCH 14/29] drm/amdkfd: add raise exception event function Jonathan Kim
2022-10-31 16:23 ` [PATCH 15/29] drm/amdkfd: add send exception operation Jonathan Kim
2022-10-31 16:23 ` [PATCH 16/29] drm/amdkfd: add runtime enable operation Jonathan Kim
2022-11-23 0:52 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to TMA Jonathan Kim
2022-11-23 0:44 ` Felix Kuehling
2022-11-24 14:51 ` Kim, Jonathan
2022-11-24 16:23 ` Felix Kuehling
2022-11-24 20:27 ` Kim, Jonathan
2022-11-25 16:53 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 18/29] drm/amdkfd: update process interrupt handling for debug events Jonathan Kim
2022-10-31 16:23 ` [PATCH 19/29] drm/amdkfd: add debug set exceptions enabled operation Jonathan Kim
2022-11-24 21:24 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 20/29] drm/amdkfd: add debug wave launch override operation Jonathan Kim
2022-11-29 22:37 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 21/29] drm/amdkfd: add debug wave launch mode operation Jonathan Kim
2022-12-01 0:02 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 22/29] drm/amdkfd: add debug suspend and resume process queues operation Jonathan Kim
2022-11-29 23:55 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 23/29] drm/amdkfd: add debug set and clear address watch points operation Jonathan Kim
2022-11-30 0:34 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 24/29] drm/amdkfd: add debug set flags operation Jonathan Kim
2022-11-30 0:39 ` Felix Kuehling
2022-10-31 16:23 ` Jonathan Kim [this message]
2022-11-30 0:44 ` [PATCH 25/29] drm/amdkfd: add debug query event operation Felix Kuehling
2022-10-31 16:23 ` [PATCH 26/29] drm/amdkfd: add debug query exception info operation Jonathan Kim
2022-11-30 0:50 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 27/29] drm/amdkfd: add debug queue snapshot operation Jonathan Kim
2022-11-30 23:55 ` Felix Kuehling
2022-12-02 19:13 ` Kim, Jonathan
2022-10-31 16:23 ` [PATCH 28/29] drm/amdkfd: add debug device " Jonathan Kim
2022-12-01 0:00 ` Felix Kuehling
2022-10-31 16:23 ` [PATCH 29/29] drm/amdkfd: bump kfd ioctl minor version for debug api availability Jonathan Kim
2022-12-01 0:00 ` Felix Kuehling
2022-11-22 23:05 ` [PATCH 01/29] drm/amdkfd: add debug and runtime enable interface Felix Kuehling
2022-11-23 20:45 ` Kim, Jonathan
-- strict thread matches above, loose matches on Subject: below --
2022-08-29 14:29 [PATCH 0/29] Introduce AMD GPU ISA Debugging for HSA Compute Jonathan Kim
2022-08-29 14:30 ` [PATCH 25/29] drm/amdkfd: add debug query event operation Jonathan Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20221031162359.445805-25-jonathan.kim@amd.com \
--to=jonathan.kim@amd.com \
--cc=Felix.Kuehling@amd.com \
--cc=amd-gfx@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox